divergence using the convert extension

Matt Harbison mharbison72 at gmail.com
Fri Jul 6 02:22:29 UTC 2018


On Fri, 29 Jun 2018 22:29:23 -0400, Matt Harbison <mharbison72 at gmail.com>  
wrote:

> On Fri, 29 Jun 2018 10:26:37 -0400, Yuya Nishihara <yuya at tcha.org> wrote:
>
>> On Thu, 28 Jun 2018 17:42:04 -0400, Matt Harbison wrote:
>>> > On Jun 24, 2018, at 4:02 PM, Augie Fackler <raf at durin42.com> wrote:
>>> >> On Fri, Jun 22, 2018 at 7:00 AM, Yuya Nishihara <yuya at tcha.org>  
>>> wrote:
>>> >>> On Thu, 21 Jun 2018 21:02:31 -0400, Matt Harbison wrote:
>>> >>> On Thu, 21 Jun 2018 04:52:09 -0400, Benoit Fouletier  
>>> <bennews at free.fr>
>>> >>>> Obviously since I'm gonna strip stuff anyway, I _will_ diverge  
>>> from the
>>> >>>> original pretty early and will lose all hashes, that's fine, this  
>>> is more
>>> >>>> out of curiosity and making sure nothing too fishy is going on.
>>> >>>
>>> >>> I hit a similar issue last year helping someone recover from repo
>>> >>> corruption.  (Convert was involved in the recovery somehow.)  On  
>>> the first
>>> >>> divergent commit, run `hg log -r $rev --debug` in both repos.   
>>> IIRC, the
>>> >>> manifest line was different.  It was reproducible, but I couldn't  
>>> figure
>>> >>> out how to recreate it with a simple test case.
>>> >>
>>> >> IIRC, there was a bug that hg would create new manifest node even  
>>> if nothing
>>> >> changed. That's probably the reason for the hash change.
>>> >
>>> > That's likely. I wouldn't sweat the difference.
>>>
>>> I’m seeing something similar to the original report in one repo now.   
>>> I’ve
>>> simplified this down to an hg -> hg convert with 4.6.1, and at the  
>>> same cset,
>>> it gets the same divergent hash.  The manifest value is the same  
>>> before and
>>> after converting.
>>>
>>> The really odd thing I noticed is with `hg log -vr $bad_rev`, it shows  
>>> 2 more
>>> files in the post conversion repo.
>>
>> So what's different is the list of files recorded in the changelog (in  
>> short,
>> ctx.files().)
>>
>>> But if I run `hg status --change $bad_rev`, I get waaaay more files listed  
>>> in
>>> both the before and after repo.
>>
>> That's normal if the $bad_rev is a merge commit.
>
> Ah, right.  I overlooked that it was a merge.
>
>>> My interest here is converting it to LFS, and silently slipping it  
>>> onto the
>>> server for minimal disruption.  That won’t work if the hashes change.
>>
>> I have no idea other than using a hacked hg to reproduce old  
>> ctx.files().
>
> Thanks.  That didn't work, but it put me in the ballpark.  I tried  
> replacing ctx.files() with a filtering function, but nothing changed.  I  
> then tried excluding it when building ctx, and separately filtering it  
> out of ctx._status.modified, and got the same totally different hash  
> both times.
>
> I ended up adding code to localrepo.commitctx() in the "committing  
> changelog" phase to drop the two files when it processed a commit with  
> the two known parents of $bad_rev.  That worked, until the same two  
> files had the same problem in the next merge.  So I ran `hg log -r  
> 'file(..)'` on the good repo, and coded it to drop the two files from  
> any other commit (the convert_revision value is the identity test).   
> This worked great until about 50 commits later when two other files  
> _disappeared_ from the changelog.  (This was the first time I've seen an  
> octopus merge comment from convert.)  I tried just adding them to the  
> list of files for that commit in the same changelog area, but they  
> didn't show up, and the manifest was different (and revision bumped by  
> one).  The latter may be because I was doing partial conversions, and  
> stripping bad results (and fixing shamap).  I'll try more on Monday, but  
> I'm really puzzled about what changed in Mercurial.

I've been assuming that it was something that changed between 3.7.1 or so and  
4.6.1.  But I can reproduce the same repo by re-running the original  
convert with 4.6.1, and then recreate the divergences by doing the hg ->  
hg convert with 4.6.1 (both changelog and manifest differences).  So it  
may be worth digging into a bit more, but I'm not sure what else to do.  I  
submitted a series to show a similar manifest divergence in an existing  
test.
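
To illustrate why two extra entries in the changelog's file list are enough
to change the hash even when the manifest value is identical, here is a
minimal sketch.  It assumes the changelog entry is the manifest hex, user,
date, sorted file names, a blank line and the description joined by
newlines, and that the node is the SHA-1 of the two sorted parent ids
followed by that text.  The function name, the manifest value, the user and
the file names are all made up for the example; this is not code from
Mercurial itself.

```python
import hashlib

# 20 zero bytes stands in for a missing parent (assumed convention).
NULLID = b"\0" * 20


def changeset_node(manifest_hex, user, date, files, desc,
                   p1=NULLID, p2=NULLID):
    """Hypothetical helper: hash a changelog entry as described above."""
    # The file list is part of the hashed text, so it affects the node.
    lines = [manifest_hex, user, date] + sorted(files) + ["", desc]
    text = "\n".join(lines).encode("utf-8")
    # Parents are hashed in sorted order, followed by the entry text.
    first, second = sorted([p1, p2])
    return hashlib.sha1(first + second + text).hexdigest()


# Made-up values: same manifest, user, date, description and parents.
manifest = "a" * 40
parent1 = b"\x01" * 20
parent2 = b"\x02" * 20
common = dict(manifest_hex=manifest, user="someone <someone@example.com>",
              date="1530843749 0", desc="merge", p1=parent1, p2=parent2)

before = changeset_node(files=["a.txt"], **common)
after = changeset_node(files=["a.txt", "b.txt", "c.txt"], **common)

print(before)
print(after)
print("same node:", before == after)
```

Under those assumptions the two nodes differ although the manifest is
unchanged, which would match what `hg log -vr $bad_rev` shows here: same
manifest, two more files listed, different hash.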

