divergence using the convert extension

Matt Harbison mharbison72 at gmail.com
Sat Jun 30 02:29:23 UTC 2018


On Fri, 29 Jun 2018 10:26:37 -0400, Yuya Nishihara <yuya at tcha.org> wrote:

> On Thu, 28 Jun 2018 17:42:04 -0400, Matt Harbison wrote:
>> > On Jun 24, 2018, at 4:02 PM, Augie Fackler <raf at durin42.com> wrote:
>> >> On Fri, Jun 22, 2018 at 7:00 AM, Yuya Nishihara <yuya at tcha.org> wrote:
>> >>> On Thu, 21 Jun 2018 21:02:31 -0400, Matt Harbison wrote:
>> >>> On Thu, 21 Jun 2018 04:52:09 -0400, Benoit Fouletier <bennews at free.fr>
>> >>>> Obviously since I'm gonna strip stuff anyway, I _will_ diverge from the
>> >>>> original pretty early and will lose all hashes, that's fine, this is more
>> >>>> out of curiosity and making sure nothing too fishy is going on.
>> >>>
>> >>> I hit a similar issue last year helping someone recover from repo
>> >>> corruption.  (Convert was involved in the recovery somehow.)  On the first
>> >>> divergent commit, run `hg log -r $rev --debug` in both repos.  IIRC, the
>> >>> manifest line was different.  It was reproducible, but I couldn't figure
>> >>> out how to recreate it with a simple test case.
>> >>
>> >> IIRC, there was a bug that hg would create a new manifest node even if nothing
>> >> changed. That's probably the reason for the hash change.
>> >
>> > That's likely. I wouldn't sweat the difference.
>>
>> I’m seeing something similar to the original report in one repo now.  I’ve
>> simplified this down to an hg -> hg convert with 4.6.1, and at the same cset,
>> it gets the same divergent hash.  The manifest value is the same before and
>> after converting.
>>
>> The really odd thing I noticed is with `hg log -vr $bad_rev`, it shows 2 more
>> files in the post conversion repo.
>
> So what's different is the list of files recorded in the changelog (in short,
> ctx.files().)
>
>> But if I `hg status --change $bad_rev`, I get waaaay more files listed in
>> both the before and after repo.
>
> That's normal if the $bad_rev is a merge commit.

Ah, right.  I overlooked that it was a merge.
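
To pin down exactly which changelog fields differ for a revision, the `hg log -r $rev --debug` output from the two repos can be compared mechanically. This is only a rough sketch: `parse_debug_log` and `differing_fields` are made-up helper names, the sample text is invented for illustration, and it assumes the usual `label: value` layout of the debug output (file names containing spaces are not split apart).

```python
def parse_debug_log(text):
    """Collect 'label: value' header lines from `hg log --debug` output."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("description:"):
            break  # everything after this is the commit message
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in ("changeset", "parent", "manifest"):
            # drop the local "N:" revision number, it can differ between repos
            value = value.split(":", 1)[-1]
        fields.setdefault(key, []).append(value)
    return fields


def differing_fields(a, b):
    """Return {label: (value_in_a, value_in_b)} for labels that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical sample output, not taken from a real repo.
BEFORE = """changeset:   5:1111111111111111111111111111111111111111
manifest:    5:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
user:        test
files:       a.txt
description:
merge
"""
AFTER = """changeset:   5:2222222222222222222222222222222222222222
manifest:    5:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
user:        test
files:       a.txt b.txt c.txt
description:
merge
"""

for label, (old, new) in differing_fields(
        parse_debug_log(BEFORE), parse_debug_log(AFTER)).items():
    print(label, old, new)
```

With a matching manifest and a differing file list, only the `changeset` and `files` labels should be reported.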

>> My interest here is converting it to LFS, and silently slipping it onto the
>> server for minimal disruption.  That won’t work if the hashes change.
>
> I have no idea other than using a hacked hg to reproduce old ctx.files().

Thanks.  That didn't work, but it put me in the ballpark.  I tried
replacing ctx.files() with a filtering function, but nothing changed.  I
then tried excluding the files when building ctx, and separately filtering
them out of ctx._status.modified.  Both attempts produced the same hash,
which was still totally different from the original.
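
For reference, the reason the file list matters even when the manifest is identical: the changeset hash is a SHA-1 over the two parent nodes (sorted) plus the changelog entry text, and that text includes the list of files.  A simplified sketch of that computation follows.  The function names and sample values here are made up for illustration, and details like the extra field and encoding handling are glossed over, so don't expect it to reproduce real hashes without care.

```python
import hashlib

NULLID = b"\0" * 20


def changelog_text(manifest_hex, user, date_line, files, description):
    """Build a simplified changelog entry.

    Layout: manifest hex, user, date line, one file per line (sorted),
    a blank line, then the description.
    """
    lines = [manifest_hex, user, date_line] + sorted(files) + ["", description]
    return "\n".join(lines).encode("utf-8")


def node_hash(text, p1=NULLID, p2=NULLID):
    """SHA-1 over the sorted parent nodes followed by the entry text."""
    a, b = sorted([p1, p2])
    s = hashlib.sha1(a + b)
    s.update(text)
    return s.hexdigest()


# Hypothetical values: same manifest, user, date and message,
# only the recorded file list differs.
manifest = "a" * 40
before = changelog_text(manifest, "test", "0 0", ["a.txt"], "merge")
after = changelog_text(manifest, "test", "0 0", ["a.txt", "b.txt"], "merge")
print(node_hash(before) == node_hash(after))
```

So any change to what ends up in the recorded file list changes the hash of that commit, and of every descendant.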

I ended up adding code to localrepo.commitctx() in the "committing  
changelog" phase to drop the two files when it processed a commit with the  
two known parents of $bad_rev.  That worked, until the same two files had  
the same problem in the next merge.  So I ran `hg log -r 'file(..)'` on  
the good repo, and coded it to drop the two files from any other commit  
(the convert_revision value is the identity test).  This worked great  
until about 50 commits later when two other files _disappeared_ from the  
changelog.  (This was the first time I've seen an octopus merge comment
from convert.)  I tried just adding them to the list of files for that
commit in the same changelog area, but they didn't show up, and the  
manifest was different (and revision bumped by one).  The latter may be  
because I was doing partial conversions, and stripping bad results (and  
fixing shamap).  I'll try more on Monday, but I'm really puzzled about  
what changed in Mercurial.
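
On the shamap fixing: the map is a plain text file with one `<source ID> <destination ID>` pair per line, so after stripping bad results the matching lines need to go before resuming.  A minimal sketch of that cleanup, assuming full hashes on both sides; `prune_shamap` is a made-up helper name, and it should be run against a backup copy of the file.

```python
def prune_shamap(lines, stripped_dest_ids):
    """Drop shamap lines whose destination ID was stripped.

    `lines` are raw lines from the map file; `stripped_dest_ids` is a set
    of destination IDs that no longer exist in the converted repo.
    """
    kept = []
    for line in lines:
        entry = line.rstrip("\n")
        if not entry:
            continue
        # Split from the right so a source ID containing spaces stays intact.
        parts = entry.rsplit(" ", 1)
        if len(parts) == 2 and parts[1] in stripped_dest_ids:
            continue
        kept.append(entry + "\n")
    return kept


# Hypothetical IDs, shortened for readability.
sample = ["src1 dst1\n", "src2 dst2\n", "src3 dst3\n"]
print(prune_shamap(sample, {"dst2", "dst3"}))
```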


