[PATCH STABLE] convert: fix git copy file content conversions

Durham Goode durham at fb.com
Fri Aug 7 00:28:22 UTC 2015


# HG changeset patch
# User Durham Goode <durham at fb.com>
# Date 1438906906 25200
#      Thu Aug 06 17:21:46 2015 -0700
# Branch stable
# Node ID 5d709648db13e38f91facc914bc7d9c6aba311d0
# Parent  79f0cb97d7537a7c2948f8f9b0a89148825a3a1d
convert: fix git copy file content conversions

There was a bug in the git convert code: if a commit copied a file and also
modified the copy source, and the copy dest sorted alphabetically before the
copy source, the converted changeset would use the copy dest contents for both
the source and the dest.

The root of the bug is that the git diff-tree output is formatted like so:

:<mode> <mode> <oldhash>    <newhash>    <state> <src>   <dest>
:100644 100644 c1ab79a15... 3dfc779ab... C069    oldname newname
:100644 100644 c1ab79a15... 03e2188a6... M       oldname
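
To make the field layout concrete, here is a minimal sketch that splits one
such line into named fields. It is an illustration only, not the code in
hgext/convert/git.py: the names Record and parse_raw_line are hypothetical,
and it assumes the whitespace-separated layout shown above with no whitespace
inside paths.

```python
from collections import namedtuple

# Hypothetical record type for one raw diff-tree line (illustration only).
Record = namedtuple(
    "Record", "oldmode newmode oldhash newhash status src dest")


def parse_raw_line(line):
    """Split one raw diff-tree line into its fields.

    Assumes whitespace-separated fields and paths without whitespace.
    """
    fields = line.split()
    oldmode, newmode, oldhash, newhash, status = fields[:5]
    paths = fields[5:]
    src = paths[0]
    # Only rename (R) and copy (C) lines carry a second path.
    dest = paths[1] if status[0] in "RC" else None
    return Record(oldmode.lstrip(":"), newmode, oldhash, newhash,
                  status, src, dest)


copy_line = ":100644 100644 c1ab79a15... 3dfc779ab... C069    oldname newname"
mod_line = ":100644 100644 c1ab79a15... 03e2188a6... M       oldname"

copy_rec = parse_raw_line(copy_line)
mod_rec = parse_raw_line(mod_line)
```

Here copy_rec has status C069 with src oldname and dest newname, while mod_rec
has status M, src oldname and no dest. Both lines name oldname in the src
field, which is what the old code tripped over.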

The old code would always take the 'oldname' field as the name of the file being
processed, then it would try to do an extra convert for the newname. This works
for renames because it does a delete for the oldname and a create for the
newname.

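For copies though, it ends up associating the copied content (3dfc779ab above)
with the oldname. It only happened when the dest was alphabetically before the
source because that meant the copy line got processed before the modification
line, and the modification was then skipped because the oldname had already
been marked as seen.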

The fix is to treat copy lines as affecting only the newname, and to not mark
the oldname as processed.
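
The difference can be sketched with a simplified model of the loop. This is
an illustration under assumptions, not the actual converter code: the records
are pre-parsed tuples of (newhash, status, src, dest), the function names
old_behavior and new_behavior are hypothetical, and a rename source is
modelled as None to stand for a deletion.

```python
# The two example lines from above, in the order they are processed
# when the copy is handled before the modification.
RECORDS = [
    ("3dfc779ab", "C069", "oldname", "newname"),
    ("03e2188a6", "M", "oldname", None),
]


def old_behavior(records):
    """Model of the old loop: src is always the file being processed."""
    contents, seen = {}, set()
    for newhash, status, src, dest in records:
        if src not in seen:
            seen.add(src)
            # A rename source is deleted; anything else gets newhash,
            # including (wrongly) the source of a copy.
            contents[src] = None if status[0] == "R" else newhash
        if status[0] in "RC" and dest not in seen:
            seen.add(dest)
            contents[dest] = newhash
    return contents


def new_behavior(records):
    """Model of the fixed loop: a copy line only touches its dest."""
    contents, copies, seen = {}, {}, set()
    for newhash, status, src, dest in records:
        f = src
        if status[0] == "C":
            # Process the dest instead and leave the source unseen.
            f = dest
            copies[dest] = src
        if f not in seen:
            seen.add(f)
            contents[f] = None if status[0] == "R" else newhash
        if status[0] == "R" and dest not in seen:
            seen.add(dest)
            contents[dest] = newhash
    return contents, copies


old_contents = old_behavior(RECORDS)
new_contents, new_copies = new_behavior(RECORDS)
```

In this model old_contents maps both oldname and newname to 3dfc779ab, so the
modification (03e2188a6) is lost. new_contents maps newname to 3dfc779ab and
oldname to 03e2188a6, and new_copies records newname as a copy of oldname.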

diff --git a/hgext/convert/git.py b/hgext/convert/git.py
--- a/hgext/convert/git.py
+++ b/hgext/convert/git.py
@@ -255,12 +255,18 @@ class convert_git(converter_source):
                 entry = l.split()
                 continue
             f = l
+            if entry[4][0] == 'C':
+                copysrc = f
+                copydest = difftree[i]
+                i += 1
+                f = copydest
+                copies[copydest] = copysrc
             if f not in seen:
                 add(entry, f, False)
             # A file can be copied multiple times, or modified and copied
             # simultaneously. So f can be repeated even if fdest isn't.
-            if entry[4][0] in 'RC':
-                # rename or copy: next line is the destination
+            if entry[4][0] == 'R':
+                # rename: next line is the destination
                 fdest = difftree[i]
                 i += 1
                 if fdest not in seen:
diff --git a/tests/test-convert-git.t b/tests/test-convert-git.t
--- a/tests/test-convert-git.t
+++ b/tests/test-convert-git.t
@@ -321,8 +321,9 @@ since bar is not touched in this commit,
   $ cp bar bar-copied
   $ cp baz baz-copied
   $ cp baz baz-copied2
+  $ cp baz ba-copy
   $ echo baz2 >> baz
-  $ git add bar-copied baz-copied baz-copied2
+  $ git add bar-copied baz-copied baz-copied2 ba-copy
   $ commit -a -m 'rename and copy'
   $ cd ..
 
@@ -340,6 +341,8 @@ input validation
   $ hg -q convert --config convert.git.similarity=100 --datesort git-repo2 fullrepo
   $ hg -R fullrepo status -C --change master
   M baz
+  A ba-copy
+    baz
   A bar-copied
   A baz-copied
     baz
@@ -349,6 +352,13 @@ input validation
     foo
   R foo
 
+Ensure that the modification to the copy source was preserved
+(there was a bug where if the copy dest was alphabetically prior to the copy
+source, the copy source took the contents of the copy dest)
+  $ hg cat -r tip fullrepo/baz
+  baz
+  baz2
+
   $ cd git-repo2
   $ echo bar2 >> bar
   $ commit -a -m 'change bar'

