[PATCH] patch: improve handling of filenames containing spaces on import (issue3723)
Alastair Houghton
alastair at alastairs-place.net
Fri Dec 7 11:37:10 UTC 2012
On 7 Dec 2012, at 10:28, Pierre-Yves David <pierre-yves.david at logilab.fr> wrote:
> On Thu, Dec 06, 2012 at 03:44:47PM -0600, Matt Mackall wrote:
>> On Thu, 2012-12-06 at 21:00 +0000, Alastair Houghton wrote:
> For a baseline, if I try to import the git-diff generated by hg into an
>> empty git repo with git-apply:
>>
>> diff --git a/foo b/foo b/foo b/foo
>> new file mode 100644
>> --- /dev/null
>> +++ b/foo b/foo
>> @@ -0,0 +1,1 @@
>> +b
>>
>> ..it almost works. By almost, I mean it creates 'foo\ b/foo' (literal
>> backslash out of thin air) rather than 'foo b/foo'. LOLWUT. This also
>> breaks in the same way with the known-to-agree-with-diff-and-patch
>> standard diff that Mercurial produces.
>
> I was unable to reproduce that here using:
>
> git version 1.7.2.5
> GNU bash, version 4.1.5(1)-release (x86_64-pc-linux-gnu) qs /bin/sh
>
> What version are you using?
I think when I originally tested I had quite an old version of git on
my machine (I don't actually use git very often---we use Mercurial
here). I just tried with 1.8.0.1, and it seemed to work.
Further testing seems to indicate that git doesn't escape for spaces,
but it will do for non-ASCII characters and control characters, among
others. e.g. I just renamed a file “foo” to “fö” and got
diff --git a/foo b/foo b/foo b/foo
deleted file mode 100644
index 6178079..0000000
--- a/foo b/foo
+++ /dev/null
@@ -1 +0,0 @@
-b
diff --git "a/foo b/f\303\266" "b/foo b/f\303\266"
new file mode 100644
index 0000000..6178079
--- /dev/null
+++ "b/foo b/f\303\266"
@@ -0,0 +1 @@
+b
from git diff HEAD
So, I think:
(a) Git has already fixed this bug. The current version (1.8.0.1)
works with a binary file too; I tried
hg init a
cd a
mkdir 'foo b'
dd if=/dev/random of="foo b/foo" bs=1024 count=4
hg ci -Am0
hg up null
hg export --git tip > p
cd ..
git init b
git apply ../a/p
Looking at git's current source code,
https://github.com/git/git/blob/master/builtin/apply.c
it only uses the "diff --git" line to obtain the filenames in
the case where they aren't anywhere else, and requires the two
names it finds to match exactly.
(b) hg should certainly handle the patch without complaint (my patch
fixes that)
(c) hg should consider handling escaping on the input for git patches
(my patch does *not* fix that)
(d) Agree with Matt; hg can't generate escaped filenames until some
time after (c) if fixed.
FWIW, (c) and (d) seem to raise all kinds of filename encoding issues;
git's escaping system seems poorly designed (in particular, it really
should support \u and \U so that it's possible to unambiguously
specify filenames in a patch, no matter what the system it's being
applied on does; git doesn't appear to support those escapes, but I
guess it just punts on filename encoding anyway). IIRC I read some
pages on the Mercurial wiki about encoding issues, so you guys have
thought about that already, right?
Kind regards,
Alastair.
--
http://alastairs-place.net
More information about the Mercurial-devel
mailing list