[PATCH] patch: improve handling of filenames containing spaces on import (issue3723)

Alastair Houghton alastair at alastairs-place.net
Fri Dec 7 11:37:10 UTC 2012


On 7 Dec 2012, at 10:28, Pierre-Yves David <pierre-yves.david at logilab.fr> wrote:

> On Thu, Dec 06, 2012 at 03:44:47PM -0600, Matt Mackall wrote:
>> On Thu, 2012-12-06 at 21:00 +0000, Alastair Houghton wrote:
> For a baseline, if I try to import the git-diff generated by hg into an
>> empty git repo with git-apply:
>> 
>> diff --git a/foo b/foo b/foo b/foo
>> new file mode 100644
>> --- /dev/null
>> +++ b/foo b/foo	
>> @@ -0,0 +1,1 @@
>> +b
>> 
>> ..it almost works. By almost, I mean it creates 'foo\ b/foo' (literal
>> backslash out of thin air) rather than 'foo b/foo'. LOLWUT. This also
>> breaks in the same way with the known-to-agree-with-diff-and-patch
>> standard diff that Mercurial produces.
> 
> I was unable to reproduce that here using:
> 
> git version 1.7.2.5
> GNU bash, version 4.1.5(1)-release (x86_64-pc-linux-gnu) qs /bin/sh
> 
> What version are you using?

I think when I originally tested I had quite an old version of git on
my machine (I don't actually use git very often---we use Mercurial
here).  I just tried with 1.8.0.1, and it seemed to work.

Further testing seems to indicate that git doesn't escape for spaces,
but it will do for non-ASCII characters and control characters, among
others.  e.g. I just renamed a file “foo” to “fö” and got

  diff --git a/foo b/foo b/foo b/foo
  deleted file mode 100644
  index 6178079..0000000
  --- a/foo b/foo 
  +++ /dev/null
  @@ -1 +0,0 @@
  -b
  diff --git "a/foo b/f\303\266" "b/foo b/f\303\266"
  new file mode 100644
  index 0000000..6178079
  --- /dev/null
  +++ "b/foo b/f\303\266" 
  @@ -0,0 +1 @@
  +b

from git diff HEAD

So, I think:

(a) Git has already fixed this bug.  The current version (1.8.0.1)
    works with a binary file too; I tried

      hg init a
      cd a
      mkdir 'foo b'
      dd if=/dev/random of="foo b/foo" bs=1024 count=4
      hg ci -Am0
      hg up null
      hg export --git tip > p
      cd ..
      git init b
      git apply ../a/p

    Looking at git's current source code,

      https://github.com/git/git/blob/master/builtin/apply.c

    it only uses the "diff --git" line to obtain the filenames in
    the case where they aren't anywhere else, and requires the two
    names it finds to match exactly.
(b) hg should certainly handle the patch without complaint (my patch
    fixes that)
(c) hg should consider handling escaping on the input for git patches
    (my patch does *not* fix that)
(d) Agree with Matt; hg can't generate escaped filenames until some
    time after (c) if fixed.

FWIW, (c) and (d) seem to raise all kinds of filename encoding issues;
git's escaping system seems poorly designed (in particular, it really
should support \u and \U so that it's possible to unambiguously
specify filenames in a patch, no matter what the system it's being
applied on does; git doesn't appear to support those escapes, but I
guess it just punts on filename encoding anyway).  IIRC I read some
pages on the Mercurial wiki about encoding issues, so you guys have
thought about that already, right?

Kind regards,

Alastair.

--
http://alastairs-place.net







More information about the Mercurial-devel mailing list