[Bug 5771] New: Mercurial patch parser behavior for "--git" patches creating files with spaces differs from Git when the patch file omits trailing tabs
mercurial-bugs at mercurial-scm.org
mercurial-bugs at mercurial-scm.org
Tue Jan 16 17:18:14 UTC 2018
https://bz.mercurial-scm.org/show_bug.cgi?id=5771
Bug ID: 5771
Summary: Mercurial patch parser behavior for "--git" patches
creating files with spaces differs from Git when the
patch file omits trailing tabs
Product: Mercurial
Version: default branch
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: bug
Priority: wish
Component: Mercurial
Assignee: bugzilla at mercurial-scm.org
Reporter: selenic at yghe.net
CC: mercurial-devel at mercurial-scm.org
This is a tangled mess that dates back to at least 1991 and not really a bug in
Mercurial, but I figured I'd discuss it upstream for completeness in case
someone else runs into it eventually or it's otherwise helpful.
Briefly: `hg import` can parse and apply Git patches, but the behavior of `hg
import` for certain inputs (patch files which are invalid, but look reasonable)
differs from the behavior of `git apply` for the same patch files.
When a change creates a file with a space in it, like `X Y.txt`, `git` emits a
patch with a trailing tab literal:
```
diff --git ...
...
+++ b/X Y.txt\t
...
```
All of `git apply`, `hg import`, and `patch` apply this patch correctly.
However, if the semantic, trailing tab literal is removed, `git apply` parses
the file path `X Y.txt` while Mercurial parses the file path `X`.
`patch` agrees with Mercurial, but this is a `diff --git`, so arguably
Mercurial should do what Git does instead. In this case, Git's behavior is also
what the user is likely to expect.
Since 2006 (see
<https://github.com/git/git/commit/1a9eb3b9d50367bee8fe85022684d812816fe531>)
there is no way to generate these patches directly with Git. They can only be
generated by damaging a patch file in a way that strips trailing tabs (e.g.,
maybe emailing it and then copy/pasting it) or by a third-party tool generating
synthetic patches and missing this implementation detail (as here; the need to
apply a trailing tab literal to files with spaces in the names evaded me when
implementing patch generation behavior in Phabricator).
Mercurial uses this logic to parse the filename (in `patch.py`):
```
def parsefilename(str):
# --- filename \t|space stuff
s = str[4:].rstrip('\r\n')
i = s.find('\t')
if i < 0:
i = s.find(' ')
if i < 0:
return s
return s[:i]
```
Simply removing the test for spaces would make the behavior match Git's
behavior in this specific, narrow case. However, it might change behavior in
other cases, and I wasn't able to dig up the provenance of this logic (it
vanishes into Chris Mason's `mpatch` library in 2007, and I can't find a
repository for that library since the Oracle repository no longer seems to
exist). So I don't know if this change is safe or reasonable. Since it requires
a herculean effort to even encounter this behavior, the change may not be
worthwhile.
This logic also executes for both `--git` patches and unified patches, and the
implementation might reasonably prefer to match the behavior of `patch` for
patches not marked as `--git`.
Realistically, you should probably just close this issue and forget about it.
See <https://secure.phabricator.com/T8768> for additional context about how we
encountered and resolved this and what Git and `diff` do and some guesses at
why.
--
You are receiving this mail because:
You are on the CC list for the bug.
More information about the Mercurial-devel
mailing list