[PATCH V2] convert: parse quoted keys and values in mapfiles (BC)

Yuya Nishihara yuya at tcha.org
Wed Sep 27 07:05:35 UTC 2017


On Tue, 26 Sep 2017 22:20:14 +0200, Ingmar Blonk wrote:
> On 25-9-2017 15:23, Yuya Nishihara wrote:
> Ok, detecting single and double quote usage is difficult taking escapes 
> and correct closing quotes into account.
> The best I can come up with is the following snippet:
> 
>      try:
>          key, value = shlex.split(line, posix=True)
>      except ValueError:
>          try:
>              # Fall back to old behavior
>              key, value = line.rsplit(' ', 1)
>          except ValueError:
>              raise error.Abort(
>                  _('syntax error in %s(%d): key/value pair expected')
>                  % (self.path, i + 1))
> 
> The only situations when this could go wrong are when one forgets a 
> closing quote or when one expects the old behavior, quotes are present 
> (including closing quote) and the unpack doesn't raise.
> You agree?

Yeah, that's what I have in mind.

> >> @@ -479,7 +481,7 @@
> >>                    raise error.Abort(
> >>                        _('could not open map file %r: %s') %
> >>                        (self.path, encoding.strtolocal(err.strerror)))
> >> -        self.fp.write('%s %s\n' % (key, value))
> >> +        self.fp.write('%s %s\n' % (pipes.quote(key), pipes.quote(value)))
> >>            self.fp.flush()
> >>            super(mapfile, self).__setitem__(key, value)
> > I meant the parsing of branchmap should be a separate function. The mapfile
> > class has write functionality because it's also used as an internal storage.
> > rsplit() was introduced at af1117f37fa7, and digging it further, it was
> > necessary to fix the issue 1224. So changing the syntax of mapfile will
> > probably break existing repositories.
> >
> > https://bz.mercurial-scm.org/show_bug.cgi?id=1224
> 
> I understood, however I looked a bit further into it.
> The mapfile class is basically a dict backed by a file. The dict 
> supports values with spaces, the file format only for the key. Adding 
> the quote() in the write functionality will make the file support all 
> dict keys and values.
> Looking into issue 1224, the generated shamap will contain the spaces of 
> the subversion directory paths as keys. The rsplit() allows the spaces 
> in the keys. To me this seems to fix only half of the problem where 
> you'd like full support for whitespace.
> The proposed solution using pipes.quote() only adds surrounding quotes 
> when spaces or quotes are in the string itself. This in combination with 
> the above proposed parsing fall back mechanism should cover all cases 
> that I can think of. At least test test-convert-svn-source runs fine.

shamap values should never contain spaces. That's why rsplit() was just
fine for data which the mapfile class was originally designed for. IMHO,
there's no need to break compatibility of the storage across hg versions.

> I understand that for something like Mercurial stability and backwards 
> compatibility is important, but on the other hand I'm trying to avoid 
> writing extra code when not needed (and possibly leaving out an 
> improvement). This is my first patch for Mercurial, so I'm not sure how 
> these things are handled.

If I understand correctly, branchmap can be a plain dict (with no write
function), so the new code would be just ~10 lines.



More information about the Mercurial-devel mailing list