Mercurial Digest, Vol 17, Issue 59

Frank Kingswood frank at kingswood-consulting.co.uk
Fri Sep 15 11:54:27 UTC 2006


 > On Thu, Sep 14, 2006 at 01:56:41PM -0500, Matt Mackall wrote:
 >> On Mon, Sep 04, 2006 at 01:47:55PM +0200, Andrew Beekhof wrote:
 >>> Specifically, I chose to create a new index format, which may or may 
 >>> not have been what Matt intended, and I'm happy to discuss the 
 >>> appropriateness of that decision.
 >> Benoit and I came up with a scheme for extending the changelog in a
 >> backwards-compatible way:
 >>
 >> http://www.selenic.com/mercurial/wiki/index.cgi/ExtendedChangelog

This is absolutely gross. Nul bytes in an otherwise ASCII text file???

Josef Jeff Sipek wrote:
 > Feels like a hack. I can see the reason for tacking the extra field after
 > the timezone, but the whole key-value pair separated by \0 seems repulsive.
 > (Reminds me of SVN's properties, which are an evil hack IMHO.) I'm all for
 > extensible changelog format, but if anything wouldn't it be better to do:
 >
 > <manifest hash>\n
 > <user>\n
 > <time> <tz>\n
 > <key1>:<value1>\n
 > <key2>:<value2>\n
 > <key3>:<value3>\n
 > file1\n
 > file2\n

This would make old clients think <key1>:<value1> is the name of a 
file. It would be much better to have a format that is different enough 
that an old client cannot fail silently.
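
To illustrate the failure mode, here is a rough sketch of how an old 
client might read an entry. parse_old_entry is a hypothetical, simplified 
reader written for this example (it is not Mercurial's actual code), and 
it assumes the old layout is manifest, user, date, file names, a blank 
line, then the comment:

```python
def parse_old_entry(text):
    """Hypothetical old-style reader: every header line after the
    third one is taken to be a file name."""
    header, _, comment = text.partition("\n\n")
    lines = header.split("\n")
    manifest, user, date = lines[0], lines[1], lines[2]
    files = lines[3:]
    return manifest, user, date, files, comment


# An entry written in the layout quoted above, with one extra key.
entry = (
    "abc123\n"
    "Frank\n"
    "1158321267 0\n"
    "branch:stable\n"
    "file1\n"
    "file2\n"
    "\n"
    "fix parser"
)

manifest, user, date, files, comment = parse_old_entry(entry)
print(files)  # ['branch:stable', 'file1', 'file2']
```

The reader raises no error; the extra key simply shows up as a bogus 
file name, which is the silent failure I am worried about.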

IMHO it would be much better to go for a format that is designed from 
scratch to be extensible. We could do something similar to MIME:

<manifest hash>\n
user:<user>\n
time:<time> <timezone>\n
key:value\n
key:value\n
\tmore value
\n
file1\n
file2\n
file3\n
...
\n
comment text
...

Having the tag names for user and time will be enough to ensure that 
the old and new changelog formats can be differentiated easily.
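
As a rough sketch of how a reader for this layout could look (the 
function names looks_extended and parse_entry are made up for this 
example, and it works on str rather than raw bytes to keep it short): 
the header block is key:value lines with tab-indented continuations, 
then a blank line, the file list, another blank line, and the comment.

```python
def looks_extended(text):
    """Guess the format from the second line: tagged means new layout.
    Assumes no old-format user name literally starts with 'user:'."""
    lines = text.split("\n")
    return len(lines) > 1 and lines[1].startswith("user:")


def parse_entry(text):
    """Parse one entry in the MIME-like layout sketched above."""
    lines = text.split("\n")
    manifest = lines[0]
    fields = {}
    last_key = None
    i = 1
    # Header block: key:value lines; a leading tab continues the
    # previous value on a new line.
    while i < len(lines) and lines[i] != "":
        line = lines[i]
        if line.startswith("\t"):
            if last_key is None:
                raise ValueError("continuation line without a key")
            fields[last_key] += "\n" + line[1:]
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError("not a key:value line: %r" % line)
            fields[key] = value
            last_key = key
        i += 1
    i += 1  # skip the blank line that ends the header block
    # File list: one name per line, up to the next blank line.
    files = []
    while i < len(lines) and lines[i] != "":
        files.append(lines[i])
        i += 1
    i += 1  # skip the blank line that ends the file list
    # Everything that remains is the comment, blank lines included.
    comment = "\n".join(lines[i:])
    return manifest, fields, files, comment
```

Note that an old-format entry fed to parse_entry trips the ValueError 
as long as its user line contains no colon, so a mix-up is reported 
rather than silently misread.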

Looking at Benoit's patch, I notice that the comment text is "ideally 
utf-8". With Python's codecs, why don't we enforce this?
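
Something along these lines is what I have in mind. This is only a 
sketch: check_comment is a hypothetical helper, not part of Benoit's 
patch, and it is written against Python's bytes/str split. Decoding 
is strict by default, so invalid input raises instead of slipping 
through.

```python
def check_comment(raw):
    """Return the comment as text, or refuse it if it is not UTF-8."""
    try:
        # The default error handler is 'strict', so bad bytes raise.
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError("comment is not valid UTF-8: %s" % err)


def encode_comment(text):
    """Encode a comment for storage; the result is always UTF-8."""
    return text.encode("utf-8")
```

A Latin-1 encoded comment such as b"na\xefve" would be rejected at 
commit time rather than stored with an unknown encoding.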

Frank


