[PATCH 1 of 8] use UTF-8 to encode/decode log text
Brendan Cully
brendan at kublai.com
Mon Nov 20 18:52:51 UTC 2006
On Tuesday, 21 November 2006 at 00:48, Andrey wrote:
> On 21 November 2006 (Tue) 00:14, Alexis S. L. Carvalho wrote:
> > Thus spake Andrey:
> > > @@ -60,6 +62,7 @@ class changelog(revlog):
> > >          """
> > >          if not text:
> > >              return (nullid, "", (0, 0), [], "", {})
> > > +        text = unicode(text, CHANGELOG_ENCODING)
> >
> > Should we encode/decode the whole changelog text or just the user and
> > comment sections?
> >
> > I'm not sure about the extra section (branch name should be UTF-8, but
> > I don't know if binary data is forbidden), but, at least for now, I
> > think we don't want to encode/decode the list of files.
>
> I see. Seems like only comment should be encoded for now, and maybe extra.
I doubt extra should be encoded either. It's only accessed via code, and
it supports binary data, so it's probably better for the users of
particular fields there to decide for themselves whether to encode
them.
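
To make the distinction concrete, here is a rough sketch of decoding only the
human-readable fields and leaving everything else as raw bytes. It is not the
patch under review: decode_entry is a hypothetical helper, the entry layout
(manifest, user, date line, file names, blank line, description) is a
simplified assumption, and CHANGELOG_ENCODING is only assumed to be "utf-8".

```python
# Hypothetical sketch, not Mercurial's actual changelog code.
CHANGELOG_ENCODING = "utf-8"  # name from the patch; the value is an assumption


def decode_entry(text: bytes):
    """Split a simplified changelog entry, decoding only user and description.

    Assumed layout: manifest, user, date line (which may carry extra data),
    one file name per line, a blank line, then the description.
    """
    if not text:
        return (b"", "", b"", [], "")
    # Everything before the first blank line is the header.
    header, _, desc = text.partition(b"\n\n")
    lines = header.split(b"\n")
    manifest, user, date = lines[0], lines[1], lines[2]
    files = lines[3:]  # file names stay as raw bytes
    return (
        manifest,
        user.decode(CHANGELOG_ENCODING),
        date,  # left as bytes so callers decide how to interpret extra data
        files,
        desc.decode(CHANGELOG_ENCODING),
    )
```

With this shape, a file name that is not valid UTF-8 passes through untouched,
while a badly encoded user or description still raises UnicodeDecodeError at
read time.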