A proposal to solve the encoding problem on Windows.

Pierre Asselin pa at panix.com
Sat Oct 22 01:15:36 UTC 2011


Roger Gammans <rgammans at computer-surgery.co.uk> wrote:

> You can guess or make a stab at the environment encoding. But you can't guess the
> repo encoding, as this is the encoding of the environment where the commit
> occurred.

I don't see why the repo needs an encoding (where by "repo"
I mean the stuff under .hg).

The repository is private Mercurial data.  It is Mercurial, not
the OS, that decides what bytes get stored in the manifest.  If
Mercurial decides that Windows file names are serialized as UTF-8,
then that's that.
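
To illustrate the point, here is a minimal Python sketch. The function
name manifest_bytes is hypothetical and is not part of Mercurial; the
only assumption is the policy described above, namely that names are
always serialized as UTF-8 no matter which code page the OS uses.

```python
def manifest_bytes(filename: str) -> bytes:
    """Serialize a file name for storage, always as UTF-8 (assumed policy)."""
    return filename.encode("utf-8")


name = "caf\u00e9.txt"

# The same name yields different bytes depending on the local code page...
print(name.encode("cp1252"))  # b'caf\xe9.txt'
print(name.encode("cp437"))   # b'caf\x82.txt'

# ...but only one canonical byte sequence once the storage format is fixed.
print(manifest_bytes(name))   # b'caf\xc3\xa9.txt'
```

With a fixed storage encoding, the code page of the committing machine
never has to be recorded.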

The filenames under .hg/store are already mangled to ASCII
and hopefully the OS won't sabotage that.
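
As a rough idea of what such mangling can look like, here is a
simplified Python sketch. The function mangle_to_ascii and its escaping
rules are illustrative assumptions only; they are not claimed to match
the rules Mercurial actually applies under .hg/store.

```python
# Characters that Windows does not allow in file names.
WINDOWS_RESERVED = '\\:*?"<>|'


def mangle_to_ascii(name_utf8: bytes) -> str:
    """Map arbitrary name bytes to a case-insensitive-safe ASCII string.

    Hypothetical scheme: uppercase letters and '_' get a '_' prefix,
    control, non-ASCII, and reserved bytes become '~xx' hex escapes.
    """
    out = []
    for b in name_utf8:
        ch = chr(b)
        if ch == "_" or "A" <= ch <= "Z":
            out.append("_" + ch.lower())
        elif b < 32 or b >= 126 or ch in WINDOWS_RESERVED:
            out.append("~%02x" % b)
        else:
            out.append(ch)
    return "".join(out)


print(mangle_to_ascii("Caf\u00e9.txt".encode("utf-8")))  # _caf~c3~a9.txt
```

The result contains only lowercase ASCII letters, digits, and
punctuation, so it should survive on case-insensitive file systems and
under any code page.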


> So my suggestion was to add a repo encoding property, so that this
> was known on repos which set it. This then means that meaningful
> transcodes between the repo and the local encoding can be done.

You only need a working copy encoding for that.  If somebody
wants to clone a repo under codepage-whatever, only the working
copy is affected (with an error if one of the Unicode characters
won't map).
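
A minimal Python sketch of that transcoding step follows. The function
to_working_copy and the wc_encoding parameter are hypothetical names,
and the sketch assumes names are stored as UTF-8 in the repository.

```python
def to_working_copy(name_utf8: bytes, wc_encoding: str) -> bytes:
    """Transcode a stored UTF-8 name into the working copy's encoding.

    Raises ValueError if a character has no mapping in wc_encoding.
    """
    name = name_utf8.decode("utf-8")
    try:
        return name.encode(wc_encoding)
    except UnicodeEncodeError as exc:
        raise ValueError(
            f"{name!r} cannot be represented in {wc_encoding}"
        ) from exc


stored = "caf\u00e9.txt".encode("utf-8")
print(to_working_copy(stored, "cp1252"))  # b'caf\xe9.txt'

# A name outside the code page fails loudly instead of being corrupted.
try:
    to_working_copy("\u65e5\u672c.txt".encode("utf-8"), "cp1252")
except ValueError as err:
    print("error:", err)
```

Only the checkout depends on the local code page; the stored bytes are
never touched.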

-- 
pa at panix dot com



