Umlauts in filenames on Windows

Martin Geisler mg at daimi.au.dk
Wed Jan 28 13:01:43 UTC 2009


Stefan Rusek <stefan at rusek.org> writes:

> It seems to me that this discussion has taken a turn that does not
> make sense. Those in favor of doing nothing are presenting an argument
> that says that Unix filenames are not characters but bytes, so no
> action is needed, and someone can write an extension to support
> Unicode on Windows.

Just one more thing... :-) You might have seen it already, but if not,
then please take a look at my little demo program:

  http://bitbucket.org/mg/unitar/

This program works sort of like tar: it gets filenames from the command
line and stores the filenames with their content in a container. It can
unpack a container again. So it has to deal with all the same problems
that Mercurial does:

* interpreting the byte strings passed on the command line

* listing directories and reading files

* writing files to disk
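
The three tasks above can be sketched in a few lines of Python. This is
only an illustration and not the code from unitar: the container layout
(a length-prefixed name followed by the content) and the function names
pack and unpack are made up for the example. It assumes Python 3, where
os.fsencode and os.fsdecode convert between str and the byte form of a
filename.

```python
import os
import struct


def pack(paths, container):
    """Store each file's name (as bytes) and its content in `container`."""
    with open(container, "wb") as out:
        for path in paths:
            # Turn the name we were given into bytes using the
            # filesystem encoding of *this* system.
            name = os.fsencode(path)
            with open(path, "rb") as f:
                data = f.read()
            # Hypothetical record layout: two big-endian lengths,
            # then the name bytes, then the content bytes.
            out.write(struct.pack(">II", len(name), len(data)))
            out.write(name)
            out.write(data)


def unpack(container, dest):
    """Recreate the stored files under `dest`; return the paths written."""
    written = []
    with open(container, "rb") as src:
        while True:
            header = src.read(8)
            if not header:
                break
            name_len, data_len = struct.unpack(">II", header)
            name = src.read(name_len)
            data = src.read(data_len)
            # The stored bytes are decoded with the filesystem encoding
            # of the unpacking system. If that differs from the packing
            # system, the name may come out different or may not decode.
            # Only the last path component is kept, so files land in dest.
            target = os.path.join(dest, os.path.basename(os.fsdecode(name)))
            with open(target, "wb") as f:
                f.write(data)
            written.append(target)
    return written
```

Packing and unpacking on the same system round-trips fine. The trouble
starts when the container moves to a system with another encoding,
because the same bytes are then interpreted differently.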

The challenge is to make the program work flawlessly across systems and
encodings. Fixing it to catch the exceptions and implementing
workarounds for them should be a good way to practice what the extension
will have to do. There is an example of a failure here:

  http://thread.gmane.org/gmane.comp.version-control.mercurial.general/7852/focus=9055

-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.