[PATCH 0 of 1 v4] win32lfn: allow manipulating files with long names on Windows
Matt Mackall
mpm at selenic.com
Tue Jan 25 18:39:19 UTC 2011
On Mon, 2011-01-24 at 23:55 -0500, Aaron Cohen wrote:
> I found a post you made a while ago saying that the fundamentals are
> there to handle the UTF normalization problems in a similar way to the
> way Windows' case insensitivity is being done. Do you happen to have
> time to describe how that would look?
The essentials are that there is some function F that maps names from
their original form to a form in which they can be compared for equality.
In other words, Foo and foo are the same because F(Foo) and F(foo) are
the same.
In our dirstate code, we build a table called the _foldmap, which is
basically:
    foldmap = {}
    for name in files:
        foldmap[F(name)] = name
Once we have that mapping, we can stop being confused.
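To make the lookup side concrete, here is a minimal sketch in Python 3 of how such a table might be built and consulted. The helper names build_foldmap and canonical are made up for illustration and are not Mercurial's actual dirstate API; F is just the lowercase folding described in this message.

```python
def F(name):
    # Stand-in folding function: plain lowercasing, fine for ASCII names only.
    return name.lower()

def build_foldmap(files):
    # Map each folded name back to the spelling that is actually tracked.
    foldmap = {}
    for name in files:
        foldmap[F(name)] = name
    return foldmap

def canonical(foldmap, name):
    # Return the tracked spelling if some tracked file folds to the same key,
    # otherwise return the name unchanged (hypothetical helper).
    return foldmap.get(F(name), name)

files = ["README", "src/Foo.py"]
foldmap = build_foldmap(files)
print(canonical(foldmap, "readme"))
print(canonical(foldmap, "SRC/FOO.PY"))
print(canonical(foldmap, "new.txt"))
```

Any name the user or the filesystem hands back can be passed through canonical() before it is compared against the tracked names, so two spellings of the same file stop looking like two files.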
So it all comes down to having the right F. Right now, our F is simply
string.lower(), which works nicely for ASCII names. But Windows'
internal F is actually much more complex than that (and not particularly
well-documented!). It does case-folding on Unicode and (at least in some
situations) tells you that A = Ä.
The same applies for OS X, by the way. We just need to supply an F that
matches the non-standard under-documented Unicode-mangling that it's
doing, and things will be good.
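As a rough illustration of what a Unicode-aware F could look like, here is a sketch in Python 3 using the standard unicodedata module. This is an assumption-laden approximation: it is not the folding that Windows or OS X actually performs, and the name fold_unicode is invented for this example. It follows the generic Unicode recipe of decomposing, case-folding, and decomposing again.

```python
import unicodedata

def fold_unicode(name):
    # Approximate F: canonical decomposition, Unicode case folding, then
    # decomposition again. NOT the real Windows or OS X algorithm.
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", decomposed.casefold())

precomposed = "\u00c4"   # A-umlaut as a single code point
decomposed = "a\u0308"   # lowercase a followed by a combining diaeresis
print(fold_unicode(precomposed) == fold_unicode(decomposed))
print(fold_unicode("Foo") == fold_unicode("foo"))
```

Note that this sketch still treats plain A and Ä as different names, so it would not reproduce the cases where the platform considers them equal; matching that behavior would require knowing the platform's own rules.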
--
Mathematics is the supreme nostalgia of our time.