[PATCH STABLE V2] i18n: fix case folding problem with problematic encodings
Matt Mackall
mpm at selenic.com
Tue Nov 29 21:38:30 UTC 2011
On Wed, 2011-11-30 at 05:24 +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1322598040 -32400
> # Branch stable
> # Node ID 5bf954f0303aefbcbfc2eefbefc5d7e9f95b98a7
> # Parent e387e760b207383c961ed8accd35583791a33bb0
> i18n: fix case folding problem with problematic encodings
>
> changeset 28e98a8b173d for case folding problem with problematic
> encoding was not enough.
>
> this patch covers up a fault of fix in it.
Eep, way too much in one patch. Each of these bullet points ought to be
its own patch.
> - switch internal format from str to unicode for "util.fspath()"
Broken broken broken on Linux. You can have _any bytes except null and /
in a valid Unix filename_, which means they can't be assumed to be
decodable in any encoding, let alone the current user's personal
encoding. Sensible users will use UTF-8 and UTF-8 only and only exchange
files with other people using UTF-8, but there's no guarantee that users
are sensible.
(NTFS has a related issue: filenames are arbitrary sequences of 16-bit
units, and needn't be valid UTF-16.)
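
To make the decoding problem concrete, here is a minimal Python 3 sketch
(not Mercurial code; is_decodable and the sample filename are made up for
illustration). It shows a byte string that is a perfectly legal Unix
filename but cannot be decoded as UTF-8, and that forcing it through
unicode with a replacement policy loses the original bytes:

```python
def is_decodable(name: bytes, encoding: str) -> bool:
    """Return True if the raw filename bytes decode cleanly in `encoding`."""
    try:
        name.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical filename: "café.txt" as written by a Latin-1 user.
# 0xE9 followed by "." is not a valid UTF-8 sequence.
raw = b"caf\xe9.txt"

decodes_as_latin1 = is_decodable(raw, "latin-1")
decodes_as_utf8 = is_decodable(raw, "utf-8")

# Decoding with "replace" does not fail, but the round trip no longer
# yields the bytes that are actually on disk.
lossy_roundtrip = raw.decode("utf-8", "replace").encode("utf-8")
roundtrip_preserved = lossy_roundtrip == raw
```

Any code that keeps paths as unicode internally has to decide what to do
with names like this one; silently replacing or dropping bytes means the
path it hands back may not exist.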
> - switch from "str.lower()" to "encoding.lower()"
Again, lower() is known to be wrong for NTFS, which compares names by
uppercasing them. We need to use upper().
https://blogs.msdn.com/b/michkap/archive/2005/01/16/353873.aspx
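
As a rough illustration of why the direction of folding matters, the
Python 3 sketch below compares two names both ways. It uses Python's own
Unicode case tables, which are an assumption here and are not the table
NTFS uses, so it only demonstrates that lower() and upper() can disagree,
not what NTFS does for any particular character. The helper names are
hypothetical:

```python
def same_by_lower(a: str, b: str) -> bool:
    """Compare two names after folding both with str.lower()."""
    return a.lower() == b.lower()


def same_by_upper(a: str, b: str) -> bool:
    """Compare two names after folding both with str.upper()."""
    return a.upper() == b.upper()


# U+0131 LATIN SMALL LETTER DOTLESS I uppercases to "I", but "I"
# lowercases to the ordinary dotted "i", so the two foldings disagree.
dotless_i = "\u0131"

lower_says_equal = same_by_lower(dotless_i, "I")
upper_says_equal = same_by_upper(dotless_i, "I")
```

With these tables, upper() treats the two names as the same file and
lower() treats them as different, so picking the wrong direction gives
collision answers that differ from the filesystem's.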
--
Mathematics is the supreme nostalgia of our time.