Unicode support request.

罗勇刚(Yonggang Luo) luoyonggang at gmail.com
Wed Oct 19 14:46:00 UTC 2011


2011/10/19 Lester Caine <lester at lsces.co.uk>

> 罗勇刚(Yonggang Luo) wrote:
>
>> That's exactly right, and it is the situation Windows currently faces.
>> Windows has plenty of locale encodings (cp936, cp1251, cp1252 and so on),
>> but at least the filename is encoded with UTF8 at the filesystem level,
>> not stored as opaque bytes. Mercurial now suffers from the same wrong
>> design.
>> Indeed, Windows' different encodings are reasonable, because they appeared
>> BEFORE Unicode did. But why does Mercurial, as such new software, just
>> ignore the existence of UTF8? It's not so hard to use UTF8 as the internal
>> encoding. The current opaque encoding scheme just messes everything up and
>> makes Mercurial IMPOSSIBLE to use as cross-platform, encoding-complete
>> software.
>> I want to say that the lack of UTF8 support is a really big LOSS for
>> Mercurial, and it can be avoided.
>>
>
> Yonggang
> First correction to your statements ... Windows CHOSE to use UTF16 encoding
> for wide string file names, rather than having to handle the variable byte
> length of UTF8. Unicode existed long before that choice. As a result of
> doing that, their 'encoded' character strings contain blank (zero) bytes in
> a lot of cases. Add to this the arbitrary changes to using upper case
> characters in file names and you get even more combinations of differences.
> One does not always get the same number of bytes from Windows for what
> should be the same file name if it has been 'converted' from UTF16, and this
> is the problem that causes trouble.
>
> We have a similar problem with PHP and Unicode support ... You may have
> heard of PHP6, which was dropped mainly because of the problems of changing
> what are currently 'ascii' encoded file names to allow transparent use of
> UTF8. Some years of time were invested without finding a clean solution, and
> people gave up trying to fix all the bugs generated.
>
PHP and Python are languages, but Mercurial is an application, so the two
are not comparable. PHP and Python 2.x don't support Unicode natively, yet I
have never heard that we cannot use PHP or Python 2.x to build
multi-language applications.

> Python3 has implemented the same sort of changes that PHP will need at the
> expense of backwards compatibility, but PHP chose to maintain
> compatibility, and simply manage Unicode data as data rather than addressing
> all the problems that changing file names and all other text to allow UTF8
> introduces.
>
> Database engines have similar problems with using UTF8 in the schema and
> many still limit some areas to 'ascii' simply to allow the systems to work.
>
> It is not that people are ignoring UTF8, but rather that the solution is
> NOT as simple as you seem to think it is. Especially when one is making a
> system that works transparently across all current operating systems :(
>
Indeed, the answer is simple. The problem we currently face is on Windows,
where the low-level filename encoding of the system is UTF8, not GBK or
something else. So we can make the following decision:
on Windows, Mercurial encodes filenames as UTF8; on other systems, it leaves
them as is (opaque).
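
A minimal sketch of that rule in Python 3, just to make the proposal
concrete. The function names to_stored_name and from_stored_name are
hypothetical and are not part of Mercurial; the sketch also assumes that
filenames arrive as Unicode strings on Windows and as raw bytes everywhere
else.

```python
import sys


def to_stored_name(name, platform=sys.platform):
    """Return the bytes to store for a filename (hypothetical helper)."""
    if platform == "win32":
        # Assumption: on Windows the name is already a Unicode string,
        # so it can always be stored as UTF-8.
        if isinstance(name, bytes):
            raise TypeError("expected a Unicode filename on Windows")
        return name.encode("utf-8")
    # On other systems the name is kept as opaque bytes, untouched.
    if isinstance(name, str):
        raise TypeError("expected raw bytes on non-Windows systems")
    return name


def from_stored_name(stored, platform=sys.platform):
    """Turn stored bytes back into a native filename (hypothetical helper)."""
    if platform == "win32":
        return stored.decode("utf-8")
    return stored
```

With this rule a name committed on Windows round-trips unchanged on Windows,
and a name committed elsewhere keeps whatever bytes it already had. What it
does not settle is how a non-UTF8 name committed on another system should be
shown on Windows.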

>
> People who understand the problem have not been able to solve it ... yet :(
>
> --
> Lester Caine - G8HFL
> -----------------------------
> Contact - http://lsces.co.uk/wiki/?page=contact
> L.S.Caine Electronic Services - http://lsces.co.uk
> EnquirySolve - http://enquirysolve.com/
> Model Engineers Digital Workshop - http://medw.co.uk//
> Firebird - http://www.firebirdsql.org/index.php
>
> _______________________________________________
> Mercurial mailing list
> Mercurial at selenic.com
> http://selenic.com/mailman/listinfo/mercurial
>



-- 
Yours sincerely,
罗勇刚 (Yonggang Luo)