[PATCH] Also treat "East Asian Ambiguous" characters as full-width
Shun-ichi Goto
shunichi.goto at gmail.com
Fri Oct 30 02:46:52 UTC 2009
# HG changeset patch
# User Shun-ichi GOTO <shunichi.goto at gmail.com>
# Date 1256870784 -32400
# Node ID ba57125215c3a2adc760e98668e523a13f647dac
# Parent 3c30ae2d6f1bc0b74c87ea46b695e59485f79414
Also treat "East Asian Ambiguous" characters as full-width.
"East Asian Ambiguous" characters like 'GREEK SMALL LETTER BETA'
(U+03B2) or 'MULTIPLICATION SIGN' (U+00D7) should be counted as
full-width: their width depends on the context, and they are displayed
as wide in East Asian environments.
See also:
"Unicode Standard Annex #11 - East Asian Width"
http://www.unicode.org/reports/tr11/tr11-14.html#Ambiguous
diff -r 3c30ae2d6f1b -r ba57125215c3 mercurial/encoding.py
--- a/mercurial/encoding.py Wed Oct 28 13:36:23 2009 +0900
+++ b/mercurial/encoding.py Fri Oct 30 11:46:24 2009 +0900
@@ -72,6 +72,6 @@
     d = s.decode(encoding, 'replace')
     if hasattr(unicodedata, 'east_asian_width'):
         w = unicodedata.east_asian_width
-        return sum([w(c) in 'WF' and 2 or 1 for c in d])
+        return sum([w(c) in 'WFA' and 2 or 1 for c in d])
     return len(d)
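
For illustration only, the effect of the change can be sketched in
standalone Python 3. This is not the code in mercurial/encoding.py: the
function name colwidth_sketch and its wide_classes parameter are
hypothetical, and the input is assumed to be an already-decoded string.

```python
import unicodedata


def colwidth_sketch(text, wide_classes=("W", "F", "A")):
    """Estimate how many terminal columns `text` occupies.

    Hypothetical helper: characters whose East Asian Width class is in
    `wide_classes` count as two columns, everything else as one.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in wide_classes else 1
        for ch in text
    )


if __name__ == "__main__":
    beta = "\u03b2"  # GREEK SMALL LETTER BETA, East Asian Width class "A"
    # Before the patch: only Wide and Fullwidth count as two columns.
    print(colwidth_sketch(beta, wide_classes=("W", "F")))
    # After the patch: Ambiguous characters count as two columns as well.
    print(colwidth_sketch(beta))
```

Passing wide_classes=("W", "F") mimics the old behaviour, so the two
calls show how the same character is counted before and after the patch.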