[PATCH STABLE] i18n: use utf-8 encoding to show about converted revisions (issue3393)
FUJIWARA Katsunori
foozy at lares.dti.ne.jp
Tue Apr 24 12:55:00 UTC 2012
# HG changeset patch
# User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
# Date 1335271935 -32400
# Branch stable
# Node ID 9bbc6c974eea21102e08ccfb9324ae79504ee69a
# Parent 09dd707b522a766b7d5e5fd221c4e68ac735f4d9
i18n: use utf-8 encoding to show about converted revisions (issue3393)

The status output of "hg convert" contains byte sequences in two
different encodings when all of the following hold:

- a non-UTF-8 encoding is chosen as the encoding for Mercurial,
- the locale setting selects a language whose localized messages
  use non-ASCII characters, and
- at least one converted revision has a description containing
  non-ASCII characters.

This occurs because "hg convert" forces its messages to be encoded in
UTF-8, while the descriptions of converted revisions are re-encoded
into "orig_encoding" by the "recode()" function.

This patch stops re-encoding into "orig_encoding" so that all output
uses a single encoding (UTF-8). The one exception is when
"orig_encoding" is "ascii": in that case the patch keeps encoding to
"ascii" for backward compatibility.

The original implementation of "recode()" was introduced by changeset
4c16020d1172 (convert: print commit log message with local encoding
correctly). At that time, many of the messages shown by "hg convert"
were not yet internationalized, so the encoding collision rarely
occurred.

The check for unicode objects in "recode()" was introduced by
changeset e2cbdd931341.
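
The collision described in this message can be reproduced outside
Mercurial. The following is a minimal Python 3 sketch, not code from
the patch: the helper names, the sample strings, and the choice of
"cp932" as a stand-in for a non-UTF-8 "orig_encoding" are all
assumptions made for illustration.

```python
# Illustration only: helper names and sample data are hypothetical.
LOCAL_ENCODING = "cp932"  # assumed stand-in for a non-UTF-8 orig_encoding


def mixed_status_line(message, description):
    """Mimic the pre-patch output: a localized message encoded in UTF-8
    followed by a revision description encoded in the local encoding."""
    return message.encode("utf-8") + b" " + description.encode(LOCAL_ENCODING)


def unified_status_line(message, description):
    """Mimic the post-patch output: both parts encoded in UTF-8."""
    return message.encode("utf-8") + b" " + description.encode("utf-8")


message = "変換中"      # sample localized message
description = "日本語"  # sample revision description
expected = message + " " + description

mixed = mixed_status_line(message, description)
unified = unified_status_line(message, description)

# Neither interpretation of the mixed line recovers the intended text.
print(mixed.decode("utf-8", "replace") == expected)         # False
print(mixed.decode(LOCAL_ENCODING, "replace") == expected)  # False
# A line in a single encoding decodes cleanly.
print(unified.decode("utf-8") == expected)                  # True
```

Whichever encoding the terminal assumes, part of the mixed line is
shown incorrectly, which is why the patch unifies the output encoding.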
diff -r 09dd707b522a -r 9bbc6c974eea hgext/convert/convcmd.py
--- a/hgext/convert/convcmd.py Wed Apr 18 11:46:23 2012 -0500
+++ b/hgext/convert/convcmd.py Tue Apr 24 21:52:15 2012 +0900
@@ -25,9 +25,12 @@
 
 def recode(s):
     if isinstance(s, unicode):
-        return s.encode(orig_encoding, 'replace')
+        return s.encode('utf-8', 'replace')
+    elif orig_encoding == 'ascii':
+        # avoid to show non-ascii characters
+        return s.decode('utf-8').encode(orig_encoding, 'replace')
     else:
-        return s.decode('utf-8').encode(orig_encoding, 'replace')
+        return s
 
 source_converters = [
     ('cvs', convert_cvs, 'branchsort'),
@@ -375,9 +378,7 @@
                 desc = self.commitcache[c].desc
                 if "\n" in desc:
                     desc = desc.splitlines()[0]
-                # convert log message to local encoding without using
-                # tolocal() because the encoding.encoding convert()
-                # uses is 'utf-8'
+                # use 'recode()' to ensure writing out byte sequence in UTF-8
                 self.ui.status("%d %s\n" % (num, recode(desc)))
                 self.ui.note(_("source: %s\n") % recode(c))
                 self.ui.progress(_('converting'), i, unit=_('revisions'),
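
For readers more familiar with Python 3, the patched logic of
"recode()" can be approximated as below. This is a sketch under stated
assumptions, not the code in hgext/convert/convcmd.py: Python 2's
"unicode" and "str" are mapped to "str" and "bytes", and
"orig_encoding" is passed as a parameter here, whereas the patch reads
it as a name defined outside the function.

```python
def recode(s, orig_encoding="ascii"):
    """Python 3 approximation of the patched recode(); returns bytes.

    The orig_encoding parameter is an assumption of this sketch.
    """
    if isinstance(s, str):
        # Text is always written out as UTF-8.
        return s.encode("utf-8", "replace")
    elif orig_encoding == "ascii":
        # Backward compatibility: never emit non-ASCII bytes.
        return s.decode("utf-8").encode(orig_encoding, "replace")
    else:
        # Input is assumed to be UTF-8 bytes already: pass it through.
        return s


utf8_desc = "日本語".encode("utf-8")  # sample description as UTF-8 bytes

print(recode(utf8_desc, "cp932") == utf8_desc)  # True: no re-encoding
print(recode(utf8_desc, "ascii"))               # b'???'
print(recode("日本語", "cp932") == utf8_desc)    # True: text becomes UTF-8
```

Because the text branch is tested first, text input is written as
UTF-8 even when "orig_encoding" is "ascii"; only byte strings take the
"ascii" fallback.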