[PATCH stable] templatefilters: make json filter handle multibyte characters correctly
Patrick Mézard
pmezard at gmail.com
Sun Aug 8 16:21:18 UTC 2010
On 07/08/10 09:36, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya at tcha.org>
> # Date 1281166036 -32400
> # Branch stable
> # Node ID 0e36aafcca8fedbf60e05b985d5f6426045c8e28
> # Parent 36e25f25dec11e68fc3240326999c02b3879ab10
> templatefilters: make json filter handle multibyte characters correctly
>
> This aims to fix a JavaScript error in hgweb's graph view under the
> Japanese 'cp932' encoding.
>
> 'cp932' contains multibyte characters ending with '\x5c' (backslash),
> e.g. '\x94\x5c' for the Japanese Kanji 'Noh'.
> Because the json filter escapes '\' to '\\', a multibyte string ending
> with '\x5c' is translated to "xxx\", which results in a JavaScript parse
> error in the web browser.
>
> This patch changes json() to pass unicode to jsonescape().
>
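To illustrate the failure mode described in the commit message, here is a minimal Python 3 sketch (the patch itself targets Python 2). escape_bytes and escape_text are hypothetical helpers, not Mercurial functions, and they only escape backslash and double quote, which is a simplification of what jsonescape() does.

```python
import json

def escape_bytes(raw):
    # Byte-level escaping (a simplified stand-in for the pre-patch behaviour):
    # every 0x5c byte is doubled, even when it is the trail byte of a
    # cp932 multibyte character.
    return raw.replace(b"\\", b"\\\\").replace(b'"', b'\\"')

def escape_text(raw, encoding="cp932"):
    # Character-level escaping (the idea behind the patch, simplified):
    # decode first, so a 0x5c trail byte is never mistaken for a backslash.
    text = raw.decode(encoding)
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return escaped.encode(encoding)

# U+80FD is the kanji 'Noh'; in cp932 it is b'\x94\x5c', as in the commit message.
noh = "\u80fd".encode("cp932")

# Byte-level: the extra 0x5c decodes as a real backslash that escapes the
# closing quote, so the JSON string literal never terminates.
broken = (b'"' + escape_bytes(noh) + b'"').decode("cp932")

# Character-level: the kanji passes through untouched.
fixed = (b'"' + escape_text(noh) + b'"').decode("cp932")

try:
    json.loads(broken)
    broken_parses = True
except ValueError:
    broken_parses = False
```

With these definitions, broken fails to parse as JSON while fixed round-trips to the original character.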
> diff --git a/mercurial/templatefilters.py b/mercurial/templatefilters.py
> --- a/mercurial/templatefilters.py
> +++ b/mercurial/templatefilters.py
> @@ -156,9 +156,13 @@ def json(obj):
>      elif isinstance(obj, int) or isinstance(obj, float):
>          return str(obj)
>      elif isinstance(obj, str):
> -        return '"%s"' % jsonescape(obj)
> +        try:
> +            return '"%s"' % jsonescape(unicode(
> +                obj, encoding.encoding)).encode(encoding.encoding)
> +        except (UnicodeEncodeError, UnicodeDecodeError):
> +            return '"%s"' % jsonescape(obj)

So if we fail to decode or encode the string, we may still generate an invalid JSON string, right? Shouldn't we use "unicode(obj, encoding.encoding, 'replace')" or something similar instead?

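For illustration, a lenient variant could look like the following Python 3 sketch. json_string is a hypothetical helper, not the patch's code, and it only escapes backslash and double quote. In Python 2 the corresponding calls would be unicode(obj, encoding.encoding, 'replace') and .encode(encoding.encoding, 'replace').

```python
import json

def json_string(raw, encoding):
    # Hypothetical helper: decode leniently so undecodable bytes become
    # U+FFFD instead of raising, which removes the need for a fallback
    # to byte-level escaping.
    text = raw.decode(encoding, "replace")
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    # Encode leniently as well: characters the target encoding cannot
    # represent are substituted rather than raising UnicodeEncodeError.
    return ('"%s"' % escaped).encode(encoding, "replace")

# Invalid UTF-8 input: the stray 0xff byte is replaced, and the result parses.
utf8_result = json_string(b"a\xffb", "utf-8").decode("utf-8")

# Valid cp932 input: the kanji U+80FD (b'\x94\x5c') survives unchanged.
cp932_result = json_string(b"\x94\x5c", "cp932")

# Truncated cp932 input: a lone lead byte no longer breaks the output.
truncated_result = json_string(b"ab\x94", "cp932").decode("cp932")
```

The trade-off is that damaged input is silently altered, but the output stays parseable JSON.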
>      elif isinstance(obj, unicode):
> -        return json(obj.encode('utf-8'))
> +        return '"%s"' % jsonescape(obj).encode('utf-8')
Ok.
--
Patrick Mézard