xml style doesn't generate valid xml
Stanimir Stamenkov
s7an10 at netscape.net
Tue Nov 23 21:50:14 UTC 2010
Tue, 23 Nov 2010 21:19:34 +0000, /Haszlakiewicz, Eric/:
> From: Matt Mackall [mailto:mpm at selenic.com]
>
>>> However, for the issue with xmlescape turning things into spaces, I
>>> think that's because there's an explicit line of code in xmlescape
>>> that does that! In templatefilters.py, the last line of xmlescape is:
>>> return re.sub('[\x00-\x08\x0B\x0C\x0E-\x1F]', ' ', text)
>>
>> Ugh, who ordered that.
> (...)
> def fixupcontrols(matchobj):
> return "&#" + "%d" % ord(matchobj.group(0)) + ";"
> return re.sub('[\x00-\x08\x0B\x0C\x0E-\x1F]', fixupcontrols, new_s)
I'm not sure I understand this Python code enough but note it is a
syntax (fatal, well-formedness) error to include control characters
other than tab, carriage return and line feed, even as numeric
character references <http://www.w3.org/TR/xml/#dt-charref>:
> Characters referred to using character references MUST match the
> production for Char. <http://www.w3.org/TR/xml/#charsets>
The XML 1.1 specification <http://www.w3.org/TR/xml11/#charsets> is
more relaxed on this but still doesn't allow the 0 (zero) character
code.
--
Stanimir
More information about the Mercurial
mailing list