Unicode file content conversion

Mads Kiilerich mads at kiilerich.com
Sun Dec 7 18:50:56 UTC 2014


On 12/06/2014 02:16 AM, Eugene Lepekhin wrote:
>
> Hi,
>
> I hope this is the right list for my question. If not, please suggest a 
> better one.
>
> I have an SVN repo that I need to convert to HG. The problem is that a 
> few .C files there are encoded in UTF-16 with an FF FE BOM. I prefer 
> them in UTF-8, as it is better handled by HG. Is there any way to 
> convert them during, before, or after converting the repo to HG? I want 
> to see correct diffs in the converted history.
>

You can use the convert extension, with customization based on the 
example on 
http://mercurial.selenic.com/wiki/ConvertExtension#Customization .

You probably need something like

if name.lower().endswith('.c'):  # matches both .c and .C
    # the 'utf16' codec reads the FF FE BOM to pick the byte order
    data = data.decode('utf16').encode('utf8')
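
To try the conversion on its own before wiring it into convert, here is a minimal standalone sketch in Python 3. The helper name to_utf8 and its (name, data) signature are made up for illustration and are not part of the convert extension. It also assumes that only content starting with a UTF-16 BOM should be touched, so revisions that are already 8-bit pass through unchanged.

```python
import codecs

# Hypothetical helper, not part of Mercurial: 'name' is the file path,
# 'data' is the raw file content as bytes.
def to_utf8(name, data):
    """Re-encode UTF-16 C sources as UTF-8; return anything else unchanged."""
    if not name.lower().endswith('.c'):
        return data
    # Assumption: only convert content that starts with a UTF-16 BOM
    # (FF FE or FE FF), so files already stored as 8-bit text are left alone.
    if not data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data
    # The 'utf-16' codec consumes the BOM and uses it to pick the byte order.
    return data.decode('utf-16').encode('utf-8')


if __name__ == '__main__':
    sample = codecs.BOM_UTF16_LE + 'int main(void);\n'.encode('utf-16-le')
    print(to_utf8('hello.C', sample))
```

The BOM check is a design choice rather than a requirement: without it, a revision that is already plain ASCII or UTF-8 would fail to decode or be mangled.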

/Mads