Unicode file content conversion

Eugene Lepekhin eugene.lepekhin at gmail.com
Mon Dec 8 04:48:28 UTC 2014


Thanks. I am trying to implement this, but I can't figure out how to use the
custom extension when converting.
Here is what is in my mercurial.ini file:

[extensions]
convert =
customsource = C:\Projects\CustomSource.py


Here is my py file:

import hgext.convert.convcmd
import hgext.convert.hg
import codecs

class customsource(hgext.convert.hg.mercurial_source):
    def getfile(self, name, rev):
        # call the parent implementation to get the original file data
        data, flags = super(customsource, self).getfile(name, rev)
        # use case: modify file data
        if name.endswith('.txt'):
            if data.startswith(codecs.BOM_UTF16_LE):
                data = data.decode('utf-16le').encode('utf8')
            if data.startswith(codecs.BOM_UTF16_BE):
                data = data.decode('utf-16be').encode('utf8')
        return data, flags

hgext.convert.convcmd.source_converters.append(
    ('customsource', customsource, 'branchsort'))
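
A minimal sketch of the same BOM check as a standalone function, so it can be tried outside of hg convert. The helper name utf16_to_utf8 is hypothetical (it is not part of Mercurial). Unlike the getfile override above, this version strips the BOM before decoding; that is an assumption about the desired output, not something required by the convert extension.

```python
import codecs

def utf16_to_utf8(data):
    """Re-encode data as UTF-8 if it starts with a UTF-16 BOM, else return it unchanged."""
    for bom, encoding in ((codecs.BOM_UTF16_LE, 'utf-16-le'),
                          (codecs.BOM_UTF16_BE, 'utf-16-be')):
        if data.startswith(bom):
            # Drop the BOM first so no U+FEFF character ends up in the output.
            return data[len(bom):].decode(encoding).encode('utf-8')
    # No UTF-16 BOM found: leave the bytes untouched.
    return data
```

For example, utf16_to_utf8(codecs.BOM_UTF16_LE + u'hi'.encode('utf-16-le')) returns the two bytes "hi", while data without a BOM is returned as-is.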


Here is my command line:

C:\Projects\Svn2Hg>hg convert --source customsource svn://localhost HgWorkDit/custom1
hg convert: option --source not a unique prefix


I also tried:

C:\Projects\Svn2Hg>hg convert --source-type customsource --source-type svn://localhost HgWorkDit/custom1
assuming destination custom1-hg
initializing destination custom1-hg repository
abort: svn://localhost: invalid source repository type


Any thoughts?

Thanks for the help,

Eugene

On Sun, Dec 7, 2014 at 10:50 AM, Mads Kiilerich <mads at kiilerich.com> wrote:

>  On 12/06/2014 02:16 AM, Eugene Lepekhin wrote:
>
>  Hi,
>
> I hope this is the right list for my question. If not, please advise a
> better one.
>
> I have an SVN repo that I need to convert to HG. The problem is that a few
> .C files there are encoded in UTF-16 with an FF FE BOM. I prefer them in
> UTF-8, as it is better handled by HG. Is there any way to convert them
> during, before, or after converting the repo to HG? I want to see correct
> diffs in the converted history.
>
>
> You can use the convert extension, with customization based on the example
> on http://mercurial.selenic.com/wiki/ConvertExtension#Customization .
>
> You probably need something like
>
> if name.endswith('.c'):
>     data = data.decode('utf16').encode('utf8')
>
> /Mads
>