Unicode file content conversion
Eugene Lepekhin
eugene.lepekhin at gmail.com
Mon Dec 8 04:48:28 UTC 2014
Thanks, I am trying to implement this, but can't figure out how to use the
custom extension when converting.
Here is what is in my mercurial.ini file:
[extensions]
convert =
customsource = C:\Projects\CustomSource.py
Here is my py file:
import hgext.convert.convcmd
import hgext.convert.hg
import codecs

class customsource(hgext.convert.hg.mercurial_source):
    def getfile(self, name, rev):
        # call the parent implementation through this class, not "source"
        data, flags = super(customsource, self).getfile(name, rev)
        # use case: modify file data
        if name.endswith('.txt'):
            if data.startswith(codecs.BOM_UTF16_LE):
                data = data.decode('utf-16le').encode('utf8')
            elif data.startswith(codecs.BOM_UTF16_BE):
                data = data.decode('utf-16be').encode('utf8')
        return data, flags

# register the class as a source type named 'customsource'
hgext.convert.convcmd.source_converters.append(
    ('customsource', customsource, 'branchsort'))
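
The BOM check used in getfile can be tried on its own, outside Mercurial. This is only a minimal sketch: the helper name utf16_to_utf8 is hypothetical, and it assumes the BOM should be dropped from the UTF-8 output (decoding with 'utf-16le' or 'utf-16be' directly keeps the BOM as a leading U+FEFF character).

```python
import codecs

# Hypothetical helper, not part of Mercurial or the convert extension.
def utf16_to_utf8(data):
    """Re-encode bytes as UTF-8 if they start with a UTF-16 BOM.

    Data without a UTF-16 BOM is returned unchanged.
    """
    boms = (
        (codecs.BOM_UTF16_LE, 'utf-16-le'),
        (codecs.BOM_UTF16_BE, 'utf-16-be'),
    )
    for bom, encoding in boms:
        if data.startswith(bom):
            # Strip the BOM first so it does not end up in the UTF-8 output.
            return data[len(bom):].decode(encoding).encode('utf-8')
    return data
```

Inside getfile, such a helper would replace the two startswith branches with a single call on data.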
Here is my command line:
C:\Projects\Svn2Hg>hg convert --source customsource svn://localhost HgWorkDit/custom1
hg convert: option --source not a unique prefix
I also tried:
C:\Projects\Svn2Hg>hg convert --source-type customsource --source-type svn://localhost HgWorkDit/custom1
assuming destination custom1-hg
initializing destination custom1-hg repository
abort: svn://localhost: invalid source repository type
Any thoughts?
Thanks for the help,
Eugene
On Sun, Dec 7, 2014 at 10:50 AM, Mads Kiilerich <mads at kiilerich.com> wrote:
> On 12/06/2014 02:16 AM, Eugene Lepekhin wrote:
>
> Hi,
>
> I hope this is the right list for my question. If not, please advise a
> better one.
>
> I have an SVN repo that I need to convert to HG. The problem is that a few .C
> files there are encoded in UTF-16 with an FF FE BOM. I prefer them in UTF-8, as
> it is better handled by HG. Is there any way to convert them during, or
> before, or after converting the repo to HG? I want to see correct diffs in the
> converted history.
>
>
> You can use the convert extension, with customization based on the example
> on http://mercurial.selenic.com/wiki/ConvertExtension#Customization .
>
> You probably need something like
>
>     if name.endswith('.c'):
>         data = data.decode('utf16').encode('utf8')
>
> /Mads
>