hg qrefresh doesn't honor "--encoding" for "--logfile"
Marc Strapetz
marc.strapetz at syntevo.com
Tue Nov 20 19:19:26 UTC 2012
On 20.11.2012 20:03, Matt Mackall wrote:
> On Tue, 2012-11-20 at 19:03 +0100, Marc Strapetz wrote:
>> On 20.11.2012 17:12, Matt Mackall wrote:
>>> On Tue, 2012-11-20 at 15:50 +0100, Marc Strapetz wrote:
>>>> On 16.11.2012 20:56, Matt Mackall wrote:
>>>>> On Fri, 2012-11-16 at 19:47 +0100, Marc Strapetz wrote:
>>>>>> It seems that:
>>>>>>
>>>>>> hg qrefresh --encoding utf-8 --logfile <file>
>>>>>>
>>>>>> still tries to process the log-file with platform encoding. When using:
>>>>>>
>>>>>> hg commit --encoding utf-8 --logfile <file>
>>>>>>
>>>>>> with the same <file>, contents are properly interpreted as UTF-8.
>>>>>>
>>>>>> $ hg --version
>>>>>> Mercurial Distributed SCM (version 2.2.3)
>>>>>> (see http://mercurial.selenic.com for more information)
>>>>>
>>>>> Please file a bug so we don't lose track of this:
>>>>>
>>>>> http://mercurial.selenic.com/wiki/BugTracker
>>>>>
>>>>> I'm 99.99% sure something else is going on. Neither mq nor commit have
>>>>> any encoding logic at all, so it's impossible for them to behave
>>>>> differently here. All the encoding handling for commit messages happens
>>>>> in a single place, the low-level function that adds changelog messages
>>>>> to the store.
>>>>
>>>> OK. Sorry, this was my fault: actually the
>>>>
>>>> qrefresh --encoding utf-8 --logfile
>>>>
>>>> was working correctly, but later there was another
>>>>
>>>> qrefresh --currentdate --currentuser
>>>>
>>>> without changing the commit message, however also without --encoding,
>>>> which corrupted the commit message again. I've now added --encoding
>>>> utf-8 there as well as to qimport (which seems to be necessary when
>>>> rearranging patches) and all is working well now.
>>>
>>> You might want to set HGENCODING in your environment?
>>
>> Thanks for that hint. We are calling Mercurial from Java, so UTF-8 is
>> natural and convenient. Could there be any drawback when using
>> HGENCODING=utf-8 permanently?
>
> Should be fine.
>
>> For instance, we are using listfile: to
>> pass file arguments to Mercurial commands.
>
> listfile: contents are treated as raw bytes. This is generally the rule,
> see:
>
> http://mercurial.selenic.com/wiki/EncodingStrategy
>
> In particular, the filename encodings you pass on the command line need
> to agree with the encodings of those names used by the standard C APIs.
> If you pass hg some string of bytes <x> as a filename, fopen(<x>) had
> better work.
>
> So if you're on Windows, you need to pass filenames in the ANSI codepage
> encoding for now. On Unix, UTF-8 is fine... iff the filenames are
> actually in UTF-8 on the filesystem.
>
> Also, you might be interested in:
>
> http://mercurial.selenic.com/wiki/JavaHg
> http://mercurial.selenic.com/wiki/CommandServer
Thanks, Matt! I was looking into JavaHg a while ago and we had a couple
of technical issues and some patch pending. Our medium-term plans are to
switch to JavaHg as I think it's a good way to go.
-Marc
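
For anyone else calling hg from Java, the HGENCODING suggestion above could be wired up roughly as follows. This is only a sketch, not how any particular tool does it: the class name HgUtf8 and the helpers writeMessage and hg are made up for illustration, and it assumes an "hg" executable on the PATH. The point is that HGENCODING is set once in the child environment, so every invocation (qrefresh, qimport, commit) sees the same encoding without a per-command --encoding.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; none of these names come from Mercurial or JavaHg.
final class HgUtf8 {

    // Write the commit message to a temporary file as UTF-8 bytes,
    // suitable for passing to --logfile.
    static Path writeMessage(String message) {
        try {
            Path file = Files.createTempFile("hg-message", ".txt");
            Files.write(file, message.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Prepare (but do not start) an hg invocation with HGENCODING=utf-8
    // set in the child process environment.
    static ProcessBuilder hg(Path repository, String... args) {
        List<String> command = new ArrayList<>();
        command.add("hg"); // assumption: hg is on the PATH
        command.addAll(Arrays.asList(args));

        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(repository.toFile());
        builder.environment().put("HGENCODING", "utf-8");
        return builder;
    }

    public static void main(String[] args) {
        Path message = writeMessage("Gr\u00fc\u00dfe");
        ProcessBuilder refresh =
                hg(Paths.get("."), "qrefresh", "--logfile", message.toString());

        // Only print what would be run; call refresh.start() to execute it.
        System.out.println(refresh.command());
        System.out.println("HGENCODING=" + refresh.environment().get("HGENCODING"));
    }
}
```

Note that this only covers commit messages and similar metadata. File names passed on the command line or via listfile: are still raw bytes, as described above, so they have to be encoded to match what the filesystem APIs expect.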