cvs2hg emit .hgeol?

Tom Udale tom at ionoptix.com
Thu Aug 25 18:24:40 UTC 2011


Greg,



>> Unfortunately, I am probably not the person to do this.  I am in a Windows C++ shop so
>> not only do I not know a damn thing about Python, I don't have a Python tool chain set
>> up that would let me poke at it.  I would be starting literally from zero.
>
> Toolchain? You are thinking like a Windows C++ guy. The toolchain for
> Python is Python. And a text editor. I gather some people like IDEs
> (e.g. Eclipse, WingIDE), but I've never really seen the point. I
> prefer to just inhale the code base and start walking up the learning
> cliff. ;-)
>
> That said, customizing a cvs2hg run really would be jumping in the
> deep end. You have to figure out both the zen of cvs2svn and the tao
> of Mercurial. I'm still amazed I got the thing to work as well as it
> does.

I am glad your assessment of customizing cvs2hg matches mine: it is
jumping into the deep end.  I will take to heart, though, that working
with Python is easier than I am imagining.






>> Actually, there is some hoseage (if you will).  The CVS repo we had was
>> running on Linux.  Everything was checked out on Windows.  For whatever
>> reason (my understanding is that the CVS internal eol format, at least on
>> Unix, is Unix style) everything in the hg repo is Unix style eol. Thus they
>> currently get checked out on my Windows machine this way.
>>
>> This is not actually a problem for most files because the editors and
>> compilers can handle either style eol.  The most notable exceptions are
>> notepad (the default windows editor) which is only a problem in the sense of
>> "it would be nice to work", and the resource compiler, which is not
>> optional.  It completely barfs on RC files with the wrong ending style.
>
> Boy, does this ever sound familiar. I bet you also have some files in
> CVS with inconsistent line endings too -- we did. I dealt with these
> problems in a couple of ways when converting to hg:
>
>    * if it's known to be a text file (*.c, *.java, etc): convert the content
>      of *every revision* to Unix line endings -- so in our hg repo, it's \n
>      all the way
>
>    * Windows RC files (and Visual Studio .vcproj) are not text, despite
>      appearances: treat them as binary (i.e. they should have CR LF in
>      the repo, in a Unix checkout, and of course in a Windows checkout)
>
> So yeah, I tampered with history by hooking into the guts of cvs2hg.
> The power was intoxicating.


For the edification of those following this, I have found a _very_
straightforward way to inject .hgxx files (.hgeol, .hgignore and the
like) into a cvs2hg-generated repository.  This will certainly work if
your CVS repo is well behaved.

Basically the approach hinges on the --existing-hgrepos option to
cvs2hg.  You create an empty hg repository, commit your desired .hgxx
files, and then run cvs2hg onto that existing repository.  It is that
simple and has worked great on the smaller repos I have tried it on so
far.

If you don't know the exact entries to put into your .hgxx files, you 
can do it iteratively.  Start with empty .hgxx files in a subdirectory, 
say "dots".  Then make a script to do the conversion:

hg init repo
cp dots/.hg* repo
cd repo
hg add
# you can modify hgrc at this time as well if need be
hg commit -m "Added .hg* files."
cd ..
cvs2svn/cvs2hg --existing-hgrepos=repo cvsroot

Now you will have an hg repo with your .hg* files checked in at cset 0 
and your CVS repo piled on top.

You can mess around with your .hgeol and .hgignore files as needed to
make things look right at the tip.  Then copy the new, improved .hg*
files into dots, wipe the repo and re-run the script.  Now you will have
the "correct" (or more correct) .hg* files in from the beginning of
history.  You can rinse and repeat until all is "perfect".
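
For illustration, a first pass at the files in "dots" could be seeded
with something like the following.  This is only a sketch: every pattern
in it is an assumption about a typical Windows C++ tree, not something
taken from a real conversion, and the CRLF entries only match history if
the converted content really has those endings.

```shell
#!/bin/sh
# Sketch: seed the "dots" directory with starter .hgeol and .hgignore files.
# All patterns below are illustrative assumptions; adjust them to your tree.
mkdir -p dots

# .hgeol is read by Mercurial's eol extension.
cat > dots/.hgeol <<'EOF'
[repository]
native = LF

[patterns]
**.rc = CRLF
**.vcproj = CRLF
**.c = native
**.cpp = native
**.h = native
EOF

# .hgignore: build products we never want tracked.
cat > dots/.hgignore <<'EOF'
syntax: glob
*.obj
*.pdb
*.ncb
EOF
```

Note that .hgeol has no effect until the eol extension is enabled, which
is done with an "eol =" line in the [extensions] section of hgrc.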

Where this falls down (I fear) is if your CVS repo is not well behaved
and you have either files with inconsistent eols or files whose eols do
not match the style you chose as canonical for the hg repo.  In that
case, you will run into the confusion of hg seeing the files as
modified.  Assuming these files were added throughout history, you are
probably going to have trouble correcting it anywhere other than before
(or during) the cvs2hg conversion.

So that bit I have not quite figured out yet.
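
What can at least be checked up front is which files in a CVS working
copy have mixed endings today.  A rough sketch follows; classify_eol is
a hypothetical helper of my own naming, not part of cvs2hg or Mercurial,
and it only looks at the checked-out revision, not at history.

```shell
#!/bin/sh
# Sketch: report the line-ending style of each file named on the command line.
# classify_eol is a hypothetical helper, not part of cvs2hg or Mercurial.

CR=$(printf '\r')   # a literal carriage return, built portably

classify_eol() {
    # Total number of lines in the file (the empty pattern matches every line).
    eol_total=$(grep -c '' "$1")
    # Number of lines whose last character before the newline is a CR.
    eol_crlf=$(grep -c "${CR}\$" "$1")

    if [ "$eol_total" -eq 0 ]; then
        echo none          # empty file
    elif [ "$eol_crlf" -eq 0 ]; then
        echo lf            # no line ends in CR LF
    elif [ "$eol_crlf" -eq "$eol_total" ]; then
        echo crlf          # every line ends in CR LF
    else
        echo mixed         # some of each: the troublesome case
    fi
}

# Print "<style> <file>" for every argument; does nothing with no arguments.
for f in "$@"; do
    printf '%s %s\n' "$(classify_eol "$f")" "$f"
done
```

Saved under a name of your choosing and run over the files of a fresh
checkout, anything reported as "mixed" is a candidate for the hg "file
shows as modified" confusion described above.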

By the way, since I started writing this email at 8 AM, I did decide to
poke around in the Python debugger.  You are right, it is pretty easy.
Certainly no more cryptic than WinDbg :)

I even see now that canonicalize_eol is a clear way to set the desired 
eol of a block of text.  I just need to figure out where to call it.



Best regards,

Tom




