cvs2hg emit .hgeol?

Tom Udale tom at ionoptix.com
Fri Aug 26 22:18:46 UTC 2011


Hi Martin,

> Okay, I see. I naively assumed that your existing files in CVS had the
> right newline formats :-)

I have discovered several things about the eol extension that were 
confusing me before.  Before getting into them, let me make some 
clarifications:

1) All files in my CVS repository are _stored_ internally in Unix 
format, the claimed (and apparently actual) canonical format of CVS.

2) Several of the problems I had getting files translated were due to 
errors in the .hgeol file.

3) In the interest of accuracy, it turns out that it was not the RC 
files that were killing us, but rather the DSP files.
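Regarding point 2, here is a minimal sketch of the kind of sanity check 
that would catch a bad style name in .hgeol before a conversion.  It is 
only an illustration: it uses Python's configparser rather than 
Mercurial's own config reader, it only looks at the [patterns] section, 
and it assumes style names are matched case-insensitively.  The sample 
patterns are made up, not taken from my repo.

```python
import configparser

# Styles the eol extension documents for [patterns] entries.
# Assumption: compared case-insensitively here.
VALID_STYLES = {"NATIVE", "LF", "CRLF", "BIN"}


def check_hgeol(text):
    """Return a list of problems found in the [patterns] section of .hgeol text."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep file patterns case-sensitive
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return [f"parse error: {exc}"]
    if not parser.has_section("patterns"):
        return ["no [patterns] section"]
    problems = []
    for pattern, style in parser.items("patterns"):
        if style.upper() not in VALID_STYLES:
            problems.append(f"{pattern}: unknown style {style!r}")
    return problems


# Hypothetical .hgeol content with one deliberate mistake ("DOS").
sample = """\
[patterns]
**.dsp = DOS
**.rc = CRLF
**.txt = native
"""
print(check_hgeol(sample))
```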

With that out of the way, let me get on to the unexpected behavior of 
eol.  The biggest confusion I was having was that when I updated to a 
time before the .hgeol file existed in my repo, I was not seeing 100% 
unix files as I expected.  This is what made me think my CVS repo was 
screwed up (or there was a bug in cvs2hg).

It turns out that if I do a clean conversion of my CVS repo using cvs2hg 
with no .hgeol file at all, I get the expected 100% unix line endings.
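For what it is worth, "100% unix" can be checked mechanically rather 
than by opening files in an editor.  A rough sketch, assuming it is 
enough to count CRLF pairs versus bare LF bytes in each file (binary 
files are not treated specially, and the function names are my own):

```python
import os


def count_line_endings(data):
    """Count CRLF and bare-LF line endings in a bytes object."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CRLF also contains one LF
    return {"CRLF": crlf, "LF": lf}


def scan_tree(root):
    """Yield (path, counts) for every file under root containing a CRLF."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the repository metadata.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                counts = count_line_endings(fh.read())
            if counts["CRLF"]:
                yield path, counts
```

Running list(scan_tree(".")) in a working directory that really is all 
unix should give an empty list.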

The cause of this confusion is that it appears that eol _always_ uses 
the most recent .hgeol file to update.  Thus if you start with a 100% 
unix working directory and add an .hgeol file to it that converts 
everything to DOS, if you check out any revision before the one that 
adds the .hgeol file, you still get the DOS versions.

It is actually more confusing than that, however.  It appears that when 
testing for dirty files in the working directory, eol uses the "correct" 
.hgeol file, if one exists.

Here is an example: you start with a repo at cset 0 with no .hgeol, 
working directory is in state 0.

Now you add a .hgeol, creating cset 1 with working directory state 1.

Now you edit the .hgeol, adding new conversions, thereby creating cset 2 
and working directory state 2.

Now hg update null; hg update 0 (not going through null is even more 
confusing, but I think that is "by design").

You would expect that you would be at working directory state 0.  You 
are not.  You are at state 2.  If you do a hg status (well, TortoiseHg 
refresh), you will find there is nothing to commit.

If you now hg update null; hg update 1, you are still at working directory 
state 2.  However, if you hg status, you get as modified all the files 
added to .hgeol in cset 2.  This implies that the status is working 
against the .hgeol in cset 1 (as expected).

Let me say that again.  It checks out using the cset2 .hgeol settings, 
but statuses those files against the repo using the cset1 .hgeol settings.

My interpretation of what happens at cset0 is that when it goes to 
status, it cannot find a .hgeol at cset0 and simply uses the cset2 
version - and then diffs to an empty set against the working directory 
which was also checked out with the cset2 .hgeol file.
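To make the three-cset example concrete, here is a toy model of what I 
think I am seeing.  It is not the eol extension's code.  The two rules 
(checkout uses the newest .hgeol; status uses the .hgeol of the 
checked-out cset and falls back to the newest one if there is none) are 
just my interpretation, and the file names are invented.

```python
# Toy model only: the rules below are assumptions based on the behavior
# described in this message, not on the eol extension's source.

# For each cset, the set of files its .hgeol converts to DOS (None = no .hgeol).
HGEOL = {0: None, 1: {"a.dsp"}, 2: {"a.dsp", "b.rc"}}
FILES = ["a.dsp", "b.rc", "c.txt"]  # hypothetical file names
TIP = 2


def rules_for_checkout(rev):
    # Observed: checkout always seems to use the most recent .hgeol.
    return HGEOL[TIP]


def rules_for_status(rev):
    # Observed: status uses the .hgeol of the checked-out cset if it has one.
    # Guess: otherwise it falls back to the most recent .hgeol.
    return HGEOL[rev] if HGEOL[rev] is not None else HGEOL[TIP]


def checkout(rev):
    """Return the working directory as {file: 'DOS' or 'unix'}."""
    dos = rules_for_checkout(rev)
    return {f: ("DOS" if f in dos else "unix") for f in FILES}


def status(rev, workdir):
    """Return the files whose line endings differ from what status expects."""
    dos = rules_for_status(rev)
    return sorted(
        f for f, eol in workdir.items()
        if eol != ("DOS" if f in dos else "unix")
    )


for rev in (0, 1, 2):
    print(rev, status(rev, checkout(rev)))
```

In this model csets 0 and 2 report nothing modified, while cset 1 
reports b.rc, the one file that was only added to .hgeol in cset 2.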

And then there is the extra complication of what happens when you don't 
hg update null when crossing csets that changed .hgeol.  In my 
observation, it appears that going backwards, only the .hgeol file is 
changed and no changes occur to the working directory.  However it does 
appear that when going forward, files in the working directory do get 
updated.

Now, I am doing all this via TortoiseHg, so it is possible that it is 
the source of the confusion here rather than the eol extension.

For me, once I get my .hgeol files in from the very beginning, this will 
not be a problem.  But it does seem like a problem for the utility in 
general, because it appears that even if you always update via null, 
you still get odd behavior.

I am trying to figure out how eol works to see if maybe I can find where 
this is happening.  I have not managed yet.



Best regards,

Tom



