eolext LF CRLF surprises.
Tom Udale
tom at ionoptix.com
Sun Aug 28 22:27:17 UTC 2011
Hello All,
One of the things that very much surprised me about eol was the behavior
of LF and CRLF. I assumed that they controlled only the format in the
working directory, not the format in the repository. It took me quite
a while to figure out that LF and CRLF force the same endings both
inside and outside the repo.
This makes them basically an alias for BIN - that is, no translation -
with the single exception that you can somewhat painfully force a change
in line endings if you need to. The painful aspect is that you must
commit, because any actual change will make the file look modified to
the repo.
The reality is that you don't really care what the internal
representation of the file is _unless_ it is different from the
repository canonical format when eol is enabled. And then you only care
for the practical consideration of preventing spurious commits, not
because of some deep-seated concern for the repo internals.
Another practical need is for the working directory to be in some form
or another based on the unix/DOSness of the host computer.
So to maximize the utility of eol in the face of differing paths to your
hg repo (which will result in various states of files in the repo) and
in the face of differing working directory needs, you want to be able
to specify both sides of the equation for each file, the repo and the
working directory.
You specify the repo side to contend with files already checked in
which are not in the canonical format, and you specify the working
directory side as needed for the host system.
It turns out that you need only about one hurricane's worth of time to
implement this. Today I managed to get hg building from sources and
then added 6 new specifiers that along with the three existing ones fill
out all combinations of native, LF and CRLF.
I chose them as follows (decode as repo-working where N means "native"):
N-LF
N-CRLF
[existing Native is the same as N-N]
LF-N
LF-CRLF
[existing LF is the same as LF-LF]
CRLF-N
CRLF-LF
[existing CRLF is the same as CRLF-CRLF]
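To make the repo-working decoding concrete, here is a minimal sketch in
Python. It is not the actual patch and does not use the eol extension's
internals; the function names, the repo_native/host_native parameters, and
the reading of "N" on the left as "the repo's canonical format" are all
assumptions made for illustration. It also ignores binary detection and
files with mixed endings.

```python
# Hypothetical sketch of the nine specifiers, decoded as (repo, working).
# The existing names are read the way the list above describes them.
EXISTING = {"Native": ("N", "N"), "LF": ("LF", "LF"), "CRLF": ("CRLF", "CRLF")}
VALID = ("N", "LF", "CRLF")
EOL_BYTES = {"LF": b"\n", "CRLF": b"\r\n"}


def parse_specifier(spec):
    """Return (repo_format, working_format) for a specifier such as 'N-CRLF'."""
    if spec in EXISTING:
        return EXISTING[spec]
    repo, sep, working = spec.partition("-")
    if not sep or repo not in VALID or working not in VALID:
        raise ValueError("unknown eol specifier: %r" % (spec,))
    return repo, working


def _convert(data, eol):
    # Normalize to LF first so CRLF input is not turned into CR CR LF.
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", EOL_BYTES[eol])


def to_working(data, spec, host_native="LF"):
    """Convert file contents for the working directory (right-hand side)."""
    _repo, working = parse_specifier(spec)
    return _convert(data, host_native if working == "N" else working)


def to_repo(data, spec, repo_native="LF"):
    """Convert file contents for the repository (left-hand side)."""
    repo, _working = parse_specifier(spec)
    return _convert(data, repo_native if repo == "N" else repo)
```

For example, with N-CRLF and a repo that is canonical in LF, to_repo
yields LF endings and to_working yields CRLF endings, regardless of the
host.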
The envisioned use is as follows:
If your hg repo is completely homogeneous in text file eols, you set up
your repo to be canonical in that eol and then _always_ pick one of the
N- variants (N-LF, N-CRLF, or Native) to get the files into your working
directory as needed. This way you don't have to concern yourself with
whatever the repo format is and you can still maintain repo homogeneity.
If your repo is heterogeneous in eols, you set up your repo to be
canonical in the most common eol and then pick one of the 9 specifiers
to set up your working directory. You pick the left side based on how
the file is in the repo and the right side based on how it needs to be
in the working directory.
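The selection rule for both cases can be sketched as a small Python
function. This is only an illustration: choose_specifier and its
parameters are hypothetical names, and using "N" on the left whenever the
file already matches the repo's canonical eol is my assumption about how
the two cases fit together.

```python
def choose_specifier(repo_eol, repo_canonical, working_need):
    """Pick one of the nine specifiers for a single file.

    repo_eol       -- how the file is stored in the repo: "LF" or "CRLF"
    repo_canonical -- the repo's canonical eol: "LF" or "CRLF"
    working_need   -- what the working directory needs: "N", "LF" or "CRLF"
    """
    # Left side: files already in canonical form can use N (assumption).
    left = "N" if repo_eol == repo_canonical else repo_eol
    right = working_need
    # Collapse the three diagonal cases to their existing names.
    if left == right:
        return {"N": "Native", "LF": "LF", "CRLF": "CRLF"}[left]
    return left + "-" + right
```

With a repo canonical in LF, a stray CRLF file that should follow the
host gets CRLF-N, while a file already stored as LF that must be CRLF in
the working directory gets N-CRLF.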
The changes needed to add new specifications and converters are, thanks
to the design of eol, trivial (assuming I am not missing some corner
cases). It appears you don't even need to understand how the filters
are ultimately called :)
If anyone is interested, I would be happy to send them along.
Best regards,
Tom