rename is expensive.
Bryan O'Sullivan
bos at serpentine.com
Fri Apr 28 18:24:57 UTC 2006
On Fri, 2006-04-28 at 13:12 -0500, TK Soh wrote:
> This is probably not something new.
That's right.
> Recently I had to rename a directory that contain 400MB+ of files, and
> that immediately add quite a bit to the repo size.
That's an interesting observation. For those who don't know why this
happens, Mercurial doesn't have a long-lived "file ID" that uniquely
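identifies a file. This means that when a rename happens, the metadata
file for the old name hangs around, a new metadata file for the new name
is created (containing just one revision, which holds the file's full
contents), and a record is stored away, saying "this file got renamed
here". That is why renaming a large directory adds a fresh copy of its
contents to the repository.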
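
To make the storage effect concrete, here is a rough sketch in Python. It
is a toy model, not Mercurial's actual code or on-disk format: the names
ToyRepo, delta_cost and stored_bytes are made up for illustration, and
the "delta" is just a crude count of differing bytes. The only point is
that history is keyed by path, so a rename starts a new history whose
first revision is a full copy.

```python
def delta_cost(old: bytes, new: bytes) -> int:
    """Crude stand-in for a delta size: differing bytes plus length change."""
    changed = sum(1 for a, b in zip(old, new) if a != b)
    return changed + abs(len(old) - len(new))


class ToyRepo:
    """Hypothetical path-keyed store: one history list per file name."""

    def __init__(self):
        self.logs = {}         # path -> list of revisions (kept whole for simplicity)
        self.copied_from = {}  # new path -> old path ("this file got renamed here")

    def commit(self, path, data):
        self.logs.setdefault(path, []).append(data)

    def rename(self, old, new):
        # The old history stays where it is; the new name gets its own
        # history containing a single revision with the full contents.
        self.logs[new] = [self.logs[old][-1]]
        self.copied_from[new] = old

    def stored_bytes(self):
        total = 0
        for revs in self.logs.values():
            total += len(revs[0])  # first revision is charged as a full snapshot
            for prev, cur in zip(revs, revs[1:]):
                total += delta_cost(prev, cur)  # later revisions are charged as deltas
        return total


repo = ToyRepo()
repo.commit("data/big.bin", b"x" * 1000)
size_before = repo.stored_bytes()
repo.rename("data/big.bin", "archive/big.bin")
size_after = repo.stored_bytes()
print(size_before, size_after)
```

In this model an ordinary edit costs only its delta, while a rename
costs the whole file again, which matches the growth TK saw.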
Now, you might be screaming and gnashing your teeth at this
architectural decision, thinking "but that means renames are horribly
expensive, and will fill my hard disk!" It would be more correct to say
that there are nuances here :-)
A huge advantage of the *lack* of persistent file IDs is that you simply
don't have to worry about merges when multiple people create the same
file at different times by importing patches. (If you think that
doesn't happen, ask me offline about all the fun I've had with systems
that *do* use unique file IDs where I've been bitten by exactly this
scenario.)
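
Here is a small sketch of that scenario, again in Python and again only
a toy. It does not model any particular SCM: add_with_fresh_id,
merge_by_id and merge_by_path are hypothetical names, and I am assuming
an ID-based system that hands out a fresh ID every time a file is added.

```python
import itertools


def add_with_fresh_id(manifest, path, id_source):
    """Hypothetical ID-based system: every newly added file gets a new ID."""
    manifest[next(id_source)] = path


def merge_by_id(ours, theirs):
    """Union two {file_id: path} manifests; report paths claimed by two IDs."""
    merged = {**ours, **theirs}
    owner = {}
    conflicts = set()
    for file_id, path in merged.items():
        if path in owner and owner[path] != file_id:
            conflicts.add(path)
        owner[path] = file_id
    return merged, conflicts


def merge_by_path(ours, theirs):
    """Path-keyed system: the name is the identity, so same path = same file."""
    return ours | theirs


# Two people import the same patch, which creates README, independently.
alice_ids = (f"alice-{n}" for n in itertools.count(1))
bob_ids = (f"bob-{n}" for n in itertools.count(1))
alice, bob = {}, {}
add_with_fresh_id(alice, "README", alice_ids)
add_with_fresh_id(bob, "README", bob_ids)

_, id_conflicts = merge_by_id(alice, bob)
path_merge = merge_by_path({"README"}, {"README"})
print(id_conflicts, path_merge)
```

With IDs, the merge sees two different files fighting over one name.
Keyed by path, there is simply one README and nothing to resolve at the
identity level.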
So on the one hand, you lose a little space efficiency (or, very rarely,
a lot, if you're in the habit of checking in huge files and renaming
them), and on the other, you gain considerable flexibility in just not
having to worry about stuff that continually bothers people who use
other SCMs.