Tags & production questions
Guido Ostkamp
hg at ostkamp.fastmail.fm
Fri May 4 18:43:26 UTC 2007
Hello Mark,
> IMO, you are vastly overrating the importance of all clones being able
> to see everything at once.
>
> How big is your project? How large are your disk drives?
To give you some numbers:
It is a large multi-site project, with dozens of people in several
countries having access to the sources. We have several VOBs (roughly,
ClearCase repositories). A few days ago I did an experiment and
transferred all ClearCase versions from just the main branch of the main
source VOB by finding all versions of all files, sorting them by checkin
time and then replaying all checkins into a fresh Mercurial repository.
This conversion took a whole night.
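A minimal sketch of the replay ordering described above, under stated assumptions: the `Version` record and its fields are hypothetical, and extracting the versions from ClearCase and committing them to Mercurial are left out entirely. It only shows the "sort all versions by checkin time, then replay oldest first" step.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Version:
    """One exported file version (hypothetical record, not a ClearCase API)."""
    path: str          # file path inside the repository
    checkin_time: int  # checkin time as seconds since the epoch
    comment: str       # checkin comment


def replay_order(versions: Iterable[Version]) -> List[Version]:
    """Return the versions oldest first.

    Ties on checkin time are broken by path so the order is deterministic.
    """
    return sorted(versions, key=lambda v: (v.checkin_time, v.path))


if __name__ == "__main__":
    history = [
        Version("src/main.c", 300, "fix crash"),
        Version("src/util.c", 100, "initial version"),
        Version("src/main.c", 100, "initial version"),
    ]
    for v in replay_order(history):
        # A real conversion would place this version's content into the
        # working copy here and then commit it.
        print(v.checkin_time, v.path, v.comment)
```

Versions that share a checkin time and comment could also be grouped into one changeset; the sketch leaves that out.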
I ended up with a Mercurial repository of ~950 MB (including a ~500 MB
working copy), which contained ~9200 files in ~1300 directories and had
~38000 changesets. A small number of the files are binaries.
As I said, this was only the main branch. We also have ~15 more branches,
each a main development line, most of which are still maintained. Each
branch contains numerous maintenance releases made over the years, which
are 'tagged' with labels in ClearCase.
Development mainly takes place on Sun servers running Solaris. In a
professional environment, server disk space, which also has to be backed
up every night, is very expensive. The systems are also in use for a long
time, so the disks are not the size you are used to on a modern PC.
Typically, each developer has a quota of just a few gigabytes, let's say
5 GB, which he cannot exceed.
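A rough back-of-the-envelope check of these numbers, as a sketch only. It assumes 1 GB = 1000 MB, that every branch would convert to about the same size as the main branch, and that the clones share no data through hardlinks. None of these assumptions was measured.

```python
# Figures taken from the text above; everything derived from them is an estimate.
QUOTA_MB = 5 * 1000        # "let's say 5 GB" per developer
REPO_WITH_WC_MB = 950      # converted main branch, including working copy
WORKING_COPY_MB = 500      # working copy share of that size
STORE_ONLY_MB = REPO_WITH_WC_MB - WORKING_COPY_MB  # history only
BRANCHES = 1 + 15          # main branch plus ~15 more (assumed equal in size)


def clones_that_fit(quota_mb: int, clone_mb: int) -> int:
    """Number of independent clones of the given size that fit in the quota."""
    return quota_mb // clone_mb


if __name__ == "__main__":
    print("clones with working copy:", clones_that_fit(QUOTA_MB, REPO_WITH_WC_MB))
    print("clones without working copy:", clones_that_fit(QUOTA_MB, STORE_ONLY_MB))
    print("MB for one store-only clone per branch:", BRANCHES * STORE_ONLY_MB)
```

Under these assumptions, one store-only clone per branch already exceeds the quota before any working copy or build output is added.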
In case you don't know ClearCase yourself: it has its own filesystem,
which lets you define 'views' onto the repositories by applying rule sets
defined in a so-called 'configspec'. You just get the stuff mapped in at
certain directories, but there is no physical copy in your directory that
uses up any disk space. Only compilation results such as object files,
libraries and binaries really use up storage space (at least in our
setup, where we do not use wink-in objects).
When we get a bug report for some version out in the field, we have to fix
it for that version and port the fix to at least all newer branches
including the mainline. This means we have to check what's in those
versions by analyzing logs, possibly comparing versions from different
branches, etc. Typically the fix is developed in the mainline first (if it
does not already exist there) and then ported back to the maintenance
branches.
Thus it is absolutely crucial to have everything available in one
repository. It would be an absolute nightmare to have each of the
maintenance releases of each branch in its own repository.
Having explained that, I should make clear that I'm just one of many
developers and certainly not in a position to decide which source code
management system my employer uses, so they will stay with ClearCase.
However, for my own work I would like to maintain a shadow Mercurial
repository that can help me with interim development before I check things
back into ClearCase. This is why I am trying to find out whether Mercurial
could do the job.
> You can safely clone repositories and checked out files using full
> hardlinks with
>
> $ cp -al REPO REPOCLONE
>
> which is the fastest way to clone. However, the operation is not atomic
> (making sure REPO is not modified during the operation is up to you) and
> you have to make sure your editor breaks hardlinks (Emacs and most Linux
> Kernel tools do so).
Ok, thank you for the hint. Solaris 'cp' does not have these options, but
I think I have a GNU 'cp' available. I've also checked my 'vim' manual and
found the 'set backupcopy=auto,breakhardlink' option which does what you
suggest.
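A small sketch of why the editor has to break hardlinks after a 'cp -al' style clone. It uses only Python's standard library on a throwaway temporary directory; the file names and helper functions are made up for illustration, and it assumes a filesystem that supports hardlinks.

```python
import os
import tempfile


def write_in_place(path, text):
    """Overwrite the existing inode; every hardlink to it sees the change."""
    with open(path, "w") as f:
        f.write(text)


def write_breaking_link(path, text):
    """Write a new file and rename it over `path`.

    Other hardlinks keep pointing at the old inode and its old content.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)


def read(path):
    with open(path) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        clone = os.path.join(d, "clone.txt")
        write_in_place(original, "v1")
        os.link(original, clone)  # per file, this is what a hardlink copy does
        shared = os.stat(original).st_ino == os.stat(clone).st_ino

        write_in_place(clone, "v2")       # editor that keeps the hardlink
        leaked = read(original)           # the original was modified as well

        write_breaking_link(clone, "v3")  # editor that breaks the hardlink
        kept = read(original)             # the original is left untouched
        return shared, leaked, kept, read(clone)


if __name__ == "__main__":
    print(demo())
```

The in-place write changes both copies because they share one inode, while the write-then-rename variant leaves the original alone.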
>> Even if there is no working copy around and hardlinks are possible,
>> there is still a possibly large number of extra inodes used for that.
>
> Use reiserfs then.
I don't have this choice. Our Solaris systems use 'ufs' with logging.
Regards,
Guido