Migrating from ClearCase to Mercurial

Simon King simon at simonking.org.uk
Fri Dec 16 00:38:55 UTC 2011


Hi all,

I've been using Mercurial for small projects for a while now, and
really appreciate its power. I've been trying to persuade people at
work to switch from ClearCase, and now I've been given the opportunity
to do just that. However, in order to make the transition as easy as
possible for all our developers, I need to be able to replicate our
existing workflow as much as possible. Briefly, our ClearCase workflow
is that a new branch is taken out for every bugfix or new feature
development, the changes are reviewed whilst on the branch, and then
the branch is merged back to the main development line. Once the
branch has been merged, the repository is labelled, where the label
names contain (amongst other things) an incrementing number. This
means that our main ClearCase repositories currently have a few
thousand branches and a few thousand labels.

Our workflow is mostly driven by a home-grown web interface which we
use to create branches and labels, and conduct code reviews. I am
trying to reimplement this web interface in a way that will work for
Mercurial.

For the branching behaviour, I am heeding the warning on
http://mercurial.selenic.com/wiki/StandardBranching and not using
named branches for every change. Instead, I am going to create a new
server-side clone whenever a developer wants to start a new piece of
work. He will push his changes to that clone where they can be
reviewed. Once the review is complete (and as long as the clone is
fully merged up with the main repository), the server will log the
outgoing changesets, then push them from the branch repo to the main
repo. The branch repo will probably then be deleted to save space on
the server.
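
A minimal sketch of the clone, review and push steps described above,
assuming the server drives the hg command line from Python. The
function names and repository paths are hypothetical. The "merged up"
check here only confirms that the main repo has nothing the work clone
lacks (hg incoming exits with 1 when there is nothing to pull); it
relies on hg push itself to refuse a push that would add new heads.

```python
import subprocess


def hg(repo, *args):
    """Build an hg command line that operates on the repository at `repo`."""
    return ["hg", "-R", repo, *args]


def start_work_cmd(main_repo, work_repo):
    """Command that creates a server-side clone for one piece of work."""
    # --noupdate skips the working copy, which the server does not need.
    return ["hg", "clone", "--noupdate", main_repo, work_repo]


def deliver_cmds(main_repo, work_repo):
    """Commands run once the review is complete, in order."""
    return [
        hg(work_repo, "incoming", main_repo),  # anything in main we lack?
        hg(work_repo, "outgoing", main_repo),  # log what will be delivered
        hg(work_repo, "push", main_repo),      # deliver to the main repo
    ]


def deliver(main_repo, work_repo, run=subprocess.run):
    """Push a reviewed work clone to main; `run` is injectable for testing."""
    incoming, outgoing, push = deliver_cmds(main_repo, work_repo)
    rc = run(incoming).returncode
    if rc != 1:
        # 0 means main has changesets the clone lacks; other codes are errors.
        raise RuntimeError("work clone not merged up, or hg failed (exit %d)" % rc)
    run(outgoing, check=True)
    run(push, check=True)
```

Deleting the work clone afterwards would be a plain directory removal,
since each clone is a self-contained repository.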

I'm less confident about how to deal with the tagging. From a
technical point of view, it's not nearly as important to tag every
merged branch, because the changeset ID is a perfectly good unique
identifier. But socially, I don't think we can do without those
incrementing build IDs; people are too used to referring to builds by
their number, and understanding that build A is more recent than build
B simply because it has a higher number. (We could store build IDs
outside of Mercurial, but then developers couldn't use them with
commands like 'hg merge' and 'hg update'.)
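
To illustrate the incrementing build IDs, here is a rough sketch of how
the web interface might pick the next tag name from the existing ones.
The "build-" prefix and the function name are assumptions made for the
example; our real label names contain other things as well.

```python
import re


def next_build_tag(existing_tags, prefix="build-"):
    """Return the next tag name in the sequence prefix1, prefix2, ..."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    numbers = []
    for tag in existing_tags:
        match = pattern.match(tag)
        if match:
            # Compare as integers so that build-10 sorts after build-9.
            numbers.append(int(match.group(1)))
    return "%s%d" % (prefix, max(numbers, default=0) + 1)
```

The existing names could come from the output of 'hg tags', and tags
that do not match the pattern (such as 'tip') are simply ignored.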

Firstly, are we going to start seeing performance problems if we have
a few thousand tags in a repository? If so, are the performance
problems only caused by having thousands of tags at a head? As I
understand it, Mercurial examines the .hgtags file for each head in
the repo, so if we purge tags that are no longer interesting from
every head, will the performance be the same as if they had never
existed?
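
For reference, this is my simplified model of what a single .hgtags
file says. Each line is a 40-character changeset ID followed by a tag
name, later lines override earlier ones, and a line carrying the
all-zero ID marks a tag as removed. The function name is made up, and
Mercurial's real logic for combining .hgtags across heads is more
involved than this.

```python
NULL_ID = "0" * 40


def effective_tags(hgtags_text):
    """Return {tag name: changeset ID} for the contents of one .hgtags file."""
    tags = {}
    for line in hgtags_text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, name = line.split(" ", 1)
        if node == NULL_ID:
            # Removal entry: the tag no longer resolves to anything.
            tags.pop(name, None)
        else:
            # A later line for the same name replaces the earlier one.
            tags[name] = node
    return tags
```

Note that in this model a removed tag still occupies lines in the file,
which is why I'm asking about deleting the lines outright rather than
using 'hg tag --remove'.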

Secondly, we will actually be creating these tags through our web
interface, which means it'll be the server running "hg tag". I think
this means that I need a working copy on the server. I could keep it
updated to default/tip on every push, but this seems a little wasteful
of disk space, requires an extra "update" step on each push, and so
on. I was wondering if instead I could have a named branch called
"tags" which exists solely for tagging.
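
Roughly, I imagine the server doing something like the following for
each build, assuming the "tags" branch already exists and the server's
working copy is only ever used for tagging. The function name, paths
and commit message are placeholders.

```python
def tag_build_cmds(server_copy, changeset, tag_name, tags_branch="tags"):
    """Commands that record `tag_name` for `changeset` on the tags branch."""
    message = "Added tag %s for changeset %s" % (tag_name, changeset)
    return [
        # Make sure the working copy sits on the dedicated tagging branch.
        ["hg", "-R", server_copy, "update", tags_branch],
        # Commit the .hgtags change there, pointing at the delivered changeset.
        ["hg", "-R", server_copy, "tag", "--rev", changeset,
         "--message", message, tag_name],
    ]
```

The point of the design is that the working copy never needs to be
updated to default/tip, so in principle it only has to hold whatever
files live on the "tags" branch.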

So, apologies for the rambling email, but I wanted to give some
background about why I'm doing things this way. I'm really looking for
feedback on the tagging questions; will we have performance problems
with thousands of tags, and is there anything wrong with having a
named branch just for .hgtags?

Thanks for any suggestions you can give (even "don't be so stupid,
that's a ridiculous way to work!"),

Simon
