Counterintuitive tag behaviour (broken design?)

Matt Mackall mpm at selenic.com
Tue Mar 13 16:54:13 UTC 2007


On Tue, Mar 13, 2007 at 04:23:06PM +0100, Johan Herland wrote:
> Hi,
> I've been using Mercurial for a little while, and so far I really like it a 
> lot. However, in the last couple of days, I've started experimenting with 
> tags, and what I found was first puzzling, and after some more 
> experimentation my feeling can be best described as disbelief. Now, this may 
> be a big misunderstanding on my part as to how tags behave in Mercurial, but 
> I'd like to run this by you anyway.

No, you're correct. Mercurial is broken for this use case. See:

http://www.selenic.com/mercurial/bts/issue498

There are ideas on how to fix it, from Georg in particular (cc:ed).
And they look good. But I'd rather not proceed until we're fairly
comfortable that we've found a good solution because I don't want to
churn the tag semantics a lot.

So let's look a bit at the requirements:

a) tags need to be distributed in parallel with the rest of the history
b) conflicts between local and remote tags should get resolved in merges
c) it must be possible to determine who created tags
d) it must be possible to move tags
e) it must be possible to remove tags
f) because tags can change, it must be possible to determine what the
   tags were at a specific time in the project history

And now we add:

g) tags should not change without a tagging event (e.g. a commit on a
   branch shouldn't resurrect old tags)
h) any change we make should not break the existing system too horribly

Ok, what does this tell us about the design? First, points (f) and (c)
basically say tags must be version controlled. And this basically
means it must happen exactly in parallel with the project's DAG.
Keeping the tag data in .hgtags meets those requirements with the
added benefit of not adding a second namespace. Also, (b) falls nicely
out of this, though the actual merging could be friendlier.

Point (g) is causing us grief with (d): moving a tag on one head gets
undone when we make a change on a branch. So the theory is to have a
notion of superseding entries in .hgtags:

01234567  foo
deadcafe  foo

The presence of both these lines says we know deadcafe is a more
current value for foo than 01234567, so ignore any branch that claims
01234567 is the tag. Because hg tag currently already always appends
tags, this will work nicely with existing repos. But it gets more
confusing when you hit:

Head A                  Head B
01234567  foo           01234567 foo
deadcafe  foo           deadcafe foo
                        01234567 foo

Clearly, head B says that 01234567 supersedes deadcafe but we're
already ignoring 01234567. We can resolve this by assigning a rank to
each <cset>:<tag> pair as we go, then taking the highest ranking:

Head A                  Head B              
01234567  foo -> 0      01234567 foo -> 0
deadcafe  foo -> 1      deadcafe foo -> 1
                        01234567 foo -> 2

This logic is still hard to get right and there will still need to be
tie-breakers based on what's 'tip-most' to remove ambiguity. So if we
decide that this is the right approach, we need to a) precisely
document it for users with examples and b) make sure the
implementation matches the documentation (aka test cases).
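
To make the ranking above concrete, here is a rough Python sketch of
one possible reading of it. This is not Mercurial's code: the name
resolve_tags and the input format (one list of .hgtags lines per head,
ordered from oldest to tip-most) are assumptions made up for
illustration, and "tip-most" is approximated by position in that list.

```python
def resolve_tags(heads):
    """Pick one node per tag from several heads' .hgtags contents.

    heads: list of lists of "<node> <tag>" lines, one list per head,
    assumed to be ordered from oldest head to tip-most head.
    """
    best = {}  # tag -> ((rank, head_index), node)
    for head_index, lines in enumerate(heads):
        counts = {}  # per-head count of entries seen so far for each tag
        for line in lines:
            line = line.strip()
            if not line:
                continue
            node, tag = line.split(None, 1)
            # Rank is the position of this entry among the entries
            # for the same tag within this head's file.
            rank = counts.get(tag, 0)
            counts[tag] = rank + 1
            # Highest rank wins; ties go to the tip-most head.
            key = (rank, head_index)
            if tag not in best or key > best[tag][0]:
                best[tag] = (key, node)
    return {tag: node for tag, (key, node) in best.items()}


head_a = ["01234567  foo", "deadcafe  foo"]
head_b = ["01234567 foo", "deadcafe foo", "01234567 foo"]
print(resolve_tags([head_a, head_b]))
```

With the two heads from the table, head B's third entry has rank 2,
so foo resolves to 01234567. When two heads reach the same rank with
different nodes, this sketch simply prefers the later head in the list.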

We also currently don't deal with deleting tags very well. You
currently have to delete them on all heads.

We could use the null tag (0000..) to mark a tag as deleted:

01234567  foo
deadcafe  foo
00000000  foo

This is -almost- backward compatible. Old systems will still show the
foo tag, but will check out an empty directory. New systems can just
remove it from the tag list.
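
As a sketch of how a new system might handle this, assuming a single
.hgtags file where the last entry for a tag wins and an all-zero node
means "deleted" (visible_tags is a hypothetical helper, not an
existing Mercurial function):

```python
def visible_tags(lines):
    """Return {tag: node} for one .hgtags file, hiding deleted tags.

    The last entry for each tag wins; an entry whose node consists
    only of zeros is treated as a deletion marker (an assumption
    following the null-tag idea above).
    """
    tags = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        node, tag = line.split(None, 1)
        tags[tag] = node
    # node.strip("0") is empty exactly when the node is all zeros.
    return {tag: node for tag, node in tags.items() if node.strip("0")}


hgtags = ["01234567  foo", "deadcafe  foo", "00000000  foo", "cafebabe  bar"]
print(visible_tags(hgtags))
```

Here foo is dropped from the list because its latest entry is the
null node, while bar is reported as usual.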

-- 
Mathematics is the supreme nostalgia of our time.


