Counterintuitive tag behaviour (broken design?)
Johan Herland
johherla at online.no
Wed Mar 14 17:45:14 UTC 2007
On Tuesday 13 March 2007, Matt Mackall wrote:
> [...]
Thanks a lot for a good and informative answer to my first post. Below,
you'll find some more of my thoughts in response to the list of
requirements given.
> So let's look a bit at the requirements:
>
> a) tags need to be distributed in parallel with the rest of the
> history
> b) conflicts between local and remote tags should get resolved in
> merges
> c) it must be possible to determine who created tags
> d) it must be possible to move tags
> e) it must be possible to remove tags
> f) because tags can change, it must be possible to determine what the
> tags were at a specific time in the project history
>
> And now we add:
>
> g) tags should not change without a tagging event (eg. a commit on a
> branch shouldn't resurrect old tags)
> h) any change we make should not break the existing system too
> horribly
I'd revise (g) into an even stronger statement:
g) The only way a tag may change is through an explicit operation
   directly on that tag. For example, only the following operations may
   change the tag "foo":
   g.1.1) Explicitly moving or removing tag "foo".
   g.1.2) Merging tag definitions from different repositories where
          tag "foo" is different.
   Conversely, any other operation may not change a tag. For example,
   the following operations may never change a tag "foo":
   g.2.1) Any activity not touching tag definitions
          (e.g. commits/updates/merges on files (except .hgtags)).
   g.2.2) Any activity on tags unrelated to "foo"
          (e.g. creating/moving/removing tag "bar", or even
          editing .hgtags without changing or reordering any of
          the "foo" entries).
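The revised (g) can be read as an invariant that is checkable around any
operation. The following is only a sketch under stated assumptions: tag tables
are modelled as plain dicts from tag name to changeset id, and the function
name and the `touched` parameter are hypothetical, not part of Mercurial.

```python
def unexpected_tag_changes(before, after, touched):
    """Return the tags that changed although the operation did not target them.

    before, after: dict mapping tag name -> changeset id (hypothetical model;
                   a missing key means the tag does not exist)
    touched:       set of tag names the operation explicitly moved, removed
                   or merged (g.1.1 and g.1.2)
    """
    names = set(before) | set(after)
    return sorted(
        name
        for name in names
        if before.get(name) != after.get(name) and name not in touched
    )


# Moving "bar" must leave "foo" alone (g.2.2): no violations reported.
ok = unexpected_tag_changes(
    {"foo": "a1", "bar": "b1"}, {"foo": "a1", "bar": "b2"}, {"bar"}
)

# A plain commit that resurrects "foo" violates (g): "foo" is reported.
bad = unexpected_tag_changes({"bar": "b1"}, {"foo": "a0", "bar": "b1"}, set())
```

An empty result means the operation respected (g); any name in the result is a
tag that changed without an explicit operation on it.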
Furthermore, I'll throw the following requirements into the fray:
i) Tags are global (to the repository). I.e. the result of "hg tags" and
"hg up -C <tag>", etc. must be independent of the current working
copy. This concept has been referred to as the "global tag context"
in this discussion.
j) If two users independently create a tag "foo" pointing to different
changesets in their respective repositories, merging the two
repositories MUST result in a conflict that cannot be _automatically_
resolved. This is a feature.
Now, the current solution is to allow full revision control on the tag
definitions (by putting .hgtags in the repository as a "regular" file).
This elegantly allows us to track tag definitions through time, which
is clearly what we want (i.e. we can easily ask questions like: when
did a tag change? who did it? for what reason?). However,
putting .hgtags under full/regular revision control also provides the
possibility of branching .hgtags. Unfortunately, the concept of
branches on .hgtags is easily confused with the concept of branched tag
definitions. I, for one, got these two confused at the start, and I
believed for a while that a tag could refer to different changesets
depending on which branch I was on. Of course, I now clearly see that
branched tag definitions are irreconcilable with requirement (i) (global
tag context), and that (i) is clearly what we want.
Now, since we (because of (i)) clearly do not want branching tag
definitions, we must therefore find a way to extract global tag context
from the existing branches of .hgtags. Currently, we try to resolve
this problem by defining a robust algorithm for doing the merge of tag
definitions. This can - by some stretch of the imagination - be seen as
an attempt to fix a bug in the design itself, the original bug being
allowing branching of .hgtags in the first place.
The above analysis makes me want to look closer at how the system would
work if we disallowed branches on .hgtags. First, I must say that
disallowing branches on a file in the repository that is otherwise
treated much like a regular file sounds like an ugly hack, and I do not
pretend to know the technical difficulties involved in making this
work. However, let's for a second assume that we could fix these
problems elegantly. We should then ask how disallowing branches
on .hgtags affects the above requirements: As far as I can see, (a)
does not need to be affected. Neither do (c), (d), (e), or (f). (g)
should be elegantly fulfilled since AFAICS the problems with (g) today
are caused by having different branches on .hgtags. Also, (i) should be
automatically resolved by using the most recent revision of .hgtags as
the only source of global tag context. AFAICS, (j) is a subpoint of (b)
in this context. We're then left with (b) and (h). To me, it seems that
(b) must be resolved by merging the tag definitions with a different
algorithm than the one used to merge "regular" files. The algorithm
must include ways to allow the user to manually resolve tag conflicts
(to satisfy (j)). However, AFAICS from the solution currently being
discussed, something similar is needed anyway, since the "regular" file
merge algorithm will fail on the current .hgtags format. We're
then left with (h), which will probably be the real challenge here: how
do we implement the disallowing of .hgtags branches without modifying
the existing system too much? At this point, I'm not sure, but I think
it is worth researching some more.
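To make "the most recent revision of .hgtags as the only source of global tag
context" concrete, here is a minimal sketch of deriving one tag table from a
single .hgtags text. It assumes the file holds one "<node> <tag name>" pair per
line, that a later line for the same tag overrides an earlier one, and that an
all-zero node records a removal. `read_tags` and `NULL_ID` are hypothetical
names, and this is not Mercurial's actual implementation.

```python
NULL_ID = "0" * 40  # assumption: an all-zero node marks a removed tag


def read_tags(hgtags_text):
    """Build a tag -> node table from the text of one .hgtags revision."""
    tags = {}
    for line in hgtags_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only once so that tag names containing spaces stay intact.
        node, name = line.split(" ", 1)
        if node == NULL_ID:
            tags.pop(name, None)  # removal entry: forget the tag
        else:
            tags[name] = node     # later entries override earlier ones
    return tags


sample = "\n".join([
    "a" * 40 + " foo",
    "b" * 40 + " foo",   # "foo" was moved
    "c" * 40 + " bar",
    NULL_ID + " bar",    # "bar" was removed
])
table = read_tags(sample)
```

With a single, unbranched .hgtags there is exactly one such table per
repository, which is what requirement (i) asks for.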
Finally, there are surely more requirements for desirable tag behaviour
that we should try to formulate in addition to the 10 above. Let's use
this discussion to try to enumerate as many as possible. This would
help us all to get a clearer picture of what we're actually up against.
> Ok, what does this tell us about the design? First, points (f) and
> (c) basically say tags must be version controlled. And this
> basically means it must happen exactly in parallel with the project's
> DAG. Keeping the tag data in .hgtags meets those requirements with
> the added benefit of not adding a second namespace. Also, (b) falls
> nicely out of this, though the actual merging could be friendlier.
I agree that tags MUST be version controlled, but I don't think they
should be branchable. Branchable tags directly violate (i).
> [...]
>
> This logic is still hard to get right and there will still need to be
> tie-breakers based on what's 'tip-most' to remove ambiguity. So if we
> decide that this is the right approach, we need to a) precisely
> document it for users with examples and b) make sure the
> implementation matches the documentation (aka test cases).
I do not agree that "tip-most"-ness can be used as a tie-breaker. The
concept of "tip-most" is arbitrary, and easily breaks requirement (g)
(both original and revised version). If "tip-most"-ness or some other
arbitrary measure is needed to remove ambiguity, there's too much
ambiguity in the first place. If a tie-breaker is needed in ambiguous
cases, I'd much prefer one of Alexis' proposed tie-breaker
algorithms.
However, what I'd REALLY like is to get rid of ALL the ambiguity by only
allowing one .hgtags per repository (i.e. no branching of .hgtags).
Sure, we will still need to merge tags between repositories when
pushing/pulling, but the ambiguous cases that pop up here (at least
some of them) require human intervention anyway (according to
requirement (j)). Thus we don't need a merge algorithm that resolves
all ambiguity; it only needs to merge the obvious cases and leave the
rest up to humans.
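A merge that handles only the obvious cases and hands everything else to a
human could look roughly like the sketch below. It is one possible reading,
not a proposal from this thread: it assumes tag tables are dicts from tag name
to changeset id, and that a common-ancestor table `base` is available when two
repositories are merged (an assumption this post does not make). The name
`merge_tags` is hypothetical.

```python
def merge_tags(base, local, remote):
    """Three-way merge of tag tables; never guesses on a real conflict.

    Returns (merged, conflicts). `conflicts` maps a tag name to the
    (local, remote) values a human must choose between; None means the
    tag is absent on that side.
    """
    merged = {}
    conflicts = {}
    for name in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(name), local.get(name), remote.get(name)
        if l == r:
            value = l          # both sides agree (including both removed)
        elif l == b:
            value = r          # only the remote side changed the tag
        elif r == b:
            value = l          # only the local side changed the tag
        else:
            conflicts[name] = (l, r)   # requirement (j): leave it to a human
            continue
        if value is not None:
            merged[name] = value
    return merged, conflicts


merged, conflicts = merge_tags(
    base={"v1": "aaa"},
    local={"v1": "aaa", "foo": "111"},
    remote={"v1": "bbb", "foo": "222"},
)
```

Here "v1" is taken from the remote side because only that side moved it, while
"foo", created independently on both sides, is reported as a conflict instead
of being settled by a tie-breaker.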
Have fun!
...Johan
--
Johan Herland, <johherla at online.no>
www.herland.net