Counterintuitive tag behaviour (broken design?)
Johan Herland
johherla at online.no
Wed Mar 14 21:24:33 UTC 2007
On Wednesday 14 March 2007, you wrote:
> On Wed, Mar 14, 2007 at 06:45:14PM +0100, Johan Herland wrote:
> > The above analysis makes me want to look closer at how the system
> > would work if we disallowed branches on .hgtags. First, I must say
> > that disallowing branches on a file in the repository that is
> > otherwise much treated as a regular file, sounds like an ugly hack,
> > and I do not pretend to know the technical difficulties involved in
> > making this work.
>
> The first obstacle is probably insurmountable: the nature of a
> distributed system means that different clients -will- diverge. There
> is no possible mechanism to keep disconnected peers from making
> local changes to the tags.

Sorry for not being clear. I'm only talking about disallowing branches
of .hgtags _in the current, local repository_. What happens between
disconnected repositories is of course outside of our control.

By disallowing branches on .hgtags in the repository, "hg tags" would
not need to merge multiple .hgtags files in order to produce the global
tag context; instead "hg tags" would only need to parse the most
recent .hgtags file in the repository, and the global tag context would
fall automatically out of this one file, plain and easy.
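To make the single-file idea concrete, here is a minimal sketch in Python
(not Mercurial's actual code) of deriving the tag context from one .hgtags
file. It assumes the usual .hgtags layout of one "<node hash> <tag name>"
entry per line, with later lines overriding earlier ones. The function name
parse_hgtags is made up for illustration.

```python
def parse_hgtags(text):
    """Return {tag name: node hash} from the contents of one .hgtags file."""
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first space only, so tag names may contain spaces.
        node, name = line.split(" ", 1)
        # A later line for the same name replaces the earlier one.
        tags[name.strip()] = node
    return tags
```

With only one file to read, there is no cross-head merging step at all:
the returned dictionary is the whole tag context.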
We would still be left with the problem of merging tags when syncing
between repositories. I'm thinking, though, that the algorithm in this
case does not have to be as sophisticated, since we could leave the
ambiguous cases up to the user (similar to conflicts in regular
files).
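A rough sketch of what such a less sophisticated sync-time merge might look
like, in Python. This is only an illustration of the idea, not an existing
Mercurial function: merge_tags and the dictionary layout (tag name mapped
to node hash) are assumptions of mine.

```python
def merge_tags(local, remote):
    """Merge two {tag name: node hash} mappings.

    Returns (merged, conflicts). Tags that exist on only one side, or that
    agree on both sides, are merged silently. Tags that point to different
    nodes are left at their local value and reported in `conflicts` as
    {name: (local node, remote node)} for the user to resolve.
    """
    merged = dict(local)
    conflicts = {}
    for name, node in remote.items():
        if name not in local:
            merged[name] = node  # new tag from the other repository
        elif local[name] != node:
            conflicts[name] = (local[name], node)  # ambiguous: ask the user
    return merged, conflicts
```

The point of the design is that the function never guesses: anything it
cannot decide from the two mappings alone is handed back to the user, much
like a conflict in a regular file.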
I should also stress that the disallowing of branches on .hgtags is just
a thought experiment on my part. I wanted to look at alternatives to
the current approach (making a merge algorithm for .hgtags that,
AFAICS, seems very hard to get right). There might very well be better
alternatives than the one I proposed above, and I just wanted to
encourage a thought process where we try to approach the problem from
different angles.

I should explain why I think the current approach (fix the merge
algorithm for .hgtags) is very hard to get right. Basically this merge
algorithm must fulfill all of the following criteria, and I currently
don't know if that is possible at all:
1) The algorithm MUST always succeed. Since this algorithm will be run
e.g. as part of "hg tags", asking the user to resolve tag ambiguities
is not an option.
2) Requirements (g) and (i) must be fulfilled. This means that the
algorithm can only use information from the tags themselves.
Using "tip-most"-ness or other "arbitrary" measures to resolve
ambiguity is not acceptable. (However, using relative commit times of
tags might be acceptable.)
3) It must behave "intuitively". I.e. in all cases where most users
would agree on a desired behaviour, the merge algorithm must do the
Right Thing, of course without violating any of the other
requirements.
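For illustration only, here is a Python sketch of a resolver that tries to
respect criteria 1 and 2 by looking at nothing but the tag definitions and
the commit times of the changesets that recorded them. The TagDef record
and the resolve function are hypothetical names, and the hash comparison
used to break ties is an arbitrary placeholder that arguably runs against
the spirit of criterion 2. The sketch makes no claim about criterion 3.

```python
from typing import NamedTuple


class TagDef(NamedTuple):
    name: str         # tag name, e.g. "v1.0"
    node: str         # changeset hash the tag points to
    committed: float  # commit time of the changeset that recorded the tag


def resolve(defs):
    """Pick one node per tag name; never fails and never asks the user."""
    winners = {}
    for d in defs:
        best = winners.get(d.name)
        # The most recently committed definition wins. Equal times fall back
        # to comparing hashes (placeholder tie-break, assumed for the sketch).
        if best is None or (d.committed, d.node) > (best.committed, best.node):
            winners[d.name] = d
    return {name: d.node for name, d in winners.items()}
```

Because the winner depends only on the definitions themselves, the result
is the same regardless of the order in which the heads are visited. Whether
that result is the "intuitive" one in every case is exactly the open
question.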
> > > Ok, what does this tell us about the design? First, points (f)
> > > and (c) basically says tags must be version controlled. And this
> > > basically means it must happen exactly in parallel with the
> > > project's DAG. Keeping the tag data in .hgtags meets those
> > > requirements with the added benefit of not adding a second
> > > namespace. Also, (b) falls nicely out of this, though the
> > > actually merging could be friendlier.
> >
> > I agree that tags MUST be version controlled, but I don't think
> > they should be branchable. Branchable tags directly violates (i).
>
> You wrote:
> > i) Tags are global (to the repository). I.e. the result of "hg
> > tags" and "hg up -C <tag>", etc. must be independent of the current
> > working copy. This concept has been referred to as the "global tag
> > context" in this discussion.
>
> At no point does 'hg tags' reference the .hgtags file sitting in the
> working directory.

Sorry, my bad. I still feel, however, that since tags are global to a
repository, it fundamentally does not make sense to allow branches on
the tag definitions (.hgtags) _within_ a repository.

> > j) If two users independently create a tag "foo" pointing to
> > different changesets in their respective repositories, merging the
> > two repositories MUST result in a conflict that cannot be
> > _automatically_ resolved. This is a feature.
>
> Why?
>
> For a tag like "the-official-1.0-release", if this happens, it's user
> error. Not our problem. Yes, there's a potential for someone to do
> something malicious here, but if you're pulling code from people you
> can't trust without reviewing it, tags are the very least of your
> problems.

I was thinking about the case where (friendly) developers independently
and coincidentally create tags with the same name (e.g. "working"). In
this case, I thought it would be useful if, when syncing the
repositories, the system notified users of the apparent conflict,
instead of making an "arbitrary" decision and possibly redefining some
developers' "working" tag without warning.

However, I now see that if these developers' tags were not meant to be
shared (probably the case with the "working" tag), they should make the
tag local instead.

I guess we can drop the (j) requirement if people do not like it.
However, I still think it would be nice of the system to warn me when
redefining tags while pulling in changes from a remote repository.
Remember that even if the redefined tags end up on a separate branch,
and thus do not directly clobber my existing .hgtags files, they DO
still affect the global tag context.

> For a tag like "the-latest-build-that-actually-works", if this
> happens, it's a don't-care. Tags on both branches are equally valid
> unless one obsoletes the other. Using the most recently committed or
> pulled version of the tag (aka tip-most) is -the right thing to do-.

For tags with a common ancestry, I think it makes sense to use the most
recently committed version of the tag.

Have fun!
...Johan
--
Johan Herland, <johherla at online.no>
www.herland.net