Adding merge --ancestor option?
Matt Mackall
mpm at selenic.com
Sun Aug 5 15:13:42 UTC 2012
On Fri, 2012-08-03 at 09:48 +0200, Mathias De Maré wrote:
>
>
> On Thu, Mar 22, 2012 at 5:33 PM, Angel Ezquerra Moreu
> <angel.ezquerra at gmail.com> wrote:
> On Thu, Mar 22, 2012 at 4:21 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Thu, 2012-03-22 at 15:44 +0100, Angel Ezquerra Moreu wrote:
> >> On Thu, Mar 22, 2012 at 2:37 PM, Matt Mackall <mpm at selenic.com> wrote:
> >> > On Wed, 2012-03-21 at 20:48 -0400, Greg Ward wrote:
> >> >> On 21 March 2012, Matt Mackall said:
> >> >> > > 1) Alice and Bob are working concurrently from the same changeset on
> >> >> > > branch 1.0
> >> >> > > 2) Alice commits on 1.0
> >> >> > > 3) Alice merges to 1.1
> >> >> > > 4) Alice merges to default
> >> >> > > 5) Bob commits on 1.0
> >> >> > > 6) Bob merges to 1.1, gets a conflict, resolves it
> >> >> > > 7) Bob merges to default
> >> >> > > 8) Alice pushes and goes home: she's done her day's work
> >> >> > > 9) Bob attempts to push and fails: "push creates remote heads"
> >> >> > > 10) Bob pulls
> >> >> > > 11) Bob merges with Alice on 1.0, 1.1, and trunk
> >> >> > > 12) Bob pushes and goes home: he's done his day's work
> >> >> > > 13) Carl starts work at the tip of branch 1.0 (Bob's merge with Alice)
> >> >> > > 14) Carl merges 1.0 to 1.1: FAIL: he gets Bob's conflict!
> >> >> >
> >> >> > This is yet another case where we can't do any meaningful
> >> >> > differentiation between possible ancestors (the commits in (2) and (5)
> >> >> > in this case). We could perhaps walk the graph and notice that (5) has a
> >> >> > descendant merge with a conflict, and thus score it higher, but it'll
> >> >> > still be trivial to create scenarios with ties.
> >> >>
> >> >> I was confused at first by how you can detect conflict after-the-fact.
> >> >
> >> > Simple. A merge without conflicts will have no files listed in the
> >> > changeset. In this scheme, we'd try to pick the merge path that had the
> >> > most conflicts already resolved. So we'd notice that one of the choices
> >> > of ancestor implied merge 'legs' including Bob's conflict resolution
> >> > from (6) and choose it over the one with no resolutions in its legs.
> >> >
> >> > This tweak is much more work than it's worth, though, as it nibbles only
> >> > a small chunk off the ambiguous domain.
> >> >
> >> >> > So there are two ways we can go:
> >> >> >
> >> >> > - allow manual ancestor selection (restricted to heads(::x and ::y))?
> >> >> > - invent a merge operator that's well-defined for multiple ancestors
> >> >> >
> >> >> > It's not too hard to see how the latter might work, if we ignore
> >> >> > renames.
> >> >>
> >> >> That would indeed be nifty. I'll have to screw on the old thinking cap
> >> >> and cogitate over this a bit.
> >> >
> >> > I'm starting to write up some design notes for this idea, which I'm
> >> > calling "consensus merge".
> >> >
> >> > A quick measurement on the Mercurial repo shows:
> >> >
> >> > 1911 merges
> >> > 83 with two or more merge ancestors
> >> > 1 with three
> >>
> >> Matt,
> >>
> >> is there a simple way (e.g. revset) to repeat that measurement? I
> >> suspect that Mercurial's history is probably more linear than most,
> >> given the patch-based workflow, the excellent review process and the
> >> high commit quality standards. The fact that there are only 2 named
> >> branches probably contributes to that as well.
> >>
> >> I could repeat those measurements on some of our repos to give you
> >> another measurement point.
> >
> > I did this:
> >
> > hg log --template '{rev}\n' -r 'merge()' > merges
> > for f in `cat merges`; do
> >   echo -n "$f: "
> >   hg log -r "heads(::p1($f) and ::p2($f))" --template "{rev} "
> >   echo
> > done > merge-ancestors
> >
> > You can also do something like this:
> >
> > $ hg dbsh
> > loaded repo : /home/mpm/hg
> > using source: /home/mpm/hg/mercurial
> > >>> d = {}
> > >>> for m in repo.revs("merge()"):
> > ...     d[m] = repo.revs('heads(::p1(%d) and ::p2(%d))', m, m)
> > ...
> > >>> len(d)
> > 1911
> > >>> len([x for x in d if len(d[x]) >= 2])
> > 83
> >
> > It's actually not clear from this measurement that any of these merges
> > were 'ambiguous' based on the current algorithm, which picks the first
> > common ancestor furthest from root.
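
For readers without a repository at hand, the measurement quoted above can be
sketched in plain Python on a toy history. This is only an illustration under
stated assumptions: the PARENTS graph is hypothetical data, the helper names
(ancestors, merge_ancestor_heads) are made up for this sketch, and none of it
uses Mercurial's own API. It mimics the revset heads(::p1(m) and ::p2(m)) for
each merge m and then counts merges by the number of candidate ancestors.

```python
from collections import Counter

# Hypothetical toy DAG: revision -> tuple of parent revisions.
# Revisions 3 and 4 both merge 1 and 2 (a criss-cross), and 5 merges 3 and 4.
PARENTS = {
    0: (),
    1: (0,),
    2: (0,),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),
}


def ancestors(rev):
    """Return rev plus all of its ancestors, like the revset '::rev'."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(PARENTS[r])
    return seen


def merge_ancestor_heads(rev):
    """Mimic 'heads(::p1(rev) and ::p2(rev))' for a merge revision."""
    p1, p2 = PARENTS[rev]
    common = ancestors(p1) & ancestors(p2)
    # A common ancestor is a head if no other common ancestor is its child.
    has_child = {p for r in common for p in PARENTS[r] if p in common}
    return common - has_child


# Merges are the revisions with two parents.
merges = [r for r, ps in PARENTS.items() if len(ps) == 2]
heads_by_merge = {m: merge_ancestor_heads(m) for m in merges}

# Histogram: number of candidate ancestors -> number of merges.
histogram = Counter(len(h) for h in heads_by_merge.values())

for m in merges:
    print(m, sorted(heads_by_merge[m]))
print(dict(histogram))
```

In this toy history only revision 5 has more than one candidate ancestor
(revisions 1 and 2), which is the criss-cross situation the thread is about.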
>
>
> Umm, I am a bit surprised. I tried this on 3 of our repos. Looking at
> the 3 corresponding merge-ancestors files, none of them has a line
> showing more than 1 possible ancestor (if I understood what you did
> properly, in cases where there is more than 1 ancestor I should get a
> line such as "148: 124 131", right?)
>
> In particular, this is the data I got:
>
> - Repo 1: 1270 revisions, 158 merges, 27 branches (16 inactive, 4 closed)
> - Repo 2: 1054 revisions, 82 merges, 22 branches (11 inactive, 3 closed)
> - Repo 3: 513 revisions, 41 merges, 10 branches (2 inactive, 1 closed)
>
> In all cases the number of merges that may be ambiguous is 0.
>
> We are sometimes seeing merge issues because of multiple common
> ancestors.
> On our repository with 1453 revisions, we see the following numbers:
> 1 common ancestor: 397 merges
> 2 common ancestors: 15 merges
> 3 common ancestors: 1 merge
>
> We have noticed it's possible to try to work around this by merging
> the correct intermediate changesets first, but this is quite
> complicated (we also have a number of users who are quite new to
> Mercurial and find it very hard to understand).
>
> I saw there is a wiki page with a proposal on resolving these merges:
> http://mercurial.selenic.com/wiki/ConsensusMerge
> Is someone looking at this already? We would be glad to help test such
> a change.
That's on my plate, but the 2.3 cycle was way too busy for me to make
any headway on it.
--
Mathematics is the supreme nostalgia of our time.