[PATCH] revset: improves time complexity of 'roots(xxx)'
Augie Fackler
raf at durin42.com
Mon Jun 22 22:55:05 UTC 2015
On Mon, Jun 22, 2015 at 01:51:10PM -0700, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at fb.com>
> # Date 1434993552 25200
> # Mon Jun 22 10:19:12 2015 -0700
> # Node ID 6a2c2c13784761bbd8b1c46d2888257d234ae591
> # Parent 7fdd1782fc4ee9da87d8af13e806dc9055db2c38
> revset: improves time complexity of 'roots(xxx)'
Queued, thanks
>
> The canonical way of doing 'roots(X)' is 'X - children(X)'. This is what the
> implementation used to be. However, computing children is expensive because it
> is unbounded. Any changesets in the repository may be a children of '0' so you
> have to look at all changesets in the repository to compute children(0).
> Moreover the current revsets implementation for children is not lazy, leading to
> bad performance when fetching the first result.
>
>
> There is a more restricted algorithm to compute roots:
>
> roots(X) = [r for r in X if not parents(r) & X]
>
> This achieve the same result while only looking for parent/children relation in
> the X set itself, making the algorithm 'O(len(X))' membership operation.
> Another advantages is that it turns the check into a simple filter, preserving
> all laziness property of the underlying revsets.
>
> The speed is very significant and some lazyness is restored.
>
> -) revset without 'roots(...)' to compare to base line
> 0) before this change
> 1) after this change
>
> revset #0: roots((tip~100::) - (tip~100::tip))
> plain min last
> -) 0.001082 0.000993 0.000790
> 0) 0.001366 0.001385 0.001339
> 1) 0.001257 92% 0.001028 74% 0.000821 61%
>
> revset #1: roots((0::) - (0::tip))
> plain min last
> -) 0.134551 0.144682 0.068453
> 0) 0.161822 0.171786 0.157683
> 1) 0.137583 85% 0.146204 85% 0.070012 44%
>
> revset #2: roots(tip~100:)
> plain min first last
> -) 0.000219 0.000225 0.000231 0.000229
> 0) 0.000513 0.000529 0.000507 0.000539
> 1) 0.000463 90% 0.000269 50% 0.000267 52% 0.000463 85%
>
> revset #3: roots(:42)
> plain min first last
> -) 0.000119 0.000146 0.000146 0.000146
> 0) 0.000231 0.000254 0.000253 0.000260
> 1) 0.000216 93% 0.000186 73% 0.000184 72% 0.000244 93%
>
> revset #4: roots(not public())
> plain min first
> -) 0.000478 0.000502 0.000504
> 0) 0.000611 0.000639 0.000634
> 1) 0.000604 0.000560 87% 0.000558
>
> revset #5: roots((0:tip)::)
> plain min max first last
> -) 0.057795 0.004905 0.058260 0.004908 0.038812
> 0) 0.132845 0.118931 0.130306 0.114280 0.127742
> 1) 0.111659 84% 0.005023 4% 0.111658 85% 0.005022 4% 0.092490 72%
>
> revset #6: roots(0::tip)
> plain min max first last
> -) 0.032971 0.033947 0.033460 0.032350 0.033125
> 0) 0.083671 0.081953 0.084074 0.080364 0.086069
> 1) 0.074720 89% 0.035547 43% 0.077025 91% 0.033729 41% 0.083197
>
> revset #7: 42:68 and roots(42:tip)
> plain min max first last
> -) 0.006827 0.000251 0.006830 0.000254 0.006771
> 0) 0.000337 0.000353 0.000366 0.000350 0.000366
> 1) 0.000318 94% 0.000297 84% 0.000353 0.000293 83% 0.000351
>
> revset #8: roots(0:tip)
> plain min max first last
> -) 0.002119 0.000145 0.000147 0.000147 0.000147
> 0) 0.047441 0.040660 0.045662 0.040284 0.043435
> 1) 0.038057 80% 0.000187 0% 0.034919 76% 0.000186 0% 0.035097 80%
>
> revset #0: roots(:42 + tip~42:)
> plain min max first last sort
> -) 0.000321 0.000317 0.000319 0.000308 0.000369 0.000343
> 0) 0.000772 0.000751 0.000811 0.000750 0.000802 0.000783
> 1) 0.000632 81% 0.000369 49% 0.000617 76% 0.000358 47% 0.000601 74% 0.000642 81%
>
> diff --git a/contrib/all-revsets.txt b/contrib/all-revsets.txt
> --- a/contrib/all-revsets.txt
> +++ b/contrib/all-revsets.txt
> @@ -21,12 +21,10 @@ parents(20000)
> # Used in revision 186fd06283b4
> (_intlist('20000\x0020001')) and merge()
> # Used in revision 911f5a6579d1
> p1(20000)
> p2(10000)
> -# Used in revision f140d6207cca
> -roots(0:tip)
> # Used in revision b6dc3b79bb25
> 0::
> # Used in revision faf4f63533ff
> bookmark()
> # Used in revision 22ba2c0825da
> @@ -99,28 +97,37 @@ 0::tip
> all()
> draft()
> ::tip
> draft() and ::tip
> ::tip and draft()
> -roots(0::tip)
> author(lmoscovicz)
> author(mpm)
> -42:68 and roots(42:tip)
> ::p1(p1(tip))::
> public()
> :10000 and public()
> :10000 and draft()
> -roots((0:tip)::)
> (not public() - obsolete())
>
> # The one below is used by rebase
> (children(ancestor(tip~5, tip)) and ::(tip~5))::
>
> # those two `roots(...)` inputs are close to what phase movement use.
> roots((tip~100::) - (tip~100::tip))
> roots((0::) - (0::tip))
>
> +# more roots testing
> +roots(tip~100:)
> +roots(:42)
> +roots(not public())
> +roots((0:tip)::)
> +roots(0::tip)
> +42:68 and roots(42:tip)
> +# Used in revision f140d6207cca
> +roots(0:tip)
> +# test disjoint set with multiple roots
> +roots((:42) + (tip~42:))
> +
> # Testing the behavior of "head()" in various situations
> head()
> head() - public()
> draft() and head()
> head() and author("mpm")
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -1753,13 +1753,17 @@ def reverse(repo, subset, x):
> def roots(repo, subset, x):
> """``roots(set)``
> Changesets in set with no parent changeset in set.
> """
> s = getset(repo, fullreposet(repo), x)
> - subset = subset & s# baseset([r for r in s if r in subset])
> - cs = _children(repo, subset, s)
> - return subset - cs
> + parents = repo.changelog.parentrevs
> + def filter(r):
> + for p in parents(r):
> + if 0 <= p and p in s:
> + return False
> + return True
> + return subset & s.filter(filter)
>
> def sort(repo, subset, x):
> """``sort(set[, [-]key...])``
> Sort set by keys. The default sort order is ascending, specify a key
> as ``-key`` to sort in descending order.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel
More information about the Mercurial-devel
mailing list