[PATCH 2 of 2] revset: make children only look at commits >= minrev
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Thu Sep 25 00:09:17 UTC 2014
On 09/23/2014 04:35 PM, Durham Goode wrote:
> Pierre-Yves and I have come to a stalemate in real life, so perhaps the
> community will have an opinion.
>
> I believe these changes are small enough and impactful enough to go in
> as is. They don't cause any regressions in the revset benchmarks, and
> are unlikely to cause significant regressions in any other cases. While
> the same performance might be achievable by refactoring the underlying
> revset classes to be smarter, the current change is 3 lines, buys us a
> 40% perf improvement, and doesn't make a future refactor any more
> difficult.
This change -do- makes it more difficult to improves revset in the
future. But making revset function less homogeneous you are making it
harder to take advantage of future improvement in the core class. You
also make it harder to test the impact of improvement in those core
class. Moreover by doing dirty optimization for the common case you hide
obvious issue, letting them being expressed again later when someone add
new usage of revset. Finally, doing benchmark and regression testing
gets harder too. The more differently each revsets is, the harder is it
to get a good picture of the performance impact with testing only a few
revsets. If all revsets function build it result in a different way we
end needing to test tens of combination for all of them which make such
benchmark eventually impossible.
This is not just theoretical statement. This is my very experience of
someone that works on revset and have to revert this kind of changes to
be able to do so.
The lazy revset implementation is young and crappy. It has been done by
an intern in a few months and left uncompleted. (no offense intended for
the said intern, nothing unexpectedly bad there). What this
implementation now needs is love, not hotfix. Improving dramatically the
current implementation is -simple- and -cheap-. We had O(n²) stupidity
in a common code path for 5 months, and it was solvable by a 3 line
patches to a core class.
The case you are trying to address in your patch (smallest set should be
used as the base when doing "and" operation) have been spotted multiple
time (parents, children), affect multiple real world command (histedit,
rebase, most probably evolve) And a lot of other revset could benefit
from this approach (p1, p2, etc). There is obvious way to fix it once
and for all in the handfull of smartset classes: makes the __and__
function aware of some length hint and use that to swap the operand when
it seems smart to do so. Fixing your case using this approach is not a
huge amount of work.Detecting that this issue is a pattern and deciding
how solve it properly were most of the work and it is already here. You
do not have to produce a complete implementation of this idea to solve
your idea. Just touching the couple of classes involved in the case you
are trying to optimize are enough. And this will speedup some other
revset throughout the code base. Doing only part of this new logic is
okay once you have started the work in that direction other people may
complete the other bricks over time without having to do the whole
diagnostic process you already did.
So please send the couple of few lines patches that would fix this issue
the right way instead of building technical debt. More time it is
necessary to produce such patches have been already invested in this debate.
--
Pierre-Yves David
More information about the Mercurial-devel
mailing list