[PATCH 2 of 5] adjustlinkrev: use C implementation to test ancestor if possible
Jun Wu
quark at fb.com
Tue Nov 8 17:15:06 UTC 2016
Excerpts from Pierre-Yves David's message of 2016-11-08 17:28:21 +0100:
>
> On 11/07/2016 02:01 AM, Jun Wu wrote:
> > Excerpts from Pierre-Yves David's message of 2016-11-07 01:08:18 +0100:
> > […]
> >> What about an intermediate version that uses C code to fill a Python heap
> >> object? That seems a good way to get a grasp on the boost; I think having
> >> a small Cython proof of concept would help.
> >
> > That might be helpful. However, I don't have immediate plans for it. Because
> > we will use C code to do fast "isancestor" test for the "changelog.i" walk,
> > and if we pre-build the linkrev database, there is no need to do
> > "changelog.d" walk. So "lazyancestors" won't be actually used in
> > "adjustlinkrev", which is my main focus right now.
>
> Can you setup a plan page for this linkrev database project ?
This is pretty simple and straightforward, and probably does not need a plan
page. The database maintains a map from (path, fnode) to [linkrev] (a list
of possible linkrevs). The first version is being reviewed internally. A
preview is at https://bpaste.net/show/8634205271dc . I'll send it upstream
as Augie suggested once it's more polished, and will try to make it work for
the non-filelog case that Martin mentioned on IRC.
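For reference, the database idea can be sketched in a few lines of Python.
The class and method names here are illustrative only, not the actual
patch's API:

```python
# Hypothetical sketch of the (path, fnode) -> [linkrev] map described above.
class LinkrevDatabase:
    def __init__(self):
        # (path, fnode) -> list of candidate changelog revisions
        self._map = {}

    def add(self, path, fnode, linkrev):
        """Record that changelog revision `linkrev` introduced `fnode` at `path`."""
        self._map.setdefault((path, fnode), []).append(linkrev)

    def candidates(self, path, fnode):
        """Return every changelog revision that may introduce this file node.

        adjustlinkrev can then pick the first candidate that is an ancestor
        of the starting revision, skipping the changelog.d walk entirely.
        """
        return self._map.get((path, fnode), [])

db = LinkrevDatabase()
db.add('a.txt', 'fnode1', 3)
db.add('a.txt', 'fnode1', 7)   # same file content reintroduced later
print(db.candidates('a.txt', 'fnode1'))  # [3, 7]
print(db.candidates('b.txt', 'fnode1'))  # []
```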
> >>> I'd note that I think the C code may have room to improve. And if we are
> >>> going to add some "state" to speed up testing ancestors, I think it should
> >>> be done globally, i.e. on the C index object. So the state is private to the
> >>> C code and the Python code just calls index.isancestor, without knowing
> >>> about the details. As the changelog is shared globally and it may be a waste
> >>> of memory to have separated states for each fctx.
> >>
> >> I'm a bit afraid of the space complexity of such a cache. However, one of
> >> your colleagues at the sprint had some ideas around this.
> >
> > The memory usage of the global cache would be something like O(log(N) * N)
> > or O(N), depending on how useful we want it to be.
>
> I'm not sure how you foresee this cache, can you elaborate ? A naive
> persistent cache approach would give a O(N²) space.
If I'm going to write the O(N^2) version, you can safely assume that I
couldn't write fastannotate :)
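To make the O(log(N) * N) figure concrete: for the linear-parent (tree)
case, a binary-lifting ancestor table is one classic structure with exactly
that footprint. This is only an illustrative sketch under that assumption,
not the actual C index code; a real changelog DAG has merges (two parents)
and needs more care:

```python
def build_up_table(parents):
    """Binary-lifting table for a tree of revisions.

    parents[v] is the parent rev of v (or -1 for roots); revs 0..N-1 are
    numbered in topological order. up[k][v] is the 2**k-th ancestor of v,
    or -1 if it does not exist. Memory is O(N * log N).
    """
    n = len(parents)
    logn = max(1, n.bit_length())
    up = [list(parents)]
    for k in range(1, logn):
        prev = up[k - 1]
        up.append([prev[prev[v]] if prev[v] != -1 else -1 for v in range(n)])
    return up

def isancestor(up, a, b):
    """Return True if rev a is an ancestor of (or equal to) rev b.

    Relies on topological numbering: every ancestor of b has a smaller
    number than b, so we can greedily jump 2**k parents at a time while
    staying at or above a. Each query is O(log N).
    """
    if a > b:
        return False
    for k in reversed(range(len(up))):
        if up[k][b] >= a:   # jumping 2**k parents does not overshoot below a
            b = up[k][b]
    return b == a

# Tiny tree: 0 <- 1, 1 <- 2, 1 <- 3, 3 <- 4
up = build_up_table([-1, 0, 1, 1, 3])
print(isancestor(up, 1, 4))  # True
print(isancestor(up, 2, 4))  # False
```

Each of the log N rows is one N-entry array, hence the O(N * log N) memory;
a sparser variant (fewer rows) trades query time for space.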
> > This is better than the "_ancestrycontext" approach - suppose every ctx has
> > an "_ancestrycontext", it's N^2 memory usage, where N is len(repo).
>
> Except we don't do that right now. Especially, if we use many changectx
> objects with a _ancestrycontext, it is likely the same object is used for
> all of them, if I remember correctly.
But that does not speed up the changelog.d walk. And I suspect that reusing
the same _ancestrycontext object along the chain could lead to incorrect
results in some cases.
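A toy illustration of that hazard, with hypothetical helpers rather than
Mercurial's actual data structures: an ancestry set computed for one
starting revision is silently wrong when aliased for a walk from a
different revision.

```python
# Tiny DAG with a merge; parents maps rev -> tuple of parent revs.
parents = {0: (), 1: (0,), 2: (0,), 3: (1, 2)}

def ancestors(rev):
    """All ancestors of rev, including rev itself (hypothetical helper)."""
    seen, stack = set(), [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen

ctx1 = ancestors(1)      # correct ancestry context for a walk starting at rev 1
shared = ctx1            # aliased (not recomputed) for a walk starting at rev 2
print(2 in shared)       # False: rev 2 wrongly looks like a non-ancestor
print(2 in ancestors(2)) # True: a fresh context gives the right answer
```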
>
> Cheers,
>
More information about the Mercurial-devel mailing list