Are push times linear in the size of the repo?

Michael O'Connor mkoconnor at gmail.com
Tue Apr 14 18:35:42 UTC 2015


Got it, thanks.  I think I was wrong that the time spent in
setdiscovery.findcommonheads depends on the total number of changesets.
Is it the case that the performance degradation as the number of heads
increases is basically just due to the discovery algorithm?

FWIW, I still think that in the other two cases I mentioned (phases and
obsolescence), a set of all the changesets is constructed.  That seems
unfortunate, although I don't have any suggestions for how to improve the
situation.
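
To make the concern concrete, here is a minimal sketch of the difference
between eagerly building a set of every common changeset and walking only as
far back as needed. It uses a toy repo model (a dict mapping an integer
revision to its parent revisions, with parents always numbered lower than
their children). The function names and the model are hypothetical and are not
Mercurial's API; the second function is only loosely modeled on a
descending-revision walk.

```python
def all_ancestors(parents, heads):
    """Eagerly build the set of every ancestor of `heads`, inclusive.

    Work and memory grow with the number of changesets reachable from
    `heads`, which for common heads is roughly the whole repo.
    """
    seen = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents[node])
    return seen


def missing_ancestors(parents, revs, bases):
    """Return (missing, visited).

    `missing` lists ancestors of `revs` (inclusive) that are not ancestors
    of `bases`, in ascending order. `visited` counts revisions examined.
    Assumes every parent has a lower revision number than its child.
    """
    revsvisit = set(revs)
    basesvisit = set(bases)
    if not revsvisit:
        return [], 0
    missing = []
    visited = 0
    start = max(max(revsvisit), max(basesvisit, default=-1))
    # Walk from the highest revision downwards and stop as soon as no
    # candidate revision is left to classify.
    for curr in range(start, -1, -1):
        if not revsvisit:
            break
        visited += 1
        if curr in basesvisit:
            # curr is common, so its parents are common as well.
            basesvisit.remove(curr)
            revsvisit.discard(curr)
            for p in parents[curr]:
                revsvisit.discard(p)
                basesvisit.add(p)
        elif curr in revsvisit:
            # curr is reachable from revs but not from bases.
            revsvisit.remove(curr)
            missing.append(curr)
            revsvisit.update(parents[curr])
    missing.reverse()
    return missing, visited


# A linear history of 1000 changesets; the remote already has 0..998.
chain = {0: []}
for rev in range(1, 1000):
    chain[rev] = [rev - 1]

common = all_ancestors(chain, [998])                    # touches 999 revisions
outgoing, examined = missing_ancestors(chain, [999], [998])
print(len(common), outgoing, examined)
```

In this toy case the eager set holds nearly every changeset, while the
descending walk examines only the revisions near the tip. Whether the phase
and obsolescence code paths could be restructured this way is a separate
question that the sketch does not answer.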

On Tue, Apr 14, 2015 at 12:15 PM, Gregory Szorc <gregory.szorc at gmail.com>
wrote:

> On Tue, Apr 14, 2015 at 12:05 PM, Michael O'Connor <mkoconnor at gmail.com>
> wrote:
>
>> I was looking at a breakdown of where the time was spent in pushing one
>> empty changeset in a repo with a few hundred thousand changesets.
>>
>> It looks like there are a few places in the course of a push where some
>> object is created that is linear in the total size of the repo (or the
>> number of changesets in common between local and remote).  For example, in
>> determining which changesets are outgoing (see
>> http://selenic.com/repo/hg/file/52ff737c63d2/mercurial/setdiscovery.py#l131),
>> in determining which phase information to push (see
>> http://selenic.com/repo/hg/file/52ff737c63d2/mercurial/exchange.py#l144),
>> and in determining information about obsolescence of future heads (see
>> http://selenic.com/repo/hg/file/52ff737c63d2/mercurial/discovery.py#l276
>> ).
>>
>> In the third case, a patch was recently accepted to only compute this set
>> when the repo actually has an obsstore, but is there a plan to eventually
>> make pushing more incremental and not depend on the entire repo, or is it
>> thought that this won't be a problem?  Or perhaps I've misread the
>> situation?
>>
>
> In general, push times (ignoring the size of the data being exchanged) are
> tied more strongly to the number of DAG heads than to the number of commits in a
> repository. A repository with 200,000 commits and 100,000 heads will likely
> take longer to push to than a repository with 10,000,000 commits and 10
> heads.
>
> There is room to optimize discovery on repos with thousands of heads. This
> is a problem Mozilla has and fixing it in core and by extensions has been
> discussed before.
>