caching pull - stable partitioning of bundle requests
Boris FELD
boris.feld at octobus.net
Wed Sep 26 18:13:13 UTC 2018
Hi everyone,
Pulling from a server involves expensive server-side computation that we
wish to cache. However, since the client can pull any arbitrary set of
revisions, grouping and dispatching the data to be cached is a hard problem.
When we implemented the new discovery for obsolescence markers, we
developed a "stablerange" method to build an efficient way to slice the
changeset graph into ranges. In addition to solving the obsolescence
markers discovery problem, this "stablerange" principle seemed to be
useful for more usages, in particular the caching of pulls.
Right now, with the current pull bundle implementation, here is how it
works: you manually create and manually declare bundles containing either
all changesets (which could also be used for clone bundles) or more
specific ones. When the client requests some changesets, the server
searches for a bundle containing the needed range and sends it. This often
involves more than the requested data, so the client needs to filter out
the extraneous data. Then the client does a discovery to catch any
changesets missing from the bundle. If the server doesn't find a valid
pull bundle, a normal discovery is done. The manual bundle management is
suboptimal, the search for appropriate bundles has a bad complexity, and
the extra roundtrip and discovery add extra slowness.
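For context, the lookup in that existing mechanism amounts to a linear scan over the manually declared bundles. The sketch below is a hypothetical, simplified model (the entry format and the `find_pull_bundle` name are assumptions, not the actual server code); it only illustrates why the search cost grows with the number of declared bundles:

```python
def find_pull_bundle(manifest, requested_heads, common):
    """Simplified model of the existing pull-bundle lookup: scan the
    manually declared bundles and return the first one whose advertised
    heads cover the request and whose bases are already known to the
    client. Cost is O(number of declared bundles) per pull."""
    for entry in manifest:
        if (set(requested_heads) <= set(entry['heads'])
                and set(entry['bases']) <= set(common)):
            return entry['path']
    return None  # no match: fall back to a normal discovery


# Example: a single manually declared bundle covering the whole repo
manifest = [{'path': 'full.hg', 'heads': ['tip-node'], 'bases': []}]
assert find_pull_bundle(manifest, ['tip-node'], []) == 'full.hg'
assert find_pull_bundle(manifest, ['other-node'], []) is None
```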
This weekend, we built a "simple" prototype that uses "stablerange" to
slice the changegroup request in "getbundle" into multiple bundles that can
be reused from one pull to another. That slicing happens as part of a
normal pull, during the getbundle call and after the normal discovery
happens. There is no need for an extra discovery and getbundle call
after it.
With this "stablerange"-based strategy, we start from the set of requested
changesets and generate a set of "standard" ranges covering all of them.
This slicing has a good algorithmic complexity that depends on the size
of the selected "missing" set of changesets, so the associated cost
will scale well with the size of the associated pull. In addition, we no
longer have to do an expensive search through a list of existing bundles.
This helps small pulls scale and increases the number of bundles we can
cache, as the time we spend selecting bundles no longer depends on the
number of cached ones. Since we can exactly cover the client request,
we also no longer need to issue an extra pull roundtrip after the cache
retrieval.
That slicing focuses on producing ranges that:
* Have a high chance to be reusable in a pull selecting similar
changesets,
* Gather most of the changesets in large bundles.
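As a rough illustration of how such stable slicing can achieve both properties, here is a minimal sketch assuming a purely linear history (numbered revisions, no merges); the real stablerange algorithm operates on the full DAG and is more involved. Ranges are aligned on power-of-two boundaries, so overlapping pulls naturally share their larger subranges:

```python
def standard_slices(start, stop):
    """Slice the revision interval [start, stop) into "standard" ranges
    aligned on power-of-two boundaries, with power-of-two lengths.
    Because the boundaries are deterministic, two overlapping pulls end
    up requesting many of the same ranges, which makes caching work."""
    slices = []
    while start < stop:
        # Largest alignment allowed by the current start position...
        align = start & -start if start else stop
        # ...then the largest power-of-two size fitting both the
        # alignment and the remaining interval.
        size = 1
        while size * 2 <= min(align, stop - start):
            size *= 2
        slices.append((start, start + size))
        start += size
    return slices


# Two overlapping pulls share the aligned range (8, 10):
print(standard_slices(0, 10))  # [(0, 8), (8, 10)]
print(standard_slices(5, 10))  # [(5, 6), (6, 8), (8, 10)]
```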
This caching strategy inherits the nice "stablerange" properties
regarding repository growth:
* When a few changesets are appended to a repository, only a few
ranges have to be added.
* The overall number of ranges (and associated bundles) needed to
represent all possible ranges has an O(N log(N)) complexity.
For example, here are the 15 ranges selected for a full clone of
mozilla-central:
* 262114 changesets
* 30 changesets
* 65536 changesets
* 32741 changesets
* 20 changesets
* 7 changesets
* 8192 changesets
* 243 changesets
* 13 changesets
* 114 changesets
* 14 changesets
* 32 changesets
* 16 changesets
* 8 changesets
* 1 changesets
If we only clone a subset of the repository, the larger ranges get
reused (hg clone --rev -5000):
* 262114 changesets found in caches
* 30 changesets found in caches
* 65536 changesets found in caches
* 32741 changesets found in caches
* 20 changesets found in caches
* 7 changesets found in caches
* 2048 changesets found
* 1024 changesets found
* 482 changesets found
* 30 changesets found
* 32 changesets found
* 1 changesets found
* 7 changesets found
* 4 changesets found
* 2 changesets found
* 1 changesets found
As you can see, the larger ranges of this second pull are common with
the previous pull, allowing cached bundles to be reused.
The prototype is available in a small "pullbundle" extension. It focuses
on the slicing itself; we did not implement anything fancy for the
cache storage and delivery. We simply store each generated bundle on disk
and read it back from disk when it is needed again. Others, like Joerg
Sonnenberger or Gregory Szorc, are already working on the "cache
delivery" problem.
We are getting good results out of that prototype when testing it on
clones of mozilla-central and netbsd-src. See the "Example results"
section for details.
The prototype is up and running on our hgweb "mirror" instance
https://mirror.octobus.net/.
The extension comes with a small debug command that produces statistics
about the ranges that multiple random pulls would use.
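The min/10%/25%/50%/75%/90%/95%/max summaries quoted later in this email can be reproduced with a simple nearest-rank percentile report. This sketch is an assumption about the output shape only; the debug command's exact percentile method may differ:

```python
def percentile_report(values):
    """Summarize a distribution in the min/10%/25%/50%/75%/90%/95%/max
    shape used by the debug command's output (nearest-rank style)."""
    data = sorted(values)
    n = len(data)
    report = [('min', data[0])]
    for pct in (10, 25, 50, 75, 90, 95):
        report.append(('%d%%' % pct, data[min(n - 1, n * pct // 100)]))
    report.append(('max', data[-1]))
    return report


# Example: summarize a toy distribution of 100 pull sizes
for label, value in percentile_report(range(1, 101)):
    print('%s: %s' % (label, value))
```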
The "stablerange" implementation currently still[1] lives in the evolve
extension, so for simplicity we put the "pullbundle" extension in the
same repository. This is not ideal but was a simple solution in the time
we could dedicate. This extension is not part of any official release
yet. To test it, you have to install it from the repository for now:
https://www.mercurial-scm.org/repo/evolve/#default
The extension's code is here:
https://www.mercurial-scm.org/repo/evolve/file/tip/hgext3rd/pullbundle.py
The prototype's performance is not stellar, but good enough to give
useful results in a reasonable amount of time. A production-grade
implementation of the stablerange algorithm and storage will fix that.
There is also room for improvement in the algorithms themselves;
multiple sub-problems can be improved. We have started having regular
meetings with university researchers working on graph theory who are
interested in the problem space.
[1] The stablerange principle has been validated in the field and is
ready to be upstreamed.
Example results.
The extension comes with a command to simulate multiple pulls of random
sets of revisions (from a larger set of revisions we define). These
simulations start with a cold cache for simplicity.
Mozilla Central
We gathered 100 sample pulls within 20443 revisions:
* Median pull size: 18 338
* Median number of changegroup used: 132 changegroups.
* Median changeset not cached: 88
* Median ratio of changeset already in the cache: 99.5%
The number of different ranges stays under control as expected:
* Total changeset served: 1 817 955
* Total changeset cache hit ratio: 96% (from a cold cache)
* Distinct ranges cached: 12 983, most of them very small (90% ≤ 32
changesets)
* A small number of (larger) ranges get most of the cache hit.
Providing the smaller ranges from cache might not be a good tradeoff. If
we skip using the cache for smaller ranges, we still get interesting
results. Only caching ranges containing 256 changesets or more:
* Median pull size: 18 940
* Median number of changegroup used: 12
* Median changeset not cached: 1 949
* Median ratio of changeset already in the cache: 90%
* Total changeset served: 1 850 243
* Total changeset cache hit ratio: 87%
* Distinct range cached: 1 150
Another way to reduce the number of served bundles would be to do some
"over-serving": using bundles containing some common changesets.
(See the end of the email for full details)
netbsd
This time, we issued more random pulls (1000) within a somewhat smaller
set of 10 000 changesets.
This resulted in smaller pulls, which also show good results:
* Median pull size: 1673
* Median number of changegroup used: 51
* Median changeset not cached: 50
* Median ratio of changeset already in the cache: 99%
* Total changeset served: 1601087
* Total changeset cache hit ratio: 96%
* Distinct range cached: 51751
Trying the same 256+ changesets limit on caching, we see a stronger
impact, probably because of the smaller pulls:
* Median pull size: 1592
* Median number of changegroup used: 2
* Median changeset not cached: 663
* Median ratio of changeset already in the cache: 46%
* Total changeset served: 1 554 227
* Total changeset cache hit ratio: 56%
* Distinct range cached: 1 914
(See the end of the email for full details)
pypy
pypy testing was done using 1 000 pulls within 16687 changesets.
* Median pull size: 11375
* Median number of changegroup used: 1206
* Median changeset not cached: 12
* Median ratio of changeset already in the cache: 100%
* Total changeset served: 11 167 863
* Total changeset cache hit ratio: 99%
* Distinct range cached: 1 139 537
Installing the 256+ changesets limit gives worse results. This is
probably the result of a shallower pull space and the number of merges
in the pypy repository. The pypy repository is significantly more
branchy than the other ones; there are some known ways we could improve
stablerange partitioning in such cases (to produce larger ranges).
* Median pull size: 11457
* Median number of changegroup used: 9
* Median changeset not cached: 7276
* Median ratio of changeset already in the cache: 37%
* Total changeset served: 11211093
* Total changeset cache hit ratio: 36%
* Distinct range cached: 8964
Full details:
mozilla central, 100 pulls in 20553 revisions, no limit
   mozilla-central> hg debugpullbundlecacheoverlap --count 100 -- 'tip~10000:'
   gathering 100 sample pulls within 20443 revisions
   pull size:
     min: 13176
     10%: 16425
     25%: 17344
     50%: 18338
     75%: 19192
     90%: 19719
     95%: 19902
     max: 20272
   non-cached changesets:
     min: 4
     10%: 10
     25%: 24
     50%: 88
     75%: 237
     90%: 956
     95%: 3152
     max: 17440
   ratio of cached changesets:
     min: 0.0
     10%: 0.947941624918
     25%: 0.987343800064
     50%: 0.995036297078
     75%: 0.998774313882
     90%: 0.999421798208
     95%: 0.999634750848
     max: 0.999795354548
   bundle count:
     min: 74
     10%: 99
     25%: 113
     50%: 132
     75%: 146
     90%: 158
     95%: 169
     max: 186
   ratio of cached bundles:
     min: 0.0
     10%: 0.685082872928
     25%: 0.810810810811
     50%: 0.911392405063
     75%: 0.953020134228
     90%: 0.974683544304
     95%: 0.98125
     max: 0.993377483444
   changesets served:
     total:     1817955
     from cache: 1752642 (96%)
     bundle:      12983
   size of cached bundles:
     min: 1
     10%: 1
     25%: 2
     50%: 4
     75%: 9
     90%: 32
     95%: 64
     max: 8165
   hit on cached bundles:
     min: 1
     10%: 1
     25%: 1
     50%: 2
     75%: 4
     90%: 14
     95%: 20
     max: 100
mozilla central, 100 pulls in 20443 revisions, only caching ranges
of 256 changesets and above
mozilla-central > hg debugpullbundlecacheoverlap --count 100 --min-cache=256 -- 'tip~10000:'
gathering 100 sample pulls within 20443 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 14060
10%: 16910
25%: 17923
50%: 18940
75%: 19471
90%: 19884
95%: 20029
max: 20309
non-cached changesets:
min: 973
10%: 1398
25%: 1707
50%: 1949
75%: 2246
90%: 2590
95%: 3448
max: 19551
ratio of cached changesets:
min: 0.0
10%: 0.839908649729
25%: 0.884512085944
50%: 0.897293365889
75%: 0.91018907563
90%: 0.926944971537
95%: 0.935139218649
max: 0.946839315959
bundle count:
min: 4
10%: 10
25%: 11
50%: 12
75%: 12
90%: 13
95%: 15
max: 16
ratio of cached bundles:
min: 0.0
10%: 0.909090909091
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 1850243
from cache: 1617379 (87%)
bundle: 1150
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 512
75%: 1024
90%: 1024
95%: 1024
max: 8165
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 7
75%: 44
90%: 44
95%: 59
max: 98
netbsd-src 1000 pulls within 10000 revisions:
netbsd-src > hg debugpullbundlecacheoverlap --count 1000 -- '-10000:'
gathering 1000 sample pulls within 10000 revisions
pull size:
min: 10
10%: 339
25%: 865
50%: 1673
75%: 2330
90%: 2752
95%: 2893
max: 3466
non-cached changesets:
min: 0
10%: 3
25%: 7
50%: 16
75%: 50
90%: 137
95%: 239
max: 2787
ratio of cached changesets:
min: 0.0
10%: 0.781553398058
25%: 0.940663176265
50%: 0.987631184408
75%: 0.996178830722
90%: 0.99843688941
95%: 0.998939554613
max: 1.0
bundle count:
min: 10
10%: 28
25%: 37
50%: 51
75%: 65
90%: 78
95%: 88
max: 121
ratio of cached bundles:
min: 0.0
10%: 0.446808510638
25%: 0.673076923077
50%: 0.826086956522
75%: 0.901639344262
90%: 0.942857142857
95%: 0.96
max: 1.0
changesets served:
total: 1601087
from cache: 1539491 (96%)
bundle: 51751
size of cached bundles:
min: 1
10%: 1
25%: 1
50%: 1
75%: 4
90%: 8
95%: 16
max: 2048
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 3
90%: 8
95%: 13
max: 291
netbsd-src 1000 pulls within 10000 revisions, not caching ranges
smaller than 256:
netbsd-src > hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- '-10000:'
gathering 1000 sample pulls within 10000 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 10
10%: 329
25%: 813
50%: 1592
75%: 2271
90%: 2745
95%: 2922
max: 3719
non-cached changesets:
min: 10
10%: 265
25%: 440
50%: 663
75%: 911
90%: 1111
95%: 1229
max: 2852
ratio of cached changesets:
min: 0.0
10%: 0.0
25%: 0.136752136752
50%: 0.461261261261
75%: 0.686327077748
90%: 0.792263056093
95%: 0.829552819183
max: 0.959700093721
bundle count:
min: 0
10%: 0
25%: 1
50%: 2
75%: 3
90%: 4
95%: 4
max: 6
ratio of cached bundles:
min: 0.0
10%: 1.0
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 1554227
from cache: 871680 (56%)
bundle: 1914
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 256
75%: 512
90%: 512
95%: 512
max: 2048
hit on cached bundles:
min: 2
10%: 3
25%: 7
50%: 61
75%: 113
90%: 113
95%: 117
max: 267
pypy 1000 pulls within 16687 changesets:
pypy > time hg debugpullbundlecacheoverlap --count 1000 -- 'tip~2000:'
gathering 1000 sample pulls within 16687 revisions
pull size:
min: 5835
10%: 9165
25%: 10323
50%: 11375
75%: 12248
90%: 12904
95%: 13181
max: 14221
non-cached changesets:
min: 0
10%: 1
25%: 4
50%: 12
75%: 39
90%: 142
95%: 453
max: 12046
ratio of cached changesets:
min: 0.0
10%: 0.986539780521
25%: 0.99640167364
50%: 0.99889963059
75%: 0.99964598637
90%: 0.999911496593
95%: 1.0
max: 1.0
bundle count:
min: 183
10%: 762
25%: 1045
50%: 1206
75%: 1308
90%: 1387
95%: 1427
max: 1631
ratio of cached bundles:
min: 0.0
10%: 0.972310126582
25%: 0.990171990172
50%: 0.995529061103
75%: 0.997882851094
90%: 0.999203821656
95%: 1.0
max: 1.0
changesets served:
total: 11167863
from cache: 11060195 (99%)
bundle: 1139537
size of cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 4
90%: 15
95%: 30
max: 2041
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 3
75%: 20
90%: 245
95%: 848
max: 999
pypy 1000 pulls within 16687 changesets, caching above 256
changesets only:
   time hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 --
'tip~2000:'
   gathering 1000 sample pulls within 16687 revisions
     not caching ranges smaller than 256 changesets
   pull size:
     min: 3629
     10%: 9075
     25%: 10278
     50%: 11457
     75%: 12325
     90%: 12961
     95%: 13245
     max: 14330
   non-cached changesets:
     min: 2605
     10%: 5885
     25%: 6619
     50%: 7276
     75%: 7813
     90%: 8319
     95%: 8577
     max: 11815
   ratio of cached changesets:
     min: 0.0
     10%: 0.296190172981
     25%: 0.344310129221
     50%: 0.368110984417
     75%: 0.391840607211
     90%: 0.417344173442
     95%: 0.445362934971
     max: 0.544580009385
   bundle count:
     min: 1
     10%: 7
     25%: 8
     50%: 9
     75%: 10
     90%: 11
     95%: 11
     max: 12
   ratio of cached bundles:
     min: 0.0
     10%: 1.0
     25%: 1.0
     50%: 1.0
     75%: 1.0
     90%: 1.0
     95%: 1.0
     max: 1.0
   changesets served:
     total:     11211093
     from cache: 4066703 (36%)
     bundle:       8964
   size of cached bundles:
     min: 256
     10%: 256
     25%: 256
     50%: 468
     75%: 510
     90%: 1019
     95%: 512
     max: 1916
   hit on cached bundles:
     min: 1
     10%: 1
     25%: 3
     50%: 13
     75%: 63
     90%: 823
     95%: 153
     max: 958