[External] remotefilelog: Large Manifest

Son Luong Ngoc son.luong at booking.com
Sun Mar 1 18:42:10 UTC 2020


(cc facebook folks)

Hi Pulkit,

I have tested narrow, but it puts a heavy load on I/O and I am not sure how to scale it up further :(

Realistically, remotefilelog's approach with memcached + infinitepush is a lot more appealing for our scaling requirements.
I found https://github.com/facebookexperimental/eden/blob/master/eden/scm/edenscm/hgext/treemanifest/__init__.py which seems to support a server-client treemanifest transfer similar to remotefilelog, but I am not sure how compatible it is.

Going to dig further. I would appreciate it if anybody who has explored Facebook's treemanifest extension (outside of the Facebook context, of course) could share their experience.

Cheers,
Son Luong.

> On Mar 1, 2020, at 12:26, Pulkit Goyal <7895pulkit at gmail.com> wrote:
> 
> Hi,
> 
> On Tue, Feb 25, 2020 at 5:09 PM Son Luong Ngoc via Mercurial
> <mercurial at mercurial-scm.org> wrote:
>> 
>> Just to add some more info:
>> The repo was converted from git -> hg using the latest version of hg-git (cloned from Heptapod) and HG 5.2.2 (or 5.2.1).
>> 
>> The following is the debug info:
>> > hg debugformat
>> format-variant    repo
>> fncache:           yes
>> dotencode:         yes
>> generaldelta:      yes
>> sparserevlog:      yes
>> sidedata:           no
>> copies-sdc:         no
>> plain-cl-delta:    yes
>> compression:       zlib
> 
> You should definitely try out the zstd compression. It won't make a
> large difference but it will make things better.
>> compression-level: default
>> 
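A minimal sketch of the zstd suggestion above, assuming a Mercurial 5.x build with zstd support. The option name `format.revlog-compression` does not appear in this thread and should be checked against `hg help config.format` for the installed version:

```ini
# .hg/hgrc of the repository (sketch only, not taken from the thread)
[format]
# Assumption: request zstd instead of zlib for newly written revlog data.
revlog-compression = zstd
```

Setting this probably affects only data written afterwards. Converting an existing store normally goes through `hg debugupgraderepo`, and `hg debugformat` should then show which compression is in use.
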
>> My current hypothesis is that because this repo was converted from git with hg-git:
>> - too many HEADs/Branches
>> - a non-linear history because of "merge commit"
>> 
>> I would appreciate it if somebody could give me a hint on how to deal with this case :(
>> 
>> Cheers,
>> Son Luong.
>> 
>>> On Feb 21, 2020, at 18:13, Son Luong Ngoc <son.luong at booking.com> wrote:
>>> 
>>> Hey folks,
>>> 
>>> Another question from tinkering with the RemoteFileLog extension: how do you deal with a large changelog and manifest?
>>> In particular, this is my current store after shallow cloning with RemoteFileLog (on a large repo, total clone time is ~20 mins):
>>> 
>>> ~/test/repo-hg/.hg/store> du -sh ./*
>>> 304M    ./00changelog.d
>>> 81M    ./00changelog.i
>>> 1.8G    ./00manifest.d
> 
> The manifest is quite big. A win will be to partially clone manifests
> too. With narrow extension, that can be achieved using treemanifest.
> Not sure if that works with remotefilelog too.
>>> 80M    ./00manifest.i
>>> 0B    ./data
>>> 0B    ./phaseroots
>>> 4.0K    ./undo
>>> 4.0K    ./undo.backupfiles
>>> 0B    ./undo.phaseroots
>>> 
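A rough sketch of the narrow + treemanifest setup mentioned in the reply above, under stated assumptions: the `narrow` extension and the `experimental.treemanifest` switch exist in Mercurial 5.x, but the thread confirms neither their exact behaviour nor whether they combine with remotefilelog, and a repository that already uses a flat manifest would first have to be converted into a tree-manifest one:

```ini
# Server-side and client-side hgrc (sketch only, not taken from the thread)
[extensions]
# Assumption: the bundled narrow extension is available in this build.
narrow =

[experimental]
# Assumption: store manifests per directory so that a narrow clone
# can fetch only the manifest subtrees it needs.
treemanifest = True
```

With this in place, a narrow clone would be requested with something like `hg clone --narrow --include <dir> <url>`, where `<dir>` and `<url>` are placeholders rather than paths from this repository.
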
>>> Client side config
>>> ~> cat ~/.hgrc
>>> [extensions]
>>> remotefilelog =
>>> fsmonitor =
>>> sparse =
>>> 
>>> [remotefilelog]
>>> cachepath = /Users/sluongngoc/test/hgcache
>>> cachelimit = 10
>>> 
>>> Serverside config
>>> 
>>> > cat .hg/hgrc
>>> [paths]
>>> # HG-Git stuff
>>> default = git+ssh://git@gitserver/path/repo.git
>>> 
>>> [remotefilelog]
>>> server = True
>>> serverexpiration = 10
>>> 
>>> Thanks,
>>> Son Luong.
>> 
>> _______________________________________________
>> Mercurial mailing list
>> Mercurial at mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial