[External] remotefilelog: Large Manifest
Augie Fackler
raf at durin42.com
Tue Mar 3 14:51:21 UTC 2020
> On Mar 1, 2020, at 13:42, Son Luong Ngoc via Mercurial <mercurial at mercurial-scm.org> wrote:
>
> (cc facebook folks)
>
> Hi Pulkit,
>
> I have tested narrow, but it puts a heavy load on I/O and I am not sure how to scale it up further :(
>
> Realistically remotefilelog's approach with memcached + infinitepush is a lot more appealing for the scaling requirement.
> I found https://github.com/facebookexperimental/eden/blob/master/eden/scm/edenscm/hgext/treemanifest/__init__.py which seems to support a server-client treemanifest transfer similar to remotefilelog, but I am not sure how compatible it is.
>
> Going to dig further; I'd appreciate it if anybody can share whether they have explored facebook's treemanifest extension (outside of facebook context ofc)
I haven't looked into FB's treemanifest extension, but note that core has its own, better-supported tree manifests (confusingly under the same name; not our fault, sorry). Since it sounds like you're exploring a conversion to hg anyway, I'd encourage converting the large repository to treemanifest format so that narrow can work more effectively.
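
For reference, a minimal sketch of the config involved, assuming the in-core `experimental.treemanifest` option and the bundled narrow extension. As far as I know the option is only consulted when a repository is created, so it would need to be in effect (user-level hgrc, or `--config` on `hg init`) before the hg-git conversion writes its first changeset. History that already exists with a flat manifest is, to my knowledge, not rewritten by turning this on afterwards.

```ini
# user-level hgrc, in effect *before* the destination repo is created
[experimental]
# store one manifest revlog per directory instead of a single flat 00manifest
treemanifest = True

[extensions]
# narrow ships with Mercurial but still has to be enabled (client and server)
narrow =
```

A repository created this way should list `treemanifest` in `.hg/requires`, which is a quick way to check that the setting actually took effect.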
>
> Cheers,
> Son Luong.
>
>> On Mar 1, 2020, at 12:26, Pulkit Goyal <7895pulkit at gmail.com> wrote:
>>
>> Hi,
>>
>> On Tue, Feb 25, 2020 at 5:09 PM Son Luong Ngoc via Mercurial
>> <mercurial at mercurial-scm.org> wrote:
>>>
>>> Just to add some more info:
>>> The repo was converted from git to hg using the latest version of hg-git (cloned from Heptapod) and HG 5.2.2 (or 5.2.1)
>>>
>>> The following is the debug info:
>>>> hg debugformat
>>> format-variant repo
>>> fncache: yes
>>> dotencode: yes
>>> generaldelta: yes
>>> sparserevlog: yes
>>> sidedata: no
>>> copies-sdc: no
>>> plain-cl-delta: yes
>>> compression: zlib
>>
>> You should definitely try out the zstd compression. It won't make a
>> large difference but it will make things better.
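
For reference, a hedged sketch of the setting this refers to, assuming a Mercurial build with zstd support on every machine that touches the repo. The option is `format.revlog-compression`; like other format options it applies to newly created repositories, while an existing one would have to go through `hg debugupgraderepo`.

```ini
[format]
# compression engine for revlog data; the debugformat output here shows zlib
revlog-compression = zstd
```

After the change, `hg debugformat` should report `compression: zstd` for the repo.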
>>> compression-level: default
>>>
>>> My current hypothesis is that because this repo was converted from git with hg-git:
>>> - too many HEADs/Branches
>>> - a non-linear history because of "merge commit"
>>>
>>> I would appreciate it if somebody could give me a hint on how to deal with this case :(
>>>
>>> Cheers,
>>> Son Luong.
>>>
>>>> On Feb 21, 2020, at 18:13, Son Luong Ngoc <son.luong at booking.com> wrote:
>>>>
>>>> Hey folks,
>>>>
>>>> Another question from tinkering with the RemoteFileLog extension: how to deal with a large changelog and manifest?
>>>> In particular, this is my current store after shallow cloning with RemoteFileLog (on a large repo, total clone time is ~20 mins)
>>>>
>>>> ~/test/repo-hg/.hg/store> du -sh ./*
>>>> 304M ./00changelog.d
>>>> 81M ./00changelog.i
>>>> 1.8G ./00manifest.d
>>
>> The manifest is quite big. A win will be to partially clone manifests
>> too. With narrow extension, that can be achieved using treemanifest.
>> Not sure if that works with remotefilelog too.
>>>> 80M ./00manifest.i
>>>> 0B ./data
>>>> 0B ./phaseroots
>>>> 4.0K ./undo
>>>> 4.0K ./undo.backupfiles
>>>> 0B ./undo.phaseroots
>>>>
>>>> Client side config
>>>> ~> cat ~/.hgrc
>>>> [extensions]
>>>> remotefilelog =
>>>> fsmonitor =
>>>> sparse =
>>>>
>>>> [remotefilelog]
>>>> cachepath = /Users/sluongngoc/test/hgcache
>>>> cachelimit = 10
>>>>
>>>> Serverside config
>>>>
>>>>> cat .hg/hgrc
>>>> [paths]
>>>> # HG-Git stuff
>>>> default = git+ssh://git@gitserver/path/repo.git
>>>>
>>>> [remotefilelog]
>>>> server = True
>>>> serverexpiration = 10
>>>>
>>>> Thanks,
>>>> Son Luong.
>>>
>>> _______________________________________________
>>> Mercurial mailing list
>>> Mercurial at mercurial-scm.org
>>> https://www.mercurial-scm.org/mailman/listinfo/mercurial