[External] remotefilelog: Large Manifest
Son Luong Ngoc
son.luong at booking.com
Mon Mar 2 17:18:35 UTC 2020
Hi there,
> On Mar 2, 2020, at 18:06, Pierre-Yves David <pierre-yves.david at ens-lyon.org> wrote:
>
>
>
> On 3/2/20 5:46 PM, Pierre-Yves David wrote:
>> On 3/2/20 10:39 AM, Son Luong Ngoc wrote:
>>> Hi Yves,
>> (My first name is actually "Pierre-Yves" as a whole.)
>>>
>>>> On Mar 1, 2020, at 20:51, Pierre-Yves David <pierre-yves.david at ens-lyon.org> wrote:
>>>> +1. Using zstd doesn't have a huge impact on size, but it speeds up read/write by up to about 50%.
>>>>
>>>> You can see some numbers here: https://www.mercurial-scm.org/repo/hg/rev/bb271ec2fbfb
>>> I can try this, but I'm not sure what the best way to do a format conversion is.
>>> Re-converting (with hg-git) is expensive on a large repo.
>>> Trying "hg --config format.revlog-compression=zstd clone local1 local2" now; hopefully I can just copy the hg-git files over afterwards.
>> You can set format.revlog-compression=zstd in your hgrc (repo or user) and run `hg debugupgraderepo --run`.
>>>
>>>> Can you share some information to help with the diagnosis?
>>> I will not be able to give exact numbers, but I will try to give rough estimates.
>>>
>>>> - How many changesets do you have?
>>> 150k
>>>
>>>> - How many files do you have in your working copy?
>>> It should not be relevant, since we are really trying to get a sparse checkout working; most of my tests are with --noupdate.
>>> But I would say around 200k.
>> `hg manifest --rev tip | wc -l` would give you a number for the latest head.
>>>> - How many different files have you ever had in the repository? `hg manifest --all | wc -l`
>>> 450k
>>>
>>>> - How many merges do you have?
>>> Using "git rev-list --merges --count HEAD" on the default branch of our git repo: 100k
>> Do you mean you have 150k changesets, 100k of them being merges (so ⅔)? If so, this is a pretty high merge ratio.
>> In any case, a 1.8GB manifest for 150k changesets is definitely very high. I have seen repositories with 10× more changesets and a manifest 2× smaller, so I would be surprised if we cannot get something better.
>
> Another question: how many heads do you have?
>
> The output of each of the following commands is interesting:
> - hg heads -T '.\n' -tc | wc -l
> - hg heads -T '.\n' -c | wc -l
Both are the same number, which is about 5k.
But I think it should be a lot smaller (since we are currently converting everything using hg-git instead of only the one trunk git branch).
I just don't know how to filter branches with hg-git right now (or if it's even possible).
Cheers,
Son Luong.
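
For reference, the rough figures quoted in this thread can be turned into per-changeset numbers with a few lines of shell arithmetic. This is only a back-of-the-envelope sketch: it assumes "1.8GB" means decimal gigabytes, the inputs are the approximate counts given above, and the helper names `percent` and `bytes_per_changeset` are made up for illustration.

```shell
#!/bin/sh
# Rough estimates taken from the thread; none of these are exact.
# Assumption: "1.8GB" is read as 1.8 * 10^9 bytes.
changesets=150000          # ~150k changesets
merges=100000              # ~100k merges (git rev-list --merges --count HEAD)
manifest_bytes=1800000000  # ~1.8GB manifest

# Integer percentage of $1 relative to $2, rounded down.
percent() {
    echo $(( $1 * 100 / $2 ))
}

# Average manifest bytes stored per changeset, rounded down.
bytes_per_changeset() {
    echo $(( $1 / $2 ))
}

echo "merge ratio: $(percent "$merges" "$changesets")%"
echo "manifest bytes per changeset: $(bytes_per_changeset "$manifest_bytes" "$changesets")"

# Comparison point mentioned in the thread: a repository with
# 10x more changesets and a manifest 2x smaller.
other_changesets=$(( changesets * 10 ))
other_manifest_bytes=$(( manifest_bytes / 2 ))
echo "comparison bytes per changeset: $(bytes_per_changeset "$other_manifest_bytes" "$other_changesets")"
```

Under these assumptions the manifest costs about 12 KB per changeset here, against roughly 0.6 KB per changeset for the comparison repository, a factor of about 20.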