[External] remotefilelog: Large Manifest
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Mon Mar 2 17:06:16 UTC 2020
On 3/2/20 5:46 PM, Pierre-Yves David wrote:
> On 3/2/20 10:39 AM, Son Luong Ngoc wrote:
>> Hi Yves,
>
> (My first name is actually "Pierre-Yves" as a whole.)
>
>>
>>> On Mar 1, 2020, at 20:51, Pierre-Yves David
>>> <pierre-yves.david at ens-lyon.org> wrote:
>>> +1. Using zstd does not have a huge impact on size, but it speeds up
>>> reads/writes by up to about 50%.
>>>
>>> You can see some numbers here:
>>> https://www.mercurial-scm.org/repo/hg/rev/bb271ec2fbfb
>>>
>> I can try this, but I am not sure what the best way to do a format
>> conversion is. A re-convert (using hg-git) is expensive on a large repo.
>> Trying "hg --config format.revlog-compression=zstd clone local1
>> local2" now, hopefully I can just copy the hg-git files over after.
>
> You can set format.revlog-compression=zstd in your hgrc (repo or user)
> and run `hg debugupgraderepo --run`.
>
>>
>>> Can you share some information to help with diagnosis ?
>> I will not be able to give exact numbers, but I will try to give
>> rough estimates.
>>
>>> - How many changesets do you have ?
>> 150k
>>
>>> - How many files do you have in your working copy ?
>> Should not be relevant since we are really trying to get a sparse-checkout
>> working; most of my tests are with --noupdate.
>> But I would say around 200k
>
> `hg manifest --rev tip | wc -l` would give you a number for the latest head.
>
>>> - How many different files have you ever had in the repository ? `hg
>>> manifest --all | wc -l`
>> 450k
>>
>>> - How many merges do you have ?
>> Using "git rev-list --merges --count HEAD" in the default branch of
>> our git repo: 100k
>
> Do you mean you have 150k changesets, 100k of them being merges ? (so ⅔)
> If so, this is a pretty high merge ratio.
>
> In all cases, having a 1.8GB manifest for 150k changesets is definitely
> very high. I have seen repositories with 10× more changesets and a manifest
> 2× smaller. So I would be surprised if we cannot get something better.
Another question: how many heads do you have?
The outputs of the following two commands are both interesting:
- hg heads -T '.\n' -tc | wc -l
- hg heads -T '.\n' -c | wc -l
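
If it helps, the two commands above can be wrapped in a small script so both
numbers come out labelled. This is only a sketch: `count_lines` and
`report_heads` are hypothetical helper names (not part of Mercurial), and the
labels follow the documented meaning of the flags as I understand them (`-t`
for topological heads, `-c` to include closed branch heads).

```shell
#!/bin/sh
# Hypothetical helper (not part of Mercurial): count the lines on stdin and
# strip the whitespace padding that some `wc` implementations add.
count_lines() {
    wc -l | tr -d '[:space:]'
}

# Hypothetical helper: run the two `hg heads` commands from this mail inside
# the current repository and print both counts with a label.
report_heads() {
    topo=$(hg heads -T '.\n' -tc | count_lines)
    closed=$(hg heads -T '.\n' -c | count_lines)
    echo "hg heads -tc (topological heads): $topo"
    echo "hg heads -c (branch heads, closed included): $closed"
}
```

Source the file from within the repository and call `report_heads`; the
script itself runs nothing until that function is called.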
--
Pierre-Yves David