[External] remotefilelog: Large Manifest

Pierre-Yves David pierre-yves.david at ens-lyon.org
Mon Mar 2 17:06:16 UTC 2020



On 3/2/20 5:46 PM, Pierre-Yves David wrote:
> On 3/2/20 10:39 AM, Son Luong Ngoc wrote:
>> Hi Yves,
> 
> (My first name is actually "Pierre-Yves" as a whole.)
> 
>>
>>> On Mar 1, 2020, at 20:51, Pierre-Yves David 
>>> <pierre-yves.david at ens-lyon.org> wrote:
>>> +1, using zstd doesn't have a huge impact on size, but speeds up 
>>> read/write by up to about 50%.
>>>
>>> You can see some numbers here:
>>> https://www.mercurial-scm.org/repo/hg/rev/bb271ec2fbfb
>>>
>> I can try this, but I'm not sure what's the best way to do a format 
>> conversion. A re-convert (using hg-git) is expensive on a large repo.
>> Trying "hg --config format.revlog-compression=zstd clone local1 
>> local2" now; hopefully I can just copy the hg-git files over afterwards.
> 
> You can set format.revlog-compression=zstd in your hgrc (repo or user) 
> and run `hg debugupgraderepo --run`.
> 
>>
>>> Can you share some information to help with the diagnosis?
>> I will not be able to give exact numbers, but I will try to give 
>> rough estimates.
>>
>>> - How many changesets do you have?
>> 150k
>>
>>> - How many files do you have in your working copy?
>> Should not be relevant since we really try to have a sparse checkout 
>> working; most of my tests are with --noupdate.
>> But I would say around 200k.
> 
> `hg manifest --rev tip | wc -l` would give you the number for the tip 
> revision.
> 
>>> - How many different files have you ever had in the repository?
>>> `hg manifest --all | wc -l`
>> 450k
>>
>>> - How many merges do you have?
>> Using "git rev-list --merges --count HEAD" in the default branch of 
>> our git repo: 100k
> 
> Do you mean you have 150k changesets, 100k of them being merges (so ⅔)? 
> If so, this is a pretty high merge ratio.
> 
> In any case, having a 1.8GB manifest for 150k changesets is definitely 
> very high. I have seen repositories with 10× more changesets and a 
> manifest 2× smaller, so I would be surprised if we cannot get something 
> better.

Another question: how many heads do you have?

The outputs of the following commands are both interesting:
- hg heads -T '.\n' -tc | wc -l
- hg heads -T '.\n' -c | wc -l
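
If it is more convenient, the two counts above can be wrapped in small shell helpers. This is only a sketch: the function names `count_heads` and `report_heads` are made up for illustration, and it assumes `hg` is on the PATH and that the functions are called from inside the repository.

```shell
# Hypothetical helper: print the number of heads that `hg heads` reports
# for the given flags. Each head is rendered as one ".\n" line by the
# template, so counting lines counts heads.
count_heads() {
    hg heads -T '.\n' "$@" | wc -l | tr -d ' '
}

# Hypothetical helper: print both counts with a label.
#   -t / --topo   : show topological heads only
#   -c / --closed : show normal and closed branch heads
report_heads() {
    echo "heads -tc: $(count_heads -tc)"
    echo "heads -c:  $(count_heads -c)"
}
```

Calling `report_heads` from the repository root prints the two numbers side by side, which makes them easy to paste into a reply.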

-- 
Pierre-Yves David


