Configurable resource usage profile and repository role

Pierre-Yves David pierre-yves.david at ens-lyon.org
Wed Sep 13 10:35:41 UTC 2023


Hello everyone,

I would like to share some development thoughts and possible directions 
for Mercurial's performance behavior in the future.


One of the key principles of Mercurial since its conception is 
"versatility". Mercurial is built to be equally suitable for small 
repositories, large repositories, small teams, large teams, small 
hardware, and large hardware. This approach is pretty nice and has been 
working reasonably well so far.

However, when it comes to performance optimization, this approach 
reaches some limits. For example, using more memory for caches can have 
a big impact on some operations, but could cripple them on smaller 
hardware. In the same way, some expensive computations are necessary for 
smooth server operation, but would significantly slow down operations on 
developer machines.

So I think it would make sense to introduce two sets of configuration:

- the first one for finer control over resource profiles (that would 
adjust the default setting for some configuration),

- the second one for a clearer declaration of the repository's intended 
usage (that would also adjust the default setting for some 
configuration).


# About resource profile

I can see about three areas where resource usage could be adjusted: 
memory, cpu and storage.

Having three levels would be a good start: "low", "medium" (the default) 
and "high" (maybe with a fourth option that controls all the others at 
the same time).

From this configuration, we could adjust some of the current values 
(especially cache sizes) and some behaviors. For example, if "storage" 
is marked as "low" and "cpu" is marked as "high", more time can be 
spent optimizing the storage.
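
As a rough illustration of the paragraph above, here is a minimal Python 
sketch of how three per-area levels could be resolved and turned into 
adjusted defaults. Everything in it is an assumption made for the 
example: the option names ("all", "cache-entries", "optimize-storage"), 
the function names and the numbers are hypothetical and do not exist in 
Mercurial.

```python
# Hypothetical sketch: none of these names exist in Mercurial today.
LEVELS = ("low", "medium", "high")
AREAS = ("memory", "cpu", "storage")


def resolve_profile(config):
    """Return one level per area, starting from the "medium" default.

    `config` maps option names to levels. The optional "all" key stands
    in for the possible fourth option that sets every area at once; a
    per-area key overrides it.
    """
    for key, level in config.items():
        if key != "all" and key not in AREAS:
            raise ValueError(f"unknown resource area: {key!r}")
        if level not in LEVELS:
            raise ValueError(f"unknown resource level: {level!r}")
    base = config.get("all", "medium")
    return {area: config.get(area, base) for area in AREAS}


def derive_defaults(profile):
    """Turn a resolved profile into illustrative default settings."""
    # Placeholder numbers, chosen only to show the shape of the mapping.
    cache_entries = {"low": 1_000, "medium": 10_000, "high": 100_000}
    return {
        "cache-entries": cache_entries[profile["memory"]],
        # Scarce storage plus plentiful CPU: spend time optimizing storage.
        "optimize-storage": (
            profile["storage"] == "low" and profile["cpu"] == "high"
        ),
    }


profile = resolve_profile({"storage": "low", "cpu": "high"})
defaults = derive_defaults(profile)
```

In this sketch, leaving "memory" unset keeps it at "medium", so only the 
settings tied to "storage" and "cpu" move away from their usual defaults.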

I could go into more examples of what we could adjust here, but I did 
not want this initial email to grow too large.


# About usage profile

The way a repository is used can change the tradeoff that works best for 
it. For example, if you make a disposable clone for a CI runner, we 
really don't care about optimizing the storage of the information 
received from the server: that clone will be dead in a couple of 
minutes. On the other hand, if you are the server holding the main copy 
of a repository, it makes sense to carefully validate and optimize the 
small content you receive from your clients' pushes.

I can think of the following roles we could declare (with different 
levels of precision).

- server
   - main
   - mirror
- client
   - developer
   - read-only
   - ci
     - temporary
     - persistent
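
One possible reading of "different levels of precision" is that a role 
can be declared at any depth of the tree above, with more precise roles 
refining the defaults of less precise ones. The Python sketch below 
illustrates that reading only. The dotted spelling ("client.ci.temporary"), 
the setting names and the function names are all hypothetical, and 
attaching "validate-incoming" to every server is an assumption of the 
example.

```python
# Hypothetical sketch: the role names come from the list above, the
# dotted spelling and the settings are assumptions.
ROLES = {
    "server": {"main": {}, "mirror": {}},
    "client": {
        "developer": {},
        "read-only": {},
        "ci": {"temporary": {}, "persistent": {}},
    },
}

# Placeholder settings. Only the "main server optimizes, temporary CI
# clone does not" tradeoff is taken from the text.
ROLE_DEFAULTS = {
    "server": {"validate-incoming": True},
    "server.main": {"optimize-incoming": True},
    "client.ci.temporary": {"optimize-incoming": False},
}


def role_chain(role):
    """Validate `role` and list its prefixes, least precise first."""
    parts = role.split(".")
    node = ROLES
    for part in parts:
        if part not in node:
            raise ValueError(f"unknown repository role: {role!r}")
        node = node[part]
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def defaults_for(role):
    """Merge defaults along the chain; more precise roles win."""
    settings = {}
    for name in role_chain(role):
        settings.update(ROLE_DEFAULTS.get(name, {}))
    return settings


main_server = defaults_for("server.main")
throwaway_ci = defaults_for("client.ci.temporary")
```

With this layout, declaring only "client" or "server" is still valid; it 
just picks up fewer role-specific defaults than a more precise 
declaration.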


(Again, I am not going into too much detail, to keep this email short.)

I guess I am not the only one thinking about these problems, so I am 
curious to hear your thoughts.

-- 
Pierre-Yves David



More information about the Mercurial-devel mailing list