Solaris 11.4 hosted repository, TortoiseHG clone attempt consumes all resources

Pierre-Yves David pierre-yves.david at ens-lyon.org
Fri Jun 19 06:36:32 UTC 2020



On 6/18/20 10:49 PM, Scott Newman - NOAA Affiliate wrote:
> On Fri, Jun 12, 2020 at 3:17 PM Pierre-Yves David
> <pierre-yves.david at ens-lyon.org> wrote:
>>
>>
>>
>> On 6/12/20 4:48 PM, Scott Newman - NOAA Affiliate via Mercurial wrote:
>>> Good morning everyone!
>>>
>>> We are currently using Mercurial 5.2.2 hosted on Solaris 11.3 and accessed
>>> by contributors via TortoiseHG 5.0.2 from their Windows Desktops.  We are
>>> in the process of migrating applications to new hosts running Solaris
>>> 11.4.
>>
>> As far as I understand, you use the same versions (Mercurial 5.2.2 on
>> the server, TortoiseHG 5.0.2 on the client) and the same Python
>> (probably 2.7 something)? The only software version difference is
>> Solaris 11.3 vs Solaris 11.4, right?
> 
> Pierre-Yves, so nice to hear from you!  Correct. Python 2.7.18 (tried
> some others with the same result).  I have an update: when we tried
> going back to THG 3.4 the clone worked as expected, but that doesn't
> seem like a good long-term solution, particularly since we will lose
> the ability to export-archive that was introduced somewhere around
> version 4.5, if you recall.

That is very interesting. We are talking about using THG 3.4 on the 
client, while still using Mercurial 5.2.2 on the server, right?

If so, this means that using a new protocol feature introduced between 
3.4 and 5.2 reveals the issue.

Can you confirm this? And if so, can you try to find the exact 
Mercurial version on the client side that first triggers this issue?
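If it helps, here is a rough way to bisect the client version (a 
sketch only: the host, path, and version list are placeholders, and it 
assumes a spare Unix client where pip can build these old releases and 
the resulting `hg` is first on $PATH):

   # Try successive client releases against the same server until the
   # clone starts misbehaving; 3.4 worked for you, 5.0.2 did not.
   for v in 3.5 4.0 4.5 5.0 5.2.2; do
       pip install --user "mercurial==$v"
       if hg clone ssh://hguser@newhost//repos/project /tmp/probe-$v; then
           echo "client $v: clone OK"
       else
           echo "client $v: clone FAILED"
       fi
       rm -rf /tmp/probe-$v
   done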

However, the export-archive feature is something you run on the server 
side, isn't it?


> 
>>
>>>   When trying to clone a copy of the repository hosted on Solaris
>>> 11.4, the clone runs very slowly, the process consumes most of the
>>> memory (64GB) on the host and starts generating "-bash: fork:
>>> Resource temporarily unavailable" errors for users on the box after
>>> about 2 minutes, and the clone fails with a "Server Unexpectedly
>>> closed connection" message.
>>
>> So, the server hosting the repository is the one crumbling while 
>> cloning, right? How are you cloning? ssh or http?
> 
> Cloning via ssh.

Great, can you add:

   [ui]
   debug=yes

to the hgrc of the remote repository (`.hg/hgrc`) and run a clone? This 
should give you a ton of remote output that might help us understand 
what is going on when the memory explodes.
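For instance (a sketch; the host and path below are placeholders), 
something like this captures both the local debug output and the 
`remote:`-prefixed lines coming from the server in a single log:

   hg clone --debug ssh://hguser@newhost//repos/project project-test \
       2>&1 | tee clone-debug.log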

>>>   The same process on Solaris 11.3 has a negligible
>>> impact on resources and finishes in about 10 minutes.
>>>
>>> I have spent several days with the Network and Systems Administrators
>>> trying to resolve this issue without success.  We tried many things,
>>> including adjusting resource configurations, rebuilding Mercurial and
>>> Python, using Mercurial and Python from the working server, using the
>>> pre-built package from Oracle (v4.9.1),
>>
>> How did you transfer the repository between the two servers?
> 
> I used hg clone (via ssh) between the servers without issue.

This clone might have upgraded the repository to a newer format, and 
stumbled on an unknown issue affecting your repository. What does `hg 
debugformat` say on the older server?

> 
>> What does `hg debugformat` say on both ends?
> 
> On Server:
> format-variant    repo
> fncache:           yes
> dotencode:         yes
> generaldelta:      yes
> sparserevlog:      yes
> sidedata:           no
> copies-sdc:         no
> plain-cl-delta:    yes
> compression:       zlib
> compression-level: default
> On client:
> Since I cannot get the clone to finish I do not have a local clone to
> run this command against.  I may not understand specifically what you
> want here.
> 
>> How big is the `.hg/store/` directory on both sides?
> 
> On the server it is 185MB.
> On the client, I cannot get the clone to complete so I am left without
> anything, it seems to clean itself up after the failure.
> 
>> How many revisions do you have in your repository?
> 
> 874 revisions

That is very small, so you are definitely hitting some kind of bad bug. 
Let's find out which one now :-)
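In the meantime, it might help to watch which process actually 
balloons on the server during a clone. A minimal sketch using Solaris' 
prstat, sorted by resident memory:

   # Run on the Solaris 11.4 host while the clone is in progress:
   # sort by RSS, show the top 10 processes, refresh every second.
   prstat -s rss -n 10 1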

Regards,

-- 
Pierre-Yves David


