Solaris 11.4 hosted repository, TortoiseHG clone attempt consumes all resources

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Jun 23 19:12:36 UTC 2020



On 6/22/20 9:49 PM, Scott Newman - NOAA Affiliate wrote:
>>>>> Good morning everyone!
>>>>>
>>>>> We are currently using Mercurial 5.2.2 hosted on Solaris 11.3 and accessed
>>>>> by contributors via TortoiseHG 5.0.2 from their Windows Desktops.  We are
>>>>> in the process of migrating applications to new hosts running Solaris
>>>>> 11.4.
>>>>
>>>> As far as I understand, you use the same versions (Mercurial 5.2.2 on
>>>> server TortoiseHG 5.0.2 on client) and the same python (probably 2.7
>>>> something?) The only software version difference is Solaris 11.3 vs
>>>> Solaris 11.4, right ?
>>>
>>> Pierre-Yves, so nice to hear from you!  Correct. Python 2.7.18 (tried
>>> some others with the same result).  I have an update that when we
>>> tried going back to THG 3.4 the clone worked as expected, but that
>>> doesn't seem like a good long-term solution, particularly since we
>>> will lose the ability to export-archive that  was introduced somewhere
>>> around version 4.5, if you recall.
>>
>> That is very interesting. We are talking about using THG 3.4 on the
>> client, while still using Mercurial 5.2.2 on the server, right?
> 
> Correct.  It is so interesting that the client can have such an impact
> on the server!
> 
>>
>> If so, this means that using a new protocol feature introduced between 3.4
>> and 5.2 reveals the issue.
>>
>> Can you confirm this? And if so, can you try to find the exact client-side
>> Mercurial version that triggers this issue?
> 
> I am scheduled to work on this with another resource tomorrow at 15:00
> EST and will update this thread.  We have confirmed that the problem
> exists in THG4.5.0, so it will be somewhere in between 3.4 and 4.5.0.
> 
>>
>> However, the export-archive thingy is something you run server side,
>> don't you?
>>
> 
> We perform this task on the client side now with the archive function
> and have abandoned the customization in favor of the built-in archive
> functionality added
> 
>>
>>>
>>>>
>>>>>    When trying to clone a copy of the repository hosted on Solaris
>>>>> 11.4 the clone runs very slowly and the process consumes most of the
>>>>> memory (64GB) on the host, starts generating "-bash: fork: Resource
>>>>> temporarily unavailable" errors for users on the box after about 2
>>>>> minutes, and the clone process fails with a " Server Unexpectedly closed
>>>>> connection" message.
>>>>
>>>> So, the server hosting the repository is crumbling while cloning, right?
>>>> How are you cloning? ssh or http?
>>>
>>> Cloning via ssh.
>>
>> Great, can you add:
>>
>>     [ui]
>>     debug=yes
>>
>> in the HGRC of the remote repository and run a clone? This will give you
>> tons of remote output that might help us understand what is going on
>> when the memory explodes.
> 
> Here is the result on the client BEFORE adding debug:
> % hg clone --verbose ssh://<username>@<hostname>//<dirname>/<reponame>
> "C:\Repos\test"
> requesting all changes
> adding changesets
> adding manifests
> adding file changes  ### Processes 123/5396 files, takes 10-15
> minutes, fails here
> transaction abort!
> rollback completed
> abort: stream ended unexpectedly  (got 20593 bytes, expected 32768)
> [command returned code 255 Mon Jun 22 15:31:39 2020]
> 
> When I add the debug entry it stalls at:
> % hg clone --verbose ssh://<username>@<hostname>//<dirname>/<reponame>
> "C:\Repos\test"
> requesting all changes  ### stalls here
> 
>>
>>>>>    The same process on Solaris 11.3 has a negligible
>>>>> impact on resources and finishes in about 10 minutes.
>>>>>
>>>>> I have spent several days with the Network and Systems Administrators
>>>>> trying to resolve this issue without success.  We tried many things,
>>>>> including adjusting resource configurations, rebuilding Mercurial and
>>>>> Python, using Mercurial and Python from the working server, using the
>>>>> pre-built package from Oracle (v4.9.1),
>>>>
>>>> How did you transfer the repository between the two servers?
>>>
>>> I used hg clone (via ssh) between the servers without issue.
>>
>> This clone might have upgraded the repository to a newer format and
>> hit an unknown issue affecting your repository. What does `hg
>> debugformat` say on the older server?
> 
> On older server:
> format-variant    repo
> fncache:           yes
> dotencode:         yes
> generaldelta:      yes
> sparserevlog:       no
> sidedata:           no
> copies-sdc:         no
> plain-cl-delta:     no
> compression:       zlib
> compression-level: default

Okay, so the most notable difference is `sparserevlog`. You might be
hitting some unknown pathological case. Can you try making a new server
clone using `--config format.sparse-revlog=no` during the clone?
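
To double-check which format variants actually differ between the two
servers, something like the following could be used. This is only a
minimal sketch, not part of Mercurial: the helper names are made up for
illustration, and the new-server output is an ASSUMPTION (the old-server
output quoted above with `sparserevlog` flipped to `yes`). Paste the real
`hg debugformat` output from the new server in its place.

```python
def parse_debugformat(text):
    """Parse `hg debugformat` output into a {variant: value} dict."""
    variants = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon: the "format-variant    repo" header line
        variants[name.strip()] = value.strip()
    return variants


def format_differences(old_text, new_text):
    """Return {variant: (old, new)} for every variant whose value differs."""
    old = parse_debugformat(old_text)
    new = parse_debugformat(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Output from the older (Solaris 11.3) server, as quoted in this thread.
OLD_SERVER = """\
format-variant    repo
fncache:           yes
dotencode:         yes
generaldelta:      yes
sparserevlog:       no
sidedata:           no
copies-sdc:         no
plain-cl-delta:     no
compression:       zlib
compression-level: default
"""

# ASSUMED output for the new (Solaris 11.4) server: identical except that
# sparserevlog is enabled. Replace with the real output before relying on it.
NEW_SERVER = "\n".join(
    "sparserevlog:      yes" if line.startswith("sparserevlog:") else line
    for line in OLD_SERVER.splitlines()
)

if __name__ == "__main__":
    for name, (old, new) in format_differences(OLD_SERVER, NEW_SERVER).items():
        print("%s: %s -> %s" % (name, old, new))
```

If `sparserevlog` is the only variant listed after re-cloning with
`--config format.sparse-revlog=no`, the two repositories should no longer
differ in format, which would help confirm or rule out that variant.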

-- 
Pierre-Yves David


