Solaris 11.4 hosted repository, TortoiseHG clone attempt consumes all resources
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Tue Jun 23 19:12:36 UTC 2020
On 6/22/20 9:49 PM, Scott Newman - NOAA Affiliate wrote:
>>>>> Good morning everyone!
>>>>>
>>>>> We are currently using Mercurial 5.2.2 hosted on Solaris 11.3 and accessed
>>>>> by contributors via TortoiseHG 5.0.2 from their Windows Desktops. We are
>>>>> in the process of migrating applications to new hosts running Solaris
>>>>> 11.4.
>>>>
>>>> As far as I understand, you use the same versions (Mercurial 5.2.2 on
>>>> the server, TortoiseHG 5.0.2 on the client) and the same Python
>>>> (probably 2.7 something?). The only software version difference is
>>>> Solaris 11.3 vs Solaris 11.4, right?
>>>
>>> Pierre-Yves, so nice to hear from you! Correct. Python 2.7.18 (tried
>>> some others with the same result). I have an update that when we
>>> tried going back to THG 3.4 the clone worked as expected, but that
>>> doesn't seem like a good long-term solution, particularly since we
>>> will lose the ability to export-archive that was introduced somewhere
>>> around version 4.5, if you recall.
>>
>> That is very interesting. We are talking about using THG 3.4 on the
>> client, right? While still using Mercurial 5.2.2 on the server, right?
>
> Correct. It is so interesting that the client can have such an impact
> on the server!
>
>>
>> If so, this means that using a new protocol feature introduced between
>> 3.4 and 5.2 reveals the issue.
>>
>> Can you confirm this? And if so, can you try to find the exact Mercurial
>> version on the client side that triggers this issue?
>
> I am scheduled to work on this with another resource tomorrow at 15:00
> EST and will update this thread. We have confirmed that the problem
> exists in THG 4.5.0, so it will be somewhere in between 3.4 and 4.5.0.
>
>>
>> However, the export-archive thingy is something you run server side,
>> don't you?
>>
>
> We perform this task on the client side now with the archive function
> and have abandoned the customization in favor of the built-in archive
> functionality added
>
>>
>>>
>>>>
>>>>> When trying to clone a copy of the repository hosted on Solaris
>>>>> 11.4 the clone runs very slowly and the process consumes most of the
>>>>> memory (64GB) on the host, starts generating "-bash: fork: Resource
>>>>> temporarily unavailable" errors for users on the box after about 2
>>>>> minutes, and the clone process fails with a "Server Unexpectedly closed
>>>>> connection" message.
>>>>
>>>> So, the server hosting the repository is crumbling while cloning, right?
>>>> How are you cloning? ssh or http?
>>>
>>> Cloning via ssh.
>>
>> Great, can you add:
>>
>> [ui]
>> debug=yes
>>
>> to the HGRC of the remote repository and run a clone? This will give you
>> tons of remote output that might help us understand what is going on
>> when the memory explodes.
>
> Here is the result on the client BEFORE adding debug:
> % hg clone --verbose ssh://<username>@<hostname>//<dirname>/<reponame>
> "C:\Repos\test"
> requesting all changes
> adding changesets
> adding manifests
> adding file changes ### Processes 123/5396 files, takes 10-15
> minutes, fails here
> transaction abort!
> rollback completed
> abort: stream ended unexpectedly (got 20593 bytes, expected 32768)
> [command returned code 255 Mon Jun 22 15:31:39 2020]
>
> When I add the debug entry it stalls at:
> % hg clone --verbose ssh://<username>@<hostname>//<dirname>/<reponame>
> "C:\Repos\test"
> requesting all changes ### stalls here
>
>>
>>>>> The same process on Solaris 11.3 has a negligible
>>>>> impact on resources and finishes in about 10 minutes.
>>>>>
>>>>> I have spent several days with the Network and Systems Administrators
>>>>> trying to resolve this issue without success. We tried many things,
>>>>> including adjusting resource configurations, rebuilding Mercurial and
>>>>> Python, using Mercurial and Python from the working server, using the
>>>>> pre-built package from Oracle (v4.9.1),
>>>>
>>>> How did you transfer the repository between the two servers?
>>>
>>> I used hg clone (via ssh) between the servers without issue.
>>
>> This clone might have upgraded the repository to a newer format and
>> run into an unknown issue affecting your repository. What does
>> `hg debugformat` say on the older server?
>
> On older server:
> format-variant     repo
> fncache:           yes
> dotencode:         yes
> generaldelta:      yes
> sparserevlog:      no
> sidedata:          no
> copies-sdc:        no
> plain-cl-delta:    no
> compression:       zlib
> compression-level: default
Okay, so the most notable difference is `sparserevlog`. You might be
hitting some unknown pathological case. Can you try making a new server
clone using `--config format.sparse-revlog=no` during the clone?
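
To double-check which format variants differ between the two servers, the
`hg debugformat` output from each can be compared mechanically. The script
below is only a rough sketch and not part of Mercurial: the function names
`parse_debugformat` and `diff_formats` are made up for illustration, and the
"new server" sample is hypothetical (only the old server's output was posted
in this thread), built by assuming that `sparserevlog` is the one variant
that changed.

```python
def parse_debugformat(text):
    """Parse `hg debugformat` output into a {variant: value} dict."""
    variants = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "format-variant repo" header row.
        if not line or line.startswith("format-variant"):
            continue
        name, sep, value = line.partition(":")
        fields = value.split()
        if sep and fields:
            # Keep only the first column (the repository's current value).
            variants[name.strip()] = fields[0]
    return variants


def diff_formats(old, new):
    """Return {variant: (old_value, new_value)} for variants that differ."""
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Output posted for the older (Solaris 11.3) server.
OLD_SERVER = """\
format-variant repo
fncache: yes
dotencode: yes
generaldelta: yes
sparserevlog: no
sidedata: no
copies-sdc: no
plain-cl-delta: no
compression: zlib
compression-level: default
"""

# HYPOTHETICAL sample for the newer server: assumes only sparserevlog changed.
# Replace this with the real `hg debugformat` output from that host.
NEW_SERVER = OLD_SERVER.replace("sparserevlog: no", "sparserevlog: yes")

if __name__ == "__main__":
    changes = diff_formats(parse_debugformat(OLD_SERVER),
                           parse_debugformat(NEW_SERVER))
    for name, (old_value, new_value) in changes.items():
        print(f"{name}: {old_value} -> {new_value}")
```

With the hypothetical sample above this prints `sparserevlog: no -> yes`;
with real output pasted in from both hosts it would list every variant that
changed during the server-to-server clone.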
--
Pierre-Yves David