Solaris 11.4 hosted repository, TortoiseHG clone attempt consumes all resources

Scott Newman - NOAA Affiliate scott.newman at noaa.gov
Tue Jun 23 19:51:51 UTC 2020


> >>>>> Good morning everyone!
> >>>>>
> >>>>> We are currently using Mercurial 5.2.2 hosted on Solaris 11.3 and accessed
> >>>>> by contributors via TortoiseHG 5.0.2 from their Windows Desktops.  We are
> >>>>> in the process of migrating applications to new hosts running Solaris
> >>>>> 11.4.
> >>>>
> >>>> As far as I understand, you use the same versions (Mercurial 5.2.2 on
> >>>> the server, TortoiseHG 5.0.2 on the client) and the same Python
> >>>> (probably 2.7 something?). The only software version difference is
> >>>> Solaris 11.3 vs Solaris 11.4, right?
> >>>
> >>> Pierre-Yves, so nice to hear from you!  Correct. Python 2.7.18 (tried
> >>> some others with the same result).  I have an update that when we
> >>> tried going back to THG 3.4 the clone worked as expected, but that
> >>> doesn't seem like a good long-term solution, particularly since we
> >>> will lose the ability to export-archive that  was introduced somewhere
> >>> around version 4.5, if you recall.
> >>
> >> That is very interesting. We are talking about using THG 3.4 on the
> >> client, right? While still using Mercurial 5.2.2 on the server?
> >
> > Correct.  It is so interesting that the client can have such an impact
> > on the server!
> >
> >>
> >> If so, this means that a new protocol feature introduced between 3.4
> >> and 5.2 reveals the issue.
> >>
> >> Can you confirm this? And if so, can you try to find the exact
> >> client-side Mercurial version that triggers this issue?
> >
> > I am scheduled to work on this with another resource tomorrow at 15:00
> > EST and will update this thread.  We have confirmed that the problem
> > exists in THG 4.5.0, so it was introduced somewhere between 3.4 and 4.5.0.
> >
> >>
> >> However, the export-archive thingy is something you run server side,
> >> don't you?
> >>
> >
> > We now perform this task on the client side with the archive function,
> > and have abandoned the customization in favor of the built-in archive
> > functionality.
> >
> >>
> >>>
> >>>>
> >>>>>    When trying to clone a copy of the repository hosted on Solaris
> >>>>> 11.4 the clone runs very slowly and the process consumes most of the
> >>>>> memory (64GB) on the host, starts generating "-bash: fork: Resource
> >>>>> temporarily unavailable" errors for users on the box after about 2
> >>>>> minutes, and the clone process fails with a "Server Unexpectedly closed
> >>>>> connection" message.
> >>>>
> >>>> So, the server hosting the repository is crumbling while cloning, right?
> >>>> How are you cloning? ssh or http?
> >>>
> >>> Cloning via ssh.
> >>
> >> Great, can you add:
> >>
> >>     [ui]
> >>     debug=yes
> >>
> >> to the HGRC of the remote repository and run a clone? This will give you
> >> tons of remote output that might help us understand what is going on
> >> when the memory explodes.
> >
> > Here is the result on the client BEFORE adding debug:
> > % hg clone --verbose ssh://<username>@<hostname>//<dirname>/<reponame> "C:\Repos\test"
> > requesting all changes
> > adding changesets
> > adding manifests
> > adding file changes  ### Processes 123/5396 files, takes 10-15 minutes, fails here
> > transaction abort!
> > rollback completed
> > abort: stream ended unexpectedly  (got 20593 bytes, expected 32768)
> > [command returned code 255 Mon Jun 22 15:31:39 2020]
> >
> > When I add the debug entry it stalls at:
> > % hg clone --verbose ssh://<username>@<hostname>//<dirname>/<reponame> "C:\Repos\test"
> > requesting all changes  ### stalls here
> >
> >>
> >>>>>    The same process on Solaris 11.3 has a negligible
> >>>>> impact on resources and finishes in about 10 minutes.
> >>>>>
> >>>>> I have spent several days with the Network and Systems Administrators
> >>>>> trying to resolve this issue without success.  We tried many things,
> >>>>> including adjusting resource configurations, rebuilding Mercurial and
> >>>>> Python, using Mercurial and Python from the working server, using the
> >>>>> pre-built package from Oracle (v4.9.1),
> >>>>
> >>>> How did you transfer the repository between the two servers?
> >>>
> >>> I used hg clone (via ssh) between the servers without issue.
> >>
> >> This clone might have upgraded the repository to a newer format and
> >> hit an unknown issue affecting your repository. What does `hg
> >> debugformat` say on the older server?
> >
> > On older server:
> > format-variant    repo
> > fncache:           yes
> > dotencode:         yes
> > generaldelta:      yes
> > sparserevlog:       no
> > sidedata:           no
> > copies-sdc:         no
> > plain-cl-delta:     no
> > compression:       zlib
> > compression-level: default
>
> Okay, so the most notable difference is `sparserevlog`. You might be
> hitting some unknown pathological case. Can you try making a new server
> clone using `--config format.sparse-revlog=no` during the clone?
>

I created a new server clone using:
    hg clone --config format.sparse-revlog=no --noupdate ssh://<username>@<hostname>/<SRCreponame> <TARGETreponame>
When I tried to clone with THG 5.0.2 via the UI I saw the same behavior.
When I performed the clone via the console using:
    hg clone --config format.sparse-revlog=no --verbose ssh://<username>@<hostname>/<SRCreponame> "<TARGETreponame>"
I saw the same behavior.
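
To check whether the format option actually took effect on the new
server clone, the `hg debugformat` output of both repositories could be
compared with a small script. The Python below is only a sketch: the
function names are made up for illustration, the old-server listing is
the one quoted above, and the new-server values are a HYPOTHETICAL
placeholder, since that output does not appear in this thread.

```python
def parse_debugformat(text):
    """Parse plain `hg debugformat` output into {variant: value}."""
    variants = {}
    for line in text.splitlines():
        # The header row ("format-variant    repo") has no colon; skip it.
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        variants[name.strip()] = value.strip()
    return variants


def diff_formats(old, new):
    """Return {variant: (old_value, new_value)} for variants that differ."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


# Output reported for the older (Solaris 11.3) server earlier in this thread.
OLD_SERVER_OUTPUT = """\
format-variant    repo
fncache:           yes
dotencode:         yes
generaldelta:      yes
sparserevlog:       no
sidedata:           no
copies-sdc:         no
plain-cl-delta:     no
compression:       zlib
compression-level: default
"""

old_format = parse_debugformat(OLD_SERVER_OUTPUT)

# HYPOTHETICAL placeholder for the new server: replace this with the parsed
# result of running `hg debugformat` in the repository on the 11.4 host.
new_format = dict(old_format, sparserevlog="yes")

if __name__ == "__main__":
    for name, (old, new) in diff_formats(old_format, new_format).items():
        print(f"{name}: old={old} new={new}")
```

If the new clone was really created with sparse-revlog disabled, the
diff for the real outputs should not list `sparserevlog`.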

Scott


