Network performance problems when pulling and cloning from HTTP server

Matt Mackall mpm at selenic.com
Sun Nov 25 17:40:56 UTC 2012


On Fri, 2012-11-23 at 07:47 +0100, Angel Ezquerra wrote:
> On Thu, Nov 22, 2012 at 11:05 PM, Augie Fackler <raf at durin42.com> wrote:
> >
> > On Nov 22, 2012, at 7:03 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
> >
> >> On Wed, Nov 21, 2012 at 10:24 PM, Matt Mackall <mpm at selenic.com> wrote:
> >>> On Wed, 2012-11-21 at 08:02 +0100, Angel Ezquerra wrote:
> >>>> On Nov 20, 2012 6:55 PM, "Bryan O'Sullivan" <bos at serpentine.com> wrote:
> >>>>>
> >>>>> On Tue, Nov 20, 2012 at 8:30 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
> >>
> >> Matt, thanks for pitching in.
> >>
> >> I understand now the need to get the hashes, usernames, dates and
> >> descriptions. As for also getting the diffs, that would only happen
> >> when using the --bundle option, right? That is, calling a plain "hg
> >> incoming" (without --bundle) should be pretty fast regardless of the
> >> contents of the changesets themselves. Is that correct?
> >
> > Sadly no, incoming pulls a complete bundle, then throws it on the floor when done. In fact, 'hg help incoming' tells you that '--bundle FILE' lets you save the requested bundle to a file so you can apply it later.
> >
> >>
> >>>> In any case, may I ask why you are not surprised by this?
> >>>
> >>> You've hit the trifecta of ways to have suboptimal performance:
> >>>
> >>> a) Windows
> >>> b) large files
> >>> c) using a protocol designed for broadband and slower on a LAN
> >>
> >> By protocol you mean the underlying http or mercurial's own protocol?
> >
> > hg's own protocol, which can use http (or ssh) as a transport.
> >
> >
> >>>>> Do you have the ability to serve the repo that shows poor performance using "hg serve"? If so, it would be helpful to "hg serve --profile", do a pull from a client, then stop the server and share the profile dump.
> >>>> C:\mercurial_tests\tmp_bigfiles_not_large>hg serve --port 7000
> >>>> --profile --verbose
> >>>
> >>> ..no profile. A profile looks like this:
> >>>
> >>> $ hg serve --profile
> >>> listening at http://calx:8000/ (bound to *:8000)
> >>> [do something]
> >>> [hit ctrl-c]
> >>>   CallCount    Recursive     Total(s)    Inline(s) module:lineno(function)
> >>>           4            0      1.5059      1.5058   <select.select>
> >>>          57           46      0.0052      0.0036   <__import__>
> >>>           1            0      0.0025      0.0013   mimetypes:205(readfp)
> >>>         684            0      0.0006      0.0005   mercurial.config:20(__setitem__)
> >>>         665            0      0.0007      0.0005   mimetypes:78(add_type)
> >>>           2            0      0.0004      0.0004   <_socket.gethostbyaddr>
> >>>          80            0      0.0008      0.0003   mercurial.config:27(update)
> >>>         971            0      0.0003      0.0003   <method 'split' of 'str' objects>
> >>>
> >>
> >> Sorry, I did not know that.
> >>
> >> Unfortunately, I cannot get --profile to work. Whenever I use Ctrl+C
> >> to stop the server it just exits without printing any profile
> >> information.
> >>
> >> I am running this test server on our production mercurial (windows
> >> 2003) server, which is already using the default port. If I do:
> >>
> >> hg server --profile
> >>
> >> without specifying a port, the server exists immediately, saying
> >> "abort: cannot start server at ':8000'", and in that case it _does_
> >> print some profile information. However, it does not print anything if
> >> I set a port to something other than 8000 and then I use Ctrl+C.
> >>
> >> I thought that perhaps --profile did not work when the --port is set.
> >> To verify this I stopped our production server tonight and tried again
> >> to run "hg server --profile" (without setting the port). The result is
> >> the same (i.e. no profile information is printed when I do Ctrl+C). I
> >> tried killing the hg process using the windows task manager but I get
> >> the same result (no profile printed).
> >>
> >> What else could I do?
> >
> > I just verified that ^C gives me profiler output on a Mac. I don't know enough about Windows to guess why this isn't working though.
> >
> >
> > The fact that archives are fast and pulls are slow suggests (to me) that computing deltas is being expensive for your use case, but that's only intuition talking. Figuring out a way to get profiler output would be the best outcome.
> >
> > One thing that *might* give us an idea where you'd be burning some time would be 'hg bundle --all --profile foo.bundle'. That's not exactly the same as the web server, but it might show something surprising (and therefore be helpful without me having to guess about Windows things.)
> >
> 
> Thank you Augie,
> 
> I did ran "hg bundle --all --profile foo.bundle" three times. The
> second and third times I also added --time:
> 
> C:\mercurial_tests\tmp_bigfiles_not_large>hg bundle --all --profile foo.bundle
> 2 changesets found
>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
>           26            0     85.1904     79.0038
> <mercurial\util.pyo>:863(read)
>         +571            0      6.1801      0.3268
> +<mercurial\util.pyo>:850(splitbig)
>         +573            0      0.0032      0.0032   +<method 'pop' of
> 'list' objects>
>         +570            0      0.0028      0.0028   +<method 'append'
> of 'list' objects>
>        +1143            0      0.0004      0.0004   +<len>
>           +3            0      0.0000      0.0000   +<method 'insert'
> of 'list' objects>
>          161            0     37.0899     37.0899   <method 'compress'
> of 'bz2.BZ2Compressor' objects>
>            7            0      2.7092      2.7092   <method 'read' of
> 'file' objects>
>            8            0      2.5599      2.5598
> <mercurial\revlog.pyo>:107(decompress)
>           +3            0      0.0002      0.0002   +<zlib.decompress>
>          164            0      1.3976      1.3976   <method 'write' of
> 'file' objects>
>            1            0    124.4743      0.6509
> <mercurial\changegroup.pyo>:60(writebundle)
>         +161            0     37.0899     37.0899   +<method
> 'compress' of 'bz2.BZ2Compressor' objects>
>         +163            0      1.3973      1.3973   +<method 'write'
> of 'file' objects>
>           +1            0      0.0978      0.0978   +<method 'flush'
> of 'bz2.BZ2Compressor' objects>
>           +1            0      0.0029      0.0029   +<method 'close'
> of 'file' objects>
>           +1            0      0.0004      0.0004   +<open>
>           12            0      0.5223      0.5223   <method 'update'
> of '_hashlib.HASH' objects>
>          571            0      6.1801      0.3268
> <mercurial\util.pyo>:850(splitbig)
>          +31            0      5.8525      0.0227
> +<mercurial\localrepo.pyo>:1989(gengroup)
>         +574            0      0.0008      0.0008   +<len>
>            1            0      0.0978      0.0978   <method 'flush' of
> 'bz2.BZ2Compressor' objects>
>            5            0      0.0475      0.0475   <imp.load_dynamic>
>           62           44      0.0859      0.0359   <__import__>
>           +1            0      0.0005      0.0002
> +<mercurial\changegroup.pyo>:8(<module>)
>           +1            0      0.0005      0.0002
> +<mercurial\context.pyo>:8(<module>)
>           +1            0      0.0029      0.0002   +<hashlib.pyo>:55(<module>)
>           +1            0      0.0038      0.0001
> +<mercurial\revlog.pyo>:12(<module>)
>           +1            0      0.0001      0.0001
> +<encodings\utf_8.pyo>:8(<module>)
>            7            0      0.0306      0.0306   <mercurial.osutil.posixfile>
> 
> 
> C:\mercurial_tests\tmp_bigfiles_not_large>hg --time bundle --all
> --profile bar.bundle
> 2 changesets found
>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
>           26            0     82.8639     80.5234
> <mercurial\util.pyo>:863(read)
>         +571            0      2.3341      0.3113
> +<mercurial\util.pyo>:850(splitbig)
>         +573            0      0.0034      0.0034   +<method 'pop' of
> 'list' objects>
>         +570            0      0.0026      0.0026   +<method 'append'
> of 'list' objects>
>        +1143            0      0.0004      0.0004   +<len>
>           +3            0      0.0000      0.0000   +<method 'insert'
> of 'list' objects>
>          161            0     37.0936     37.0936   <method 'compress'
> of 'bz2.BZ2Compressor' objects>
>          164            0      1.2516      1.2516   <method 'write' of
> 'file' objects>
>            7            0      0.9041      0.9041   <method 'read' of
> 'file' objects>
>            1            0    121.9192      0.6125
> <mercurial\changegroup.pyo>:60(writebundle)
>         +161            0     37.0936     37.0936   +<method
> 'compress' of 'bz2.BZ2Compressor' objects>
>         +163            0      1.2501      1.2501   +<method 'write'
> of 'file' objects>
>           +1            0      0.0923      0.0923   +<method 'flush'
> of 'bz2.BZ2Compressor' objects>
>           +1            0      0.0031      0.0031   +<method 'close'
> of 'file' objects>
>           +1            0      0.0007      0.0007   +<open>
>            8            0      0.5515      0.5513
> <mercurial\revlog.pyo>:107(decompress)
>           +3            0      0.0002      0.0002   +<zlib.decompress>
>           12            0      0.5342      0.5342   <method 'update'
> of '_hashlib.HASH' objects>
>          571            0      2.3341      0.3113
> <mercurial\util.pyo>:850(splitbig)
>          +31            0      2.0220      0.0190
> +<mercurial\localrepo.pyo>:1989(gengroup)
>         +574            0      0.0008      0.0008   +<len>
>            1            0      0.0923      0.0923   <method 'flush' of
> 'bz2.BZ2Compressor' objects>
>           31            0      2.0220      0.0190
> <mercurial\localrepo.pyo>:1989(gengroup)
>          +31            0      1.9997      0.0002
> +<mercurial\revlog.pyo>:1114(group)
>           +3            0      0.0030      0.0001
> +<mercurial\localrepo.pyo>:624(file)
>           +4            0      0.0001      0.0000
> +<mercurial\localrepo.pyo>:1953(gennodelst)
>           +6            0      0.0000      0.0000   +<len>
>           +1            0      0.0001      0.0000
> +<mercurial\localrepo.pyo>:270(hook)
>           62           44      0.0180      0.0077   <__import__>
>           +1            0      0.0004      0.0003
> +<mercurial\changegroup.pyo>:8(<module>)
> Time: real 121.989 secs (user 59.984+0.000 sys 61.062+0.000)
> 
> C:\mercurial_tests\tmp_bigfiles_not_large>hg --time bundle --all
> --profile bar.bundle2
> 2 changesets found
>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
>           26            0     82.6152     80.2107
> <mercurial\util.pyo>:863(read)
>         +571            0      2.3983      0.3247
> +<mercurial\util.pyo>:850(splitbig)
>         +573            0      0.0033      0.0033   +<method 'pop' of
> 'list' objects>
>         +570            0      0.0025      0.0025   +<method 'append'
> of 'list' objects>
>        +1143            0      0.0004      0.0004   +<len>
>           +3            0      0.0000      0.0000   +<method 'insert'
> of 'list' objects>
>          161            0     37.0642     37.0642   <method 'compress'
> of 'bz2.BZ2Compressor' objects>
>          164            0      1.2536      1.2536   <method 'write' of
> 'file' objects>
>            7            0      0.8486      0.8486   <method 'read' of
> 'file' objects>
>            8            0      0.6593      0.6591
> <mercurial\revlog.pyo>:107(decompress)
>           +3            0      0.0002      0.0002   +<zlib.decompress>
>            1            0    121.6504      0.6175
> <mercurial\changegroup.pyo>:60(writebundle)
>         +161            0     37.0642     37.0642   +<method
> 'compress' of 'bz2.BZ2Compressor' objects>
>         +163            0      1.2519      1.2519   +<method 'write'
> of 'file' objects>
>           +1            0      0.0952      0.0952   +<method 'flush'
> of 'bz2.BZ2Compressor' objects>
>           +1            0      0.0030      0.0030   +<method 'close'
> of 'file' objects>
>           +1            0      0.0008      0.0008   +<open>
>           12            0      0.5229      0.5229   <method 'update'
> of '_hashlib.HASH' objects>
>          571            0      2.3983      0.3247
> <mercurial\util.pyo>:850(splitbig)
>          +31            0      2.0729      0.0280
> +<mercurial\localrepo.pyo>:1989(gengroup)
>         +574            0      0.0008      0.0008   +<len>
>            1            0      0.0952      0.0952   <method 'flush' of
> 'bz2.BZ2Compressor' objects>
>           31            0      2.0729      0.0280
> <mercurial\localrepo.pyo>:1989(gengroup)
>          +31            0      2.0412      0.0002
> +<mercurial\revlog.pyo>:1114(group)
>           +6            0      0.0001      0.0001   +<len>
>           +4            0      0.0002      0.0001
> +<mercurial\localrepo.pyo>:1953(gennodelst)
>           +3            0      0.0031      0.0001
> +<mercurial\localrepo.pyo>:624(file)
>           +1            0      0.0002      0.0000
> +<mercurial\localrepo.pyo>:270(hook)
>           62           44      0.0172      0.0081   <__import__>
>           +1            0      0.0004      0.0002
> +<mercurial\changegroup.pyo>:8(<module>)
> Time: real 121.711 secs (user 60.844+0.000 sys 59.844+0.000)
> 
> So it takes 121 seconds just to do the bundle.

See this recent thread:

http://mercurial.markmail.org/thread/dyafoufq3nczfu5m

> The "hg process" is shown on the windows task manager as using from 60
> to 350 MB of memory (it varies), while the CPU usage is mostly 25% and
> quite constant, although I saw it reach 28% briefly. This server is
> virtualized. I don't know how this can affect things but I'm sure this
> makes the "CPU usage" that windows shows less than accurate :-/ That
> being said, the server has "4 cores" so the 25% part could indicate
> that a whole CPU is being used. However the CPU usage chart shows that
> all CPUs get busier during the bundle, some of them reaching brief
> peaks of 100% but being around 50-60% most of the time, which is odd.

The stats above suggest full utilization of one core. As Mercurial is
single-threaded, that's what we'd expect.

-- 
Mathematics is the supreme nostalgia of our time.





More information about the Mercurial mailing list