Network performance problems when pulling and cloning from HTTP server
Matt Mackall
mpm at selenic.com
Sun Nov 25 17:40:56 UTC 2012
On Fri, 2012-11-23 at 07:47 +0100, Angel Ezquerra wrote:
> On Thu, Nov 22, 2012 at 11:05 PM, Augie Fackler <raf at durin42.com> wrote:
> >
> > On Nov 22, 2012, at 7:03 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
> >
> >> On Wed, Nov 21, 2012 at 10:24 PM, Matt Mackall <mpm at selenic.com> wrote:
> >>> On Wed, 2012-11-21 at 08:02 +0100, Angel Ezquerra wrote:
> >>>> On Nov 20, 2012 6:55 PM, "Bryan O'Sullivan" <bos at serpentine.com> wrote:
> >>>>>
> >>>>> On Tue, Nov 20, 2012 at 8:30 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
> >>
> >> Matt, thanks for pitching in.
> >>
> >> I understand now the need to get the hashes, usernames, dates and
> >> descriptions. As for also getting the diffs, that would only happen
> >> when using the --bundle option, right? That is, calling a plain "hg
> >> incoming" (without --bundle) should be pretty fast regardless of the
> >> contents of the changesets themselves. Is that correct?
> >
> > Sadly no, incoming pulls a complete bundle, then throws it on the floor when done. In fact, 'hg help incoming' tells you that '--bundle FILE' lets you save the requested bundle to a file so you can apply it later.
> >
> >>
> >>>> In any case, may I ask why you are not surprised by this?
> >>>
> >>> You've hit the trifecta of ways to have suboptimal performance:
> >>>
> >>> a) Windows
> >>> b) large files
> >>> c) using a protocol designed for broadband and slower on a LAN
> >>
> >> By protocol you mean the underlying http or mercurial's own protocol?
> >
> > hg's own protocol, which can use http (or ssh) as a transport.
> >
> >
> >>>>> Do you have the ability to serve the repo that shows poor performance using "hg serve"? If so, it would be helpful to "hg serve --profile", do a pull from a client, then stop the server and share the profile dump.
> >>>> C:\mercurial_tests\tmp_bigfiles_not_large>hg serve --port 7000
> >>>> --profile --verbose
> >>>
> >>> ..no profile. A profile looks like this:
> >>>
> >>> $ hg serve --profile
> >>> listening at http://calx:8000/ (bound to *:8000)
> >>> [do something]
> >>> [hit ctrl-c]
> >>> CallCount Recursive Total(s) Inline(s) module:lineno(function)
> >>> 4 0 1.5059 1.5058 <select.select>
> >>> 57 46 0.0052 0.0036 <__import__>
> >>> 1 0 0.0025 0.0013 mimetypes:205(readfp)
> >>> 684 0 0.0006 0.0005 mercurial.config:20(__setitem__)
> >>> 665 0 0.0007 0.0005 mimetypes:78(add_type)
> >>> 2 0 0.0004 0.0004 <_socket.gethostbyaddr>
> >>> 80 0 0.0008 0.0003 mercurial.config:27(update)
> >>> 971 0 0.0003 0.0003 <method 'split' of 'str' objects>
> >>>
> >>
> >> Sorry, I did not know that.
> >>
> >> Unfortunately, I cannot get --profile to work. Whenever I use Ctrl+C
> >> to stop the server it just exits without printing any profile
> >> information.
> >>
> >> I am running this test server on our production mercurial (windows
> >> 2003) server, which is already using the default port. If I do:
> >>
> >> hg server --profile
> >>
> >> without specifying a port, the server exists immediately, saying
> >> "abort: cannot start server at ':8000'", and in that case it _does_
> >> print some profile information. However, it does not print anything if
> >> I set a port to something other than 8000 and then I use Ctrl+C.
> >>
> >> I thought that perhaps --profile did not work when the --port is set.
> >> To verify this I stopped our production server tonight and tried again
> >> to run "hg server --profile" (without setting the port). The result is
> >> the same (i.e. no profile information is printed when I do Ctrl+C). I
> >> tried killing the hg process using the windows task manager but I get
> >> the same result (no profile printed).
> >>
> >> What else could I do?
> >
> > I just verified that ^C gives me profiler output on a Mac. I don't know enough about Windows to guess why this isn't working though.
> >
> >
> > The fact that archives are fast and pulls are slow suggests (to me) that computing deltas is being expensive for your use case, but that's only intuition talking. Figuring out a way to get profiler output would be the best outcome.
> >
> > One thing that *might* give us an idea where you'd be burning some time would be 'hg bundle --all --profile foo.bundle'. That's not exactly the same as the web server, but it might show something surprising (and therefore be helpful without me having to guess about Windows things.)
> >
>
> Thank you Augie,
>
> I did ran "hg bundle --all --profile foo.bundle" three times. The
> second and third times I also added --time:
>
> C:\mercurial_tests\tmp_bigfiles_not_large>hg bundle --all --profile foo.bundle
> 2 changesets found
> CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
> 26 0 85.1904 79.0038
> <mercurial\util.pyo>:863(read)
> +571 0 6.1801 0.3268
> +<mercurial\util.pyo>:850(splitbig)
> +573 0 0.0032 0.0032 +<method 'pop' of
> 'list' objects>
> +570 0 0.0028 0.0028 +<method 'append'
> of 'list' objects>
> +1143 0 0.0004 0.0004 +<len>
> +3 0 0.0000 0.0000 +<method 'insert'
> of 'list' objects>
> 161 0 37.0899 37.0899 <method 'compress'
> of 'bz2.BZ2Compressor' objects>
> 7 0 2.7092 2.7092 <method 'read' of
> 'file' objects>
> 8 0 2.5599 2.5598
> <mercurial\revlog.pyo>:107(decompress)
> +3 0 0.0002 0.0002 +<zlib.decompress>
> 164 0 1.3976 1.3976 <method 'write' of
> 'file' objects>
> 1 0 124.4743 0.6509
> <mercurial\changegroup.pyo>:60(writebundle)
> +161 0 37.0899 37.0899 +<method
> 'compress' of 'bz2.BZ2Compressor' objects>
> +163 0 1.3973 1.3973 +<method 'write'
> of 'file' objects>
> +1 0 0.0978 0.0978 +<method 'flush'
> of 'bz2.BZ2Compressor' objects>
> +1 0 0.0029 0.0029 +<method 'close'
> of 'file' objects>
> +1 0 0.0004 0.0004 +<open>
> 12 0 0.5223 0.5223 <method 'update'
> of '_hashlib.HASH' objects>
> 571 0 6.1801 0.3268
> <mercurial\util.pyo>:850(splitbig)
> +31 0 5.8525 0.0227
> +<mercurial\localrepo.pyo>:1989(gengroup)
> +574 0 0.0008 0.0008 +<len>
> 1 0 0.0978 0.0978 <method 'flush' of
> 'bz2.BZ2Compressor' objects>
> 5 0 0.0475 0.0475 <imp.load_dynamic>
> 62 44 0.0859 0.0359 <__import__>
> +1 0 0.0005 0.0002
> +<mercurial\changegroup.pyo>:8(<module>)
> +1 0 0.0005 0.0002
> +<mercurial\context.pyo>:8(<module>)
> +1 0 0.0029 0.0002 +<hashlib.pyo>:55(<module>)
> +1 0 0.0038 0.0001
> +<mercurial\revlog.pyo>:12(<module>)
> +1 0 0.0001 0.0001
> +<encodings\utf_8.pyo>:8(<module>)
> 7 0 0.0306 0.0306 <mercurial.osutil.posixfile>
>
>
> C:\mercurial_tests\tmp_bigfiles_not_large>hg --time bundle --all
> --profile bar.bundle
> 2 changesets found
> CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
> 26 0 82.8639 80.5234
> <mercurial\util.pyo>:863(read)
> +571 0 2.3341 0.3113
> +<mercurial\util.pyo>:850(splitbig)
> +573 0 0.0034 0.0034 +<method 'pop' of
> 'list' objects>
> +570 0 0.0026 0.0026 +<method 'append'
> of 'list' objects>
> +1143 0 0.0004 0.0004 +<len>
> +3 0 0.0000 0.0000 +<method 'insert'
> of 'list' objects>
> 161 0 37.0936 37.0936 <method 'compress'
> of 'bz2.BZ2Compressor' objects>
> 164 0 1.2516 1.2516 <method 'write' of
> 'file' objects>
> 7 0 0.9041 0.9041 <method 'read' of
> 'file' objects>
> 1 0 121.9192 0.6125
> <mercurial\changegroup.pyo>:60(writebundle)
> +161 0 37.0936 37.0936 +<method
> 'compress' of 'bz2.BZ2Compressor' objects>
> +163 0 1.2501 1.2501 +<method 'write'
> of 'file' objects>
> +1 0 0.0923 0.0923 +<method 'flush'
> of 'bz2.BZ2Compressor' objects>
> +1 0 0.0031 0.0031 +<method 'close'
> of 'file' objects>
> +1 0 0.0007 0.0007 +<open>
> 8 0 0.5515 0.5513
> <mercurial\revlog.pyo>:107(decompress)
> +3 0 0.0002 0.0002 +<zlib.decompress>
> 12 0 0.5342 0.5342 <method 'update'
> of '_hashlib.HASH' objects>
> 571 0 2.3341 0.3113
> <mercurial\util.pyo>:850(splitbig)
> +31 0 2.0220 0.0190
> +<mercurial\localrepo.pyo>:1989(gengroup)
> +574 0 0.0008 0.0008 +<len>
> 1 0 0.0923 0.0923 <method 'flush' of
> 'bz2.BZ2Compressor' objects>
> 31 0 2.0220 0.0190
> <mercurial\localrepo.pyo>:1989(gengroup)
> +31 0 1.9997 0.0002
> +<mercurial\revlog.pyo>:1114(group)
> +3 0 0.0030 0.0001
> +<mercurial\localrepo.pyo>:624(file)
> +4 0 0.0001 0.0000
> +<mercurial\localrepo.pyo>:1953(gennodelst)
> +6 0 0.0000 0.0000 +<len>
> +1 0 0.0001 0.0000
> +<mercurial\localrepo.pyo>:270(hook)
> 62 44 0.0180 0.0077 <__import__>
> +1 0 0.0004 0.0003
> +<mercurial\changegroup.pyo>:8(<module>)
> Time: real 121.989 secs (user 59.984+0.000 sys 61.062+0.000)
>
> C:\mercurial_tests\tmp_bigfiles_not_large>hg --time bundle --all
> --profile bar.bundle2
> 2 changesets found
> CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
> 26 0 82.6152 80.2107
> <mercurial\util.pyo>:863(read)
> +571 0 2.3983 0.3247
> +<mercurial\util.pyo>:850(splitbig)
> +573 0 0.0033 0.0033 +<method 'pop' of
> 'list' objects>
> +570 0 0.0025 0.0025 +<method 'append'
> of 'list' objects>
> +1143 0 0.0004 0.0004 +<len>
> +3 0 0.0000 0.0000 +<method 'insert'
> of 'list' objects>
> 161 0 37.0642 37.0642 <method 'compress'
> of 'bz2.BZ2Compressor' objects>
> 164 0 1.2536 1.2536 <method 'write' of
> 'file' objects>
> 7 0 0.8486 0.8486 <method 'read' of
> 'file' objects>
> 8 0 0.6593 0.6591
> <mercurial\revlog.pyo>:107(decompress)
> +3 0 0.0002 0.0002 +<zlib.decompress>
> 1 0 121.6504 0.6175
> <mercurial\changegroup.pyo>:60(writebundle)
> +161 0 37.0642 37.0642 +<method
> 'compress' of 'bz2.BZ2Compressor' objects>
> +163 0 1.2519 1.2519 +<method 'write'
> of 'file' objects>
> +1 0 0.0952 0.0952 +<method 'flush'
> of 'bz2.BZ2Compressor' objects>
> +1 0 0.0030 0.0030 +<method 'close'
> of 'file' objects>
> +1 0 0.0008 0.0008 +<open>
> 12 0 0.5229 0.5229 <method 'update'
> of '_hashlib.HASH' objects>
> 571 0 2.3983 0.3247
> <mercurial\util.pyo>:850(splitbig)
> +31 0 2.0729 0.0280
> +<mercurial\localrepo.pyo>:1989(gengroup)
> +574 0 0.0008 0.0008 +<len>
> 1 0 0.0952 0.0952 <method 'flush' of
> 'bz2.BZ2Compressor' objects>
> 31 0 2.0729 0.0280
> <mercurial\localrepo.pyo>:1989(gengroup)
> +31 0 2.0412 0.0002
> +<mercurial\revlog.pyo>:1114(group)
> +6 0 0.0001 0.0001 +<len>
> +4 0 0.0002 0.0001
> +<mercurial\localrepo.pyo>:1953(gennodelst)
> +3 0 0.0031 0.0001
> +<mercurial\localrepo.pyo>:624(file)
> +1 0 0.0002 0.0000
> +<mercurial\localrepo.pyo>:270(hook)
> 62 44 0.0172 0.0081 <__import__>
> +1 0 0.0004 0.0002
> +<mercurial\changegroup.pyo>:8(<module>)
> Time: real 121.711 secs (user 60.844+0.000 sys 59.844+0.000)
>
> So it takes 121 seconds just to do the bundle.
See this recent thread:
http://mercurial.markmail.org/thread/dyafoufq3nczfu5m
> The "hg process" is shown on the windows task manager as using from 60
> to 350 MB of memory (it varies), while the CPU usage is mostly 25% and
> quite constant, although I saw it reach 28% briefly. This server is
> virtualized. I don't know how this can affect things but I'm sure this
> makes the "CPU usage" that windows shows less than accurate :-/ That
> being said, the server has "4 cores" so the 25% part could indicate
> that a whole CPU is being used. However the CPU usage chart shows that
> all CPUs get busier during the bundle, some of them reaching brief
> peaks of 100% but being around 50-60% most of the time, which is odd.
The stats above suggest full utilization of one core. As Mercurial is
single-threaded, that's what we'd expect.
--
Mathematics is the supreme nostalgia of our time.
More information about the Mercurial
mailing list