Network performance problems when pulling and cloning from HTTP server
Angel Ezquerra
ezquerra at gmail.com
Fri Nov 23 06:47:26 UTC 2012
On Thu, Nov 22, 2012 at 11:05 PM, Augie Fackler <raf at durin42.com> wrote:
>
> On Nov 22, 2012, at 7:03 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
>
>> On Wed, Nov 21, 2012 at 10:24 PM, Matt Mackall <mpm at selenic.com> wrote:
>>> On Wed, 2012-11-21 at 08:02 +0100, Angel Ezquerra wrote:
>>>> On Nov 20, 2012 6:55 PM, "Bryan O'Sullivan" <bos at serpentine.com> wrote:
>>>>>
>>>>> On Tue, Nov 20, 2012 at 8:30 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
>>
>> Matt, thanks for pitching in.
>>
>> I understand now the need to get the hashes, usernames, dates and
>> descriptions. As for also getting the diffs, that would only happen
>> when using the --bundle option, right? That is, calling a plain "hg
>> incoming" (without --bundle) should be pretty fast regardless of the
>> contents of the changesets themselves. Is that correct?
>
> Sadly no, incoming pulls a complete bundle, then throws it on the floor when done. In fact, 'hg help incoming' tells you that '--bundle FILE' lets you save the requested bundle to a file so you can apply it later.
>
>>
>>>> In any case, may I ask why you are not surprised by this?
>>>
>>> You've hit the trifecta of ways to have suboptimal performance:
>>>
>>> a) Windows
>>> b) large files
>>> c) using a protocol designed for broadband and slower on a LAN
>>
>> By protocol you mean the underlying http or mercurial's own protocol?
>
> hg's own protocol, which can use http (or ssh) as a transport.
>
>
>>>>> Do you have the ability to serve the repo that shows poor performance using "hg serve"? If so, it would be helpful to "hg serve --profile", do a pull from a client, then stop the server and share the profile dump.
>>>> C:\mercurial_tests\tmp_bigfiles_not_large>hg serve --port 7000
>>>> --profile --verbose
>>>
>>> ..no profile. A profile looks like this:
>>>
>>> $ hg serve --profile
>>> listening at http://calx:8000/ (bound to *:8000)
>>> [do something]
>>> [hit ctrl-c]
>>> CallCount Recursive Total(s) Inline(s) module:lineno(function)
>>> 4 0 1.5059 1.5058 <select.select>
>>> 57 46 0.0052 0.0036 <__import__>
>>> 1 0 0.0025 0.0013 mimetypes:205(readfp)
>>> 684 0 0.0006 0.0005 mercurial.config:20(__setitem__)
>>> 665 0 0.0007 0.0005 mimetypes:78(add_type)
>>> 2 0 0.0004 0.0004 <_socket.gethostbyaddr>
>>> 80 0 0.0008 0.0003 mercurial.config:27(update)
>>> 971 0 0.0003 0.0003 <method 'split' of 'str' objects>
>>>
>>
>> Sorry, I did not know that.
>>
>> Unfortunately, I cannot get --profile to work. Whenever I use Ctrl+C
>> to stop the server it just exits without printing any profile
>> information.
>>
>> I am running this test server on our production mercurial (windows
>> 2003) server, which is already using the default port. If I do:
>>
>> hg server --profile
>>
>> without specifying a port, the server exists immediately, saying
>> "abort: cannot start server at ':8000'", and in that case it _does_
>> print some profile information. However, it does not print anything if
>> I set a port to something other than 8000 and then I use Ctrl+C.
>>
>> I thought that perhaps --profile did not work when the --port is set.
>> To verify this I stopped our production server tonight and tried again
>> to run "hg server --profile" (without setting the port). The result is
>> the same (i.e. no profile information is printed when I do Ctrl+C). I
>> tried killing the hg process using the windows task manager but I get
>> the same result (no profile printed).
>>
>> What else could I do?
>
> I just verified that ^C gives me profiler output on a Mac. I don't know enough about Windows to guess why this isn't working though.
>
>
> The fact that archives are fast and pulls are slow suggests (to me) that computing deltas is being expensive for your use case, but that's only intuition talking. Figuring out a way to get profiler output would be the best outcome.
>
> One thing that *might* give us an idea where you'd be burning some time would be 'hg bundle --all --profile foo.bundle'. That's not exactly the same as the web server, but it might show something surprising (and therefore be helpful without me having to guess about Windows things.)
>
Thank you Augie,
I did ran "hg bundle --all --profile foo.bundle" three times. The
second and third times I also added --time:
C:\mercurial_tests\tmp_bigfiles_not_large>hg bundle --all --profile foo.bundle
2 changesets found
CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
26 0 85.1904 79.0038
<mercurial\util.pyo>:863(read)
+571 0 6.1801 0.3268
+<mercurial\util.pyo>:850(splitbig)
+573 0 0.0032 0.0032 +<method 'pop' of
'list' objects>
+570 0 0.0028 0.0028 +<method 'append'
of 'list' objects>
+1143 0 0.0004 0.0004 +<len>
+3 0 0.0000 0.0000 +<method 'insert'
of 'list' objects>
161 0 37.0899 37.0899 <method 'compress'
of 'bz2.BZ2Compressor' objects>
7 0 2.7092 2.7092 <method 'read' of
'file' objects>
8 0 2.5599 2.5598
<mercurial\revlog.pyo>:107(decompress)
+3 0 0.0002 0.0002 +<zlib.decompress>
164 0 1.3976 1.3976 <method 'write' of
'file' objects>
1 0 124.4743 0.6509
<mercurial\changegroup.pyo>:60(writebundle)
+161 0 37.0899 37.0899 +<method
'compress' of 'bz2.BZ2Compressor' objects>
+163 0 1.3973 1.3973 +<method 'write'
of 'file' objects>
+1 0 0.0978 0.0978 +<method 'flush'
of 'bz2.BZ2Compressor' objects>
+1 0 0.0029 0.0029 +<method 'close'
of 'file' objects>
+1 0 0.0004 0.0004 +<open>
12 0 0.5223 0.5223 <method 'update'
of '_hashlib.HASH' objects>
571 0 6.1801 0.3268
<mercurial\util.pyo>:850(splitbig)
+31 0 5.8525 0.0227
+<mercurial\localrepo.pyo>:1989(gengroup)
+574 0 0.0008 0.0008 +<len>
1 0 0.0978 0.0978 <method 'flush' of
'bz2.BZ2Compressor' objects>
5 0 0.0475 0.0475 <imp.load_dynamic>
62 44 0.0859 0.0359 <__import__>
+1 0 0.0005 0.0002
+<mercurial\changegroup.pyo>:8(<module>)
+1 0 0.0005 0.0002
+<mercurial\context.pyo>:8(<module>)
+1 0 0.0029 0.0002 +<hashlib.pyo>:55(<module>)
+1 0 0.0038 0.0001
+<mercurial\revlog.pyo>:12(<module>)
+1 0 0.0001 0.0001
+<encodings\utf_8.pyo>:8(<module>)
7 0 0.0306 0.0306 <mercurial.osutil.posixfile>
C:\mercurial_tests\tmp_bigfiles_not_large>hg --time bundle --all
--profile bar.bundle
2 changesets found
CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
26 0 82.8639 80.5234
<mercurial\util.pyo>:863(read)
+571 0 2.3341 0.3113
+<mercurial\util.pyo>:850(splitbig)
+573 0 0.0034 0.0034 +<method 'pop' of
'list' objects>
+570 0 0.0026 0.0026 +<method 'append'
of 'list' objects>
+1143 0 0.0004 0.0004 +<len>
+3 0 0.0000 0.0000 +<method 'insert'
of 'list' objects>
161 0 37.0936 37.0936 <method 'compress'
of 'bz2.BZ2Compressor' objects>
164 0 1.2516 1.2516 <method 'write' of
'file' objects>
7 0 0.9041 0.9041 <method 'read' of
'file' objects>
1 0 121.9192 0.6125
<mercurial\changegroup.pyo>:60(writebundle)
+161 0 37.0936 37.0936 +<method
'compress' of 'bz2.BZ2Compressor' objects>
+163 0 1.2501 1.2501 +<method 'write'
of 'file' objects>
+1 0 0.0923 0.0923 +<method 'flush'
of 'bz2.BZ2Compressor' objects>
+1 0 0.0031 0.0031 +<method 'close'
of 'file' objects>
+1 0 0.0007 0.0007 +<open>
8 0 0.5515 0.5513
<mercurial\revlog.pyo>:107(decompress)
+3 0 0.0002 0.0002 +<zlib.decompress>
12 0 0.5342 0.5342 <method 'update'
of '_hashlib.HASH' objects>
571 0 2.3341 0.3113
<mercurial\util.pyo>:850(splitbig)
+31 0 2.0220 0.0190
+<mercurial\localrepo.pyo>:1989(gengroup)
+574 0 0.0008 0.0008 +<len>
1 0 0.0923 0.0923 <method 'flush' of
'bz2.BZ2Compressor' objects>
31 0 2.0220 0.0190
<mercurial\localrepo.pyo>:1989(gengroup)
+31 0 1.9997 0.0002
+<mercurial\revlog.pyo>:1114(group)
+3 0 0.0030 0.0001
+<mercurial\localrepo.pyo>:624(file)
+4 0 0.0001 0.0000
+<mercurial\localrepo.pyo>:1953(gennodelst)
+6 0 0.0000 0.0000 +<len>
+1 0 0.0001 0.0000
+<mercurial\localrepo.pyo>:270(hook)
62 44 0.0180 0.0077 <__import__>
+1 0 0.0004 0.0003
+<mercurial\changegroup.pyo>:8(<module>)
Time: real 121.989 secs (user 59.984+0.000 sys 61.062+0.000)
C:\mercurial_tests\tmp_bigfiles_not_large>hg --time bundle --all
--profile bar.bundle2
2 changesets found
CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
26 0 82.6152 80.2107
<mercurial\util.pyo>:863(read)
+571 0 2.3983 0.3247
+<mercurial\util.pyo>:850(splitbig)
+573 0 0.0033 0.0033 +<method 'pop' of
'list' objects>
+570 0 0.0025 0.0025 +<method 'append'
of 'list' objects>
+1143 0 0.0004 0.0004 +<len>
+3 0 0.0000 0.0000 +<method 'insert'
of 'list' objects>
161 0 37.0642 37.0642 <method 'compress'
of 'bz2.BZ2Compressor' objects>
164 0 1.2536 1.2536 <method 'write' of
'file' objects>
7 0 0.8486 0.8486 <method 'read' of
'file' objects>
8 0 0.6593 0.6591
<mercurial\revlog.pyo>:107(decompress)
+3 0 0.0002 0.0002 +<zlib.decompress>
1 0 121.6504 0.6175
<mercurial\changegroup.pyo>:60(writebundle)
+161 0 37.0642 37.0642 +<method
'compress' of 'bz2.BZ2Compressor' objects>
+163 0 1.2519 1.2519 +<method 'write'
of 'file' objects>
+1 0 0.0952 0.0952 +<method 'flush'
of 'bz2.BZ2Compressor' objects>
+1 0 0.0030 0.0030 +<method 'close'
of 'file' objects>
+1 0 0.0008 0.0008 +<open>
12 0 0.5229 0.5229 <method 'update'
of '_hashlib.HASH' objects>
571 0 2.3983 0.3247
<mercurial\util.pyo>:850(splitbig)
+31 0 2.0729 0.0280
+<mercurial\localrepo.pyo>:1989(gengroup)
+574 0 0.0008 0.0008 +<len>
1 0 0.0952 0.0952 <method 'flush' of
'bz2.BZ2Compressor' objects>
31 0 2.0729 0.0280
<mercurial\localrepo.pyo>:1989(gengroup)
+31 0 2.0412 0.0002
+<mercurial\revlog.pyo>:1114(group)
+6 0 0.0001 0.0001 +<len>
+4 0 0.0002 0.0001
+<mercurial\localrepo.pyo>:1953(gennodelst)
+3 0 0.0031 0.0001
+<mercurial\localrepo.pyo>:624(file)
+1 0 0.0002 0.0000
+<mercurial\localrepo.pyo>:270(hook)
62 44 0.0172 0.0081 <__import__>
+1 0 0.0004 0.0002
+<mercurial\changegroup.pyo>:8(<module>)
Time: real 121.711 secs (user 60.844+0.000 sys 59.844+0.000)
So it takes 121 seconds just to do the bundle.
Hopefully these tables will give you guys some indication of whether
the current behavior is normal. I'm not sure how to read the table
though. Are the times (in ms) "per call" or accumulated totals over
all the calls to their corresponding function? What is the ordering of
the items in the list?
The "hg process" is shown on the windows task manager as using from 60
to 350 MB of memory (it varies), while the CPU usage is mostly 25% and
quite constant, although I saw it reach 28% briefly. This server is
virtualized. I don't know how this can affect things but I'm sure this
makes the "CPU usage" that windows shows less than accurate :-/ That
being said, the server has "4 cores" so the 25% part could indicate
that a whole CPU is being used. However the CPU usage chart shows that
all CPUs get busier during the bundle, some of them reaching brief
peaks of 100% but being around 50-60% most of the time, which is odd.
Note however that this is on a live server, although as soon as the
bundle command ends the CPU usage goes down again. I can provide
screenshots of the windows task manager CPU usage. I can also rerun
the test on the week-end were nobody else will be using the server.
If you think of some other test that I can do or any other information
I can provide please let me know.
Cheers,
Angel
More information about the Mercurial
mailing list