Network performance problems when pulling and cloning from HTTP server
Angel Ezquerra
ezquerra at gmail.com
Wed Nov 21 07:02:23 UTC 2012
On Nov 20, 2012 6:55 PM, "Bryan O'Sullivan" <bos at serpentine.com> wrote:
>
> On Tue, Nov 20, 2012 at 8:30 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
>>
>> one of my users has a repository with plenty of "big files" (in the
>> order of 50 to 100 MB). Our server is _not_ using the largefiles
>> extension (at least not yet), that is the files are "big" but are not
>> "largefiles".
>>
>> The total repository working directory size is about 1.5 GB. The big
>> files do not change often, if ever.
Bryan, thanks a lot for your comments.
> I'm not surprised that a normal clone would be slow in this situation. Assuming your large files never change, I *am* surprised that subsequent pulls would be slow.
>
Sorry, that is not what I meant. The revision that adds these big
files is the last one. What I meant is that a pull that gets that last
revision will be slow. I expect future pulls of revisions that do not
add more big files to be fast again.
What was initially surprising is that incoming is also very slow. I
always thought of incoming as a way to exchange hashes, but we use
TortoiseHg and I believe that TortoiseHg actually does a pull into a
bundle when you click on the incoming button. I don't know whether bare
Mercurial does the same.
In any case, may I ask why you are not surprised by this? How can it
be not surprising to have mercurial transmit data using just 2.5% of
the available bandwidth?
I would understand it more if the server had one of its processors
fully used, but they have plenty of spare processing power. I could
also understand it if mercurial were transmitting data in chunks and
each chunk was transmitted fast, with gaps in between them while
mercurial did some processing. However that does not seem to be
happening here. It's just a slow average transfer speed. I wonder what
is keeping mercurial from using the resources (processing and network
bandwidth) available to it? Maybe it is doing IO operations on the
server hard drive?
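One way to tell a steady slow transfer apart from fast bursts separated by pauses is to timestamp each chunk as it arrives on the client. The following is only a sketch: read_with_timings and longest_gap are hypothetical helper names, not part of Mercurial, and the stream can be any file-like object (for example an HTTP response opened with urllib).

```python
import io
import time


def read_with_timings(stream, chunk_size=64 * 1024, clock=time.monotonic):
    """Read stream to EOF; return one (seconds_since_start, bytes_read) per chunk."""
    start = clock()
    samples = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        samples.append((clock() - start, len(chunk)))
    return samples


def longest_gap(samples):
    """Longest pause, in seconds, between consecutive chunk arrivals."""
    times = [0.0] + [t for t, _ in samples]
    return max((b - a for a, b in zip(times, times[1:])), default=0.0)


if __name__ == "__main__":
    # Stand-in for a network response: an in-memory stream of 200000 bytes.
    samples = read_with_timings(io.BytesIO(b"x" * 200000))
    total = sum(size for _, size in samples)
    print("chunks:", len(samples), "bytes:", total, "longest gap (s):", longest_gap(samples))
```

A long longest_gap relative to the total time would point at bursts with processing pauses; evenly spaced small chunks would point at a uniformly slow sender.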
> Do you have the ability to serve the repo that shows poor performance using "hg serve"? If so, it would be helpful to "hg serve --profile", do a pull from a client, then stop the server and share the profile dump.
Good idea. I just ran a test with a synthetic repo, which has a couple
of the big files on the repo that initially made us notice this
problem. I ran hg serve --profile --verbose, and the result is as
follows:
C:\mercurial_tests\tmp_bigfiles_not_large>hg serve --port 7000 --profile --verbose
listening at http://MERCURIAL.at4wireless.com:7000/ (bound to *:7000)
192.168.14.153 - - [21/Nov/2012 07:37:10] "GET /?cmd=capabilities HTTP/1.1" 200 -
192.168.14.153 - - [21/Nov/2012 07:37:10] "GET /?cmd=lookup HTTP/1.1" 200 - x-hgarg-1:key=1
192.168.14.153 - - [21/Nov/2012 07:37:10] "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3D
2 changesets found
192.168.14.153 - - [21/Nov/2012 07:37:15] "GET /?cmd=getbundle HTTP/1.1" 200 - x-hgarg-1:common=0000000000000000000000000000000000000000&heads=abf60c762c2d373f5ca2102cd8024ddee80c03f6
192.168.14.153 - - [21/Nov/2012 07:46:55] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases
192.168.14.153 - - [21/Nov/2012 07:46:56] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks
C:\mercurial_tests\tmp_bigfiles_not_large>dir
El volumen de la unidad C es WINSRV2003
El número de serie del volumen es: B834-62CA
Directorio de C:\mercurial_tests\tmp_bigfiles_not_large
20/11/2012 16:37 <DIR> .
20/11/2012 16:37 <DIR> ..
20/11/2012 16:37 <DIR> .hg
20/11/2012 13:42 0 .hgignore
20/11/2012 16:37 94.464.543 signalvector_28-Jan-2011_01.mat
20/11/2012 16:37 47.223.983 simcapture_001.mat
3 archivos 141.688.526 bytes
3 dirs 80.478.457.856 bytes libres
C:\mercurial_tests\tmp_bigfiles_not_large>
In addition, this is what the client shows (in the TortoiseHg window):
% hg clone --rev 1 --verbose -- http://mercurial:7000 E:\aem\Workspaces\3G_FPGA\bigfilestest-mercurial
adding changesets
adding manifests
adding file changes
added 2 changesets with 3 changes to 3 files
calling hook changegroup.lfiles: <function checkrequireslfiles at 0x000000000872B438>
updating to branch default
[command completed successfully Wed Nov 21 07:46:58 2012]
During the clone, the server's network activity, as shown by the Windows
Task Manager, hovered around 0.25% of the available 1 Gbps link to the
server, with small and short spikes up to 1%.
As I said earlier, this seems odd to me.
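As a rough cross-check of the Task Manager reading, the server log timestamps can be turned into an average rate. This is only a sketch under two assumptions that the log itself does not confirm: that the getbundle entry and the first listkeys entry bracket the transfer, and that the payload is about the size of the two .mat files in the dir listing (compression and protocol overhead are ignored).

```python
from datetime import datetime

LOG_TIME_FORMAT = "%d/%b/%Y %H:%M:%S"

# Timestamps copied from the hg serve log (getbundle, then first listkeys).
start = datetime.strptime("21/Nov/2012 07:37:15", LOG_TIME_FORMAT)
end = datetime.strptime("21/Nov/2012 07:46:55", LOG_TIME_FORMAT)

# File sizes copied from the dir listing; assumed to approximate the payload.
payload_bytes = 94_464_543 + 47_223_983
link_bits_per_second = 1_000_000_000  # 1 Gbps link


def average_utilization(payload_bytes, seconds, link_bits_per_second):
    """Return (average bits per second, percentage of the link used)."""
    bits_per_second = payload_bytes * 8 / seconds
    return bits_per_second, bits_per_second / link_bits_per_second * 100


elapsed = (end - start).total_seconds()
rate, percent = average_utilization(payload_bytes, elapsed, link_bits_per_second)
print(f"{elapsed:.0f} s, {rate / 1e6:.2f} Mbit/s, {percent:.2f}% of the link")
```

Under those assumptions the transfer took 580 seconds at roughly 2 Mbit/s, which is about 0.2% of the link and in line with the Task Manager figure.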
As an aside, it would be really neat if there were a way for Mercurial
to automatically select the "uncompressed mode" when it makes sense to
do so. I don't know if there is a valid heuristic that would work
most, if not all, of the time. Maybe when the server and client are on the
same network? Or perhaps when the file sizes are above a certain size?
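To make the two candidate heuristics concrete, here is a minimal sketch. Nothing in it is an existing Mercurial feature: prefer_uncompressed is a hypothetical function, the 10 MB threshold is an arbitrary placeholder, and the /24 network in the example is an assumption about the LAN, not something taken from the logs.

```python
import ipaddress


def prefer_uncompressed(client_ip, server_network, file_sizes,
                        size_threshold=10 * 1024 * 1024):
    """Guess whether an uncompressed transfer is likely to be worthwhile.

    Returns True when the client is inside the server's network, or when
    any file is at least size_threshold bytes. Both rules are illustrative.
    """
    same_network = (ipaddress.ip_address(client_ip)
                    in ipaddress.ip_network(server_network))
    has_big_file = any(size >= size_threshold for size in file_sizes)
    return same_network or has_big_file


if __name__ == "__main__":
    # Client address from the server log; file sizes from the dir listing.
    print(prefer_uncompressed("192.168.14.153", "192.168.14.0/24",
                              [94_464_543, 47_223_983]))
```

Whether either rule holds up in practice is an open question; a fast LAN with a slow server disk, for instance, could behave differently from what the network test alone suggests.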
Again, thanks a lot for your help.
Cheers,
Angel