Network performance problems when pulling and cloning from HTTP server

Matt Mackall mpm at selenic.com
Wed Nov 21 21:24:01 UTC 2012


On Wed, 2012-11-21 at 08:02 +0100, Angel Ezquerra wrote:
> On Nov 20, 2012 6:55 PM, "Bryan O'Sullivan" <bos at serpentine.com> wrote:
> >
> > On Tue, Nov 20, 2012 at 8:30 AM, Angel Ezquerra <ezquerra at gmail.com> wrote:
> >>
> >> one of my users has a repository with plenty of "big files" (in the
> >> order of 50 to 100 MB). Our server is _not_ using the largefiles
> >> extension (at least not yet), that is the files are "big" but are not
> >> "largefiles".
> >>
> >> The total repository working directory size is about 1.5 GB. The big
> >> files do not change often, if ever.
> 
> Bryan, thanks a lot for your comments.
> 
> > I'm not surprised that a normal clone would be slow in this situation. Assuming your large files never change, I *am* surprised that subsequent pulls would be slow.
> >
> 
> Sorry, that is not what I meant. The revision that adds these big
> files is the last one. What I meant is that a pull that gets that last
> revision will be slow. I expect future pulls of revisions that do not
> add more big files to be fast again.
> 
> What was initially surprising is that incoming is also very slow. I
> always thought of incoming as a way to exchange hashes,

...and user names and dates and descriptions and file lists and sometimes
diffs. In other words, everything.

'hg incoming' pulls a bundle, just like 'hg pull'. A bundle is in fact
the one and only way to exchange this data that the wire protocol
supports. The bundle starts with the changelog. If we only need the
changelog (i.e. no diffs), then we abort the transfer in the middle.
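
As a rough illustration of "changelog first, then abort", here is a toy
model in Python. It is not Mercurial's code: the function names, the
section names and the payloads are all made up for the example, and the
real bundle format and network buffering are more involved.

```python
from typing import Iterator, List, Tuple


def stream_bundle(sent: List[str]) -> Iterator[Tuple[str, bytes]]:
    """Hypothetical server side: yield bundle sections in order.

    The changelog always comes first. `sent` records which sections
    the server actually started sending.
    """
    sent.append("send changelog")
    yield "changelog", b"users, dates, descriptions, file lists"
    sent.append("send manifests")
    yield "manifests", b"..."
    sent.append("send filelogs")
    yield "filelogs", b"... the big file data ..."


def incoming(need_diffs: bool) -> Tuple[List[str], List[str]]:
    """Hypothetical client side: read the bundle, stopping early if possible."""
    sent: List[str] = []
    received: List[str] = []
    stream = stream_bundle(sent)
    for section, _payload in stream:
        received.append(section)
        if section == "changelog" and not need_diffs:
            # We have everything we need to list the changesets,
            # so abort the transfer in the middle.
            stream.close()
            break
    return sent, received


if __name__ == "__main__":
    print(incoming(need_diffs=False))
    print(incoming(need_diffs=True))
```

The point of the sketch is only the ordering: there is no separate
"just the hashes" request, the client asks for the same stream a pull
would get and stops reading once the changelog has arrived.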

> In any case, may I ask why you are not surprised by this?

You've hit the trifecta of ways to have suboptimal performance:

a) Windows
b) large files
c) using a protocol designed for broadband-or-slower links on a LAN

> > Do you have the ability to serve the repo that shows poor performance using "hg serve"? If so, it would be helpful to "hg serve --profile", do a pull from a client, then stop the server and share the profile dump.
> C:\mercurial_tests\tmp_bigfiles_not_large>hg serve --port 7000 --profile --verbose

...that output contains no profile. A profile looks like this:

$ hg serve --profile
listening at http://calx:8000/ (bound to *:8000)
[do something]
[hit ctrl-c]
   CallCount    Recursive     Total(s)    Inline(s) module:lineno(function)
           4            0      1.5059      1.5058   <select.select>
          57           46      0.0052      0.0036   <__import__>
           1            0      0.0025      0.0013   mimetypes:205(readfp)
         684            0      0.0006      0.0005   mercurial.config:20(__setitem__)
         665            0      0.0007      0.0005   mimetypes:78(add_type)
           2            0      0.0004      0.0004   <_socket.gethostbyaddr>
          80            0      0.0008      0.0003   mercurial.config:27(update)
         971            0      0.0003      0.0003   <method 'split' of 'str' objects>
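
If the table gets long, a small script can rank it by inline time. The
sketch below is only an illustration: parse_profile and hottest are
hypothetical helpers, not part of Mercurial, and the column layout is
assumed to match the table above (four numeric columns followed by the
function name).

```python
from typing import Dict, List, Union

Row = Dict[str, Union[int, float, str]]

# Three rows copied from the profile above, plus the header line.
SAMPLE = """\
   CallCount    Recursive     Total(s)    Inline(s) module:lineno(function)
           4            0      1.5059      1.5058   <select.select>
          57           46      0.0052      0.0036   <__import__>
         971            0      0.0003      0.0003   <method 'split' of 'str' objects>
"""


def parse_profile(text: str) -> List[Row]:
    """Parse profile rows, skipping the header and any non-table lines."""
    rows: List[Row] = []
    for line in text.splitlines():
        # Split into at most 5 fields so function names keep their spaces.
        parts = line.split(None, 4)
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        calls, recursive, total, inline, function = parts
        rows.append({
            "calls": int(calls),
            "recursive": int(recursive),
            "total": float(total),
            "inline": float(inline),
            "function": function.strip(),
        })
    return rows


def hottest(rows: List[Row], n: int = 5) -> List[Row]:
    """Return the n rows with the largest inline time."""
    return sorted(rows, key=lambda r: r["inline"], reverse=True)[:n]


if __name__ == "__main__":
    for row in hottest(parse_profile(SAMPLE), 3):
        print(row["inline"], row["function"])
```

Sorting by Inline(s) rather than Total(s) shows where time is spent in
a function itself, not in whatever it calls.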

-- 
Mathematics is the supreme nostalgia of our time.




