hg slow on large repo

Matt Mackall mpm at selenic.com
Wed May 23 23:05:40 UTC 2007


On Wed, May 23, 2007 at 06:34:02PM -0400, Benjamin LaHaise wrote:
> On Wed, May 23, 2007 at 04:53:37PM -0500, Matt Mackall wrote:
> > Can you please try the untar test? If untar is slow, we know we have
> > OS or FS issues.
> > 
> > If untar is significantly faster than update, we have a problem.
> 
> Untar took 4m30s from a compressed tarball of the repository + checked out 
> tree.  There are about 100,000 files in the tree.

Not quite the test I was looking for: I wanted to compare untarring a
working directory vs. checking out a working directory.

But that number's still smaller than 11m. We're not out by an order of
magnitude, but something is clearly unhappy here. Checkout time should
fall somewhere between the time for tar xzf workingdir.tar.gz (about as
fast as you can hope to be) and the time for cp workingdir workingdir2.
With proper readahead, both should run at about half of the disk
bandwidth, but that rarely happens in practice.
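
For anyone wanting to run the comparison, here is a rough sketch of a
timing harness. The paths (workingdir.tar.gz, workingdir, repo-clone)
are placeholders, and it assumes repo-clone was created beforehand with
hg clone -U so that hg update has a full working directory to build.
The -a flag on cp is my assumption for a recursive copy.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run argv to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, cwd=cwd, check=True)
    return time.perf_counter() - start


def compare(commands):
    """Time each (label, argv) pair in order and return {label: seconds}."""
    return {label: time_command(argv) for label, argv in commands}


if __name__ == "__main__":
    # Placeholder paths: adjust to the tree under test.
    results = compare([
        ("untar", ["tar", "xzf", "workingdir.tar.gz"]),
        ("copy", ["cp", "-a", "workingdir", "workingdir2"]),
        # Assumes repo-clone has no working directory yet (hg clone -U).
        ("checkout", ["hg", "-R", "repo-clone", "update"]),
    ])
    for label, seconds in results.items():
        print(f"{label:10s} {seconds:8.1f}s")
```

The numbers are only comparable if each run starts from the same cache
state, so either run everything cold or everything warm.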

> > I'm assuming this is an ext3 filesystem. Do you have atime updates
> > disabled? What size is your journal?
> 
> atime updates are disabled.  The journal is the default size for a 100GB 
> filesystem (can't get to the machine at the moment).
> 
> > > git seems to get through this much more quickly with -l as it only has to 
> > > deal with just one large .pack file which can be read sequentially.
> > 
> > What's -l? If you have atime enabled, git will win simply because it
> > has only one atime to update.
> 
> -l uses hardlinks for the repository like hg does.

I fail to see how that's related. The hardlinking part is not the slow
part. That took 17-44 seconds. It's constructing a working copy that's
taking 11m. If git is hardlinking the working dir too, then it's doing
something very different and potentially unsafe.

If the tools you use on your working directory are up to it, you can
cp -al the whole tree, working directory and repository. Many editors
do the wrong thing with hardlinks, so we don't do this.
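
To illustrate what "the wrong thing" means here, a small sketch using
made-up file names: a tool that rewrites a file in place changes every
hardlinked copy, while one that writes a new file and renames it over
the old name leaves the other link alone.

```python
import os
import tempfile


def edit_in_place(path, text):
    # Truncates and rewrites the existing inode, so every hardlink
    # to that inode sees the new contents.
    with open(path, "w") as f:
        f.write(text)


def edit_by_rename(path, text):
    # Writes a fresh file and renames it over the old name, which
    # detaches this name from the shared inode.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)


def read(path):
    with open(path) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")  # stands in for the original tree
        b = os.path.join(d, "b.txt")  # stands in for the cp -al copy
        with open(a, "w") as f:
            f.write("original")
        os.link(a, b)

        edit_in_place(b, "changed")
        a_after_in_place = read(a)   # the edit leaked into a

        edit_by_rename(b, "safe")
        a_after_rename = read(a)     # a is untouched this time
        return a_after_in_place, a_after_rename, read(b)


if __name__ == "__main__":
    print(demo())
```

Only tools that behave like edit_by_rename are safe to use on a tree
made with cp -al.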

-- 
Mathematics is the supreme nostalgia of our time.