hg slow on large repo

solo turn soloturn at gmail.com
Sat May 26 00:51:21 UTC 2007


On 5/24/07, Matt Mackall <mpm at selenic.com> wrote:
> On Wed, May 23, 2007 at 06:34:02PM -0400, Benjamin LaHaise wrote:
> > On Wed, May 23, 2007 at 04:53:37PM -0500, Matt Mackall wrote:
> > > Can you please try the untar test? If untar is slow, we know we have
> > > OS or FS issues.
> > >
> > > If untar is significantly faster than update, we have a problem.
> >
> > Untar took 4m30s from a compressed tarball of the repository + checked out
> > tree.  There are about 100,000 files in the tree.
>
> Not quite the test I was looking for, I wanted to compare untarring a
> working directory vs checking out a working directory.
>
> But that number's still smaller than 11m. We're not out by an order of
> magnitude, but something is clearly unhappy here. We should be
> somewhere between the time for tar xzf workingdir.tar.gz (about as
> fast as you can hope to be) and cp -r workingdir workingdir2. With proper
> readahead, both should be about half of the disk bandwidth, but that
> rarely happens in practice.
>
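
For reference, the comparison described above could be run roughly like this. It is only a sketch: the tree is a small synthetic one in a scratch directory, the elapsed helper and all paths are made up for illustration, and the hg step is left as a comment because it needs a real repository.

```sh
#!/bin/sh
# Sketch: time "untar a working directory" against "copy a working directory".
# Everything lives in a hypothetical scratch directory that is removed on exit.
set -eu

bench=$(mktemp -d)
trap 'rm -rf "$bench"' EXIT

# Hypothetical helper: run a command and print wall-clock seconds taken.
elapsed() {
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo "$((end - start))s: $*"
}

# Build a small synthetic working directory (a real test would use the
# actual 100,000-file tree).
mkdir -p "$bench/workingdir"
i=0
while [ "$i" -lt 200 ]; do
    echo "file $i" > "$bench/workingdir/f$i.txt"
    i=$((i + 1))
done
tar czf "$bench/workingdir.tar.gz" -C "$bench" workingdir

# Lower bound: unpack the tarball of the working directory.
mkdir "$bench/untarred"
elapsed tar xzf "$bench/workingdir.tar.gz" -C "$bench/untarred"

# Upper bound: plain recursive copy of the working directory.
elapsed cp -R "$bench/workingdir" "$bench/workingdir2"

# The number to compare against both, on a real repository (assumption:
# the clone was made with -U so no working directory exists yet):
#   elapsed hg -R /path/to/clone update
```

On a tree this small the times will round to zero; the point is only the shape of the test, with the checkout time expected to land between the two printed figures.
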
> > > I'm assuming this is an ext3 filesystem. Do you have atime updates
> > > disabled? What size is your journal?
> >
> > atime updates are disabled.  The journal is the default size for a 100GB
> > filesystem (can't get to the machine at the moment).
> >
> > > > git seems to get through this much more quickly with -l as it only has to
> > > > deal with just one large .pack file which can be read sequentially.
> > >
> > > What's -l? If you have atime enabled, git will win simply because it
> > > has only one atime to update.
> >
> > -l uses hardlinks for the repository like hg does.
>
> I fail to see how that's related. The hardlinking part is not the slow
> part. That took 17-44 seconds. It's constructing a working copy that's
> taking 11m. If git is symlinking the working dir too, then it's doing
> something very different and potentially unsafe.
>
> If the tools you use on your working directory are up to it, you can
> cp -al the whole tree, working directory and repository. Many editors
> do the wrong thing with hardlinks, so we don't do this.

this mail seems to suggest git works with hardlinks:
http://marc.info/?l=git&m=116370498919078&w=2
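
To make the cp -al point above concrete, here is a small sketch of what goes wrong when a tool writes to a hardlinked file in place, and why write-then-rename is safe. It assumes GNU cp (for -l), and the directory and file names are made up for the demo.

```sh
#!/bin/sh
# Sketch: hardlinked copy of a tree, then two ways of "editing" a file in it.
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir "$demo/tree"
echo "original" > "$demo/tree/file.txt"

# GNU cp: -a preserves attributes, -l hardlinks files instead of copying them.
cp -al "$demo/tree" "$demo/tree2"

# Both paths now carry the same inode number, i.e. they are one file on disk.
ls -i "$demo/tree/file.txt" "$demo/tree2/file.txt"

# In-place write (the "wrong thing"): the change shows up in the other tree too.
echo "edited in place" > "$demo/tree2/file.txt"
cat "$demo/tree/file.txt"     # prints "edited in place"

# Write a new file and rename it over the old one: this breaks the link,
# so the first tree keeps its content.
echo "edited safely" > "$demo/tree2/file.txt.tmp"
mv "$demo/tree2/file.txt.tmp" "$demo/tree2/file.txt"
cat "$demo/tree/file.txt"     # still prints "edited in place"
```

So a hardlinked working directory is only safe if every tool that touches it replaces files by rename rather than overwriting them.
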


