hg slow on large repo
Benjamin LaHaise
bcrl at kvack.org
Wed May 23 21:20:32 UTC 2007
On Wed, May 23, 2007 at 01:08:09PM -0500, Matt Mackall wrote:
> Is this a local clone on the same partition? In other words, is it
> using hardlinks? Or is this over the wire? For going over LAN or fast
> WAN, you can use --uncompressed.
It's a local clone on the same partition. Yes, it looks like hardlinks are
being used, as most of the files under .hg show 2 links. Part of the problem
seems to be that there are far too many directories and files under
.hg -- just doing a du on .hg takes over a minute with a cold cache.
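
(For anyone who wants to reproduce the cold-cache number, a minimal sketch,
assuming root access and a kernel with /proc/sys/vm/drop_caches:)

    sync                                  # flush dirty data first
    echo 3 > /proc/sys/vm/drop_caches     # drop pagecache, dentries, inodes
    time du -sh .hg                       # now du starts with a cold cache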
> How much of the time is clone vs checkout (try time hg clone -U
> followed by hg update)?
hg clone -U takes 17s after a cp -al of the .hg directory. An immediately
following hg update took XXX.
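
(The split can be timed along these lines -- repo names are hypothetical:)

    time hg clone -U orig-repo new-repo   # history only, no working copy
    time hg -R new-repo update            # populate the working copy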
> For the update side of things, how much time does it take to untar a
> comparable tar.gz?
>
> If local, how much time does it take to do a straight cp and cp -al of .hg?
cp -al of the whole tree takes 4m30s, while cp -a of the whole tree is slow
(as in more than 15 minutes). A cp -al of just .hg afterwards took 44s.
> Tricks exist, but let's figure out what the problem is first.
This reminds me of a quirk of ext3: if you unpack files into a subdirectory,
the allocator will attempt to place the files in the same block group as that
directory, and it tries to put each directory in a different block group from
its parent. If the files are instead unpacked in the top level of the tree
and subsequently moved into the subdirectory, they will be allocated near the
top-level directory and thus be packed more closely on disk.
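
(A minimal sketch of that workaround, with hypothetical paths; mv within one
filesystem is just a rename, so the data blocks stay where they were
allocated:)

    cd /repo                       # top level of the tree
    tar xzf /tmp/source.tar.gz     # unpack here so blocks land near /repo
    mv source subdir/              # rename into place; layout stays packed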
git seems to get through this much more quickly with -l, as it only has to
deal with one large .pack file which can be read sequentially. The
hg update rarely peaks above 1300KB/s reading from the disk. Does hg have a
way of packing old history into something that isn't touched during typical
usage?
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop at kvack.org>.