[WIP] lazy revlog index parsing
Matt Mackall
mpm at selenic.com
Thu Mar 15 06:50:08 UTC 2012
On Wed, 2012-03-14 at 15:18 -0700, Bryan O'Sullivan wrote:
> The current index parser is eager, and does a lot of unnecessary work on
> large revlogs.
>
> This causes e.g. "hg tip" to take 0.3 seconds on a repo that contains
> 300,000 commits.
>
> I've adopted a patch that Matt started on, which parses the index on
> demand. This brings the performance of "hg tip" in the large-commit case
> back to instantaneous.
>
> https://gist.github.com/2039952
[I'll just note in passing the irony of the author of patchbomb using a
pastebin (on github of all places).]
Well, performance is good (though I had to tweak perf.py to bypass our
new filecache scheme):
$ hgs perfindex
! wall 0.168411 comb 0.160000 user 0.160000 sys 0.000000 (best of 54)
$ hg perfindex
! wall 0.006424 comb 0.010000 user 0.000000 sys 0.010000 (best of 441)
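To make the eager-versus-lazy difference concrete, here is a minimal Python sketch of on-demand index parsing. It is not the code from the gist. It assumes a non-inline RevlogNG index made of fixed 64-byte records, and the class name LazyIndex is purely illustrative. It also skips details a real parser has to handle, such as the version header that overlaps the first entry.

```python
import struct

# Assumed RevlogNG record layout: offset/flags, compressed length,
# uncompressed length, base rev, link rev, parent 1, parent 2,
# 20-byte node hash, 12 bytes of padding. 64 bytes in total.
ENTRY = struct.Struct(">Qiiiiii20s12x")


class LazyIndex:
    """Hypothetical index that unpacks a record only when it is requested."""

    def __init__(self, data):
        self._data = data    # raw bytes of the index file
        self._cache = {}     # rev -> already unpacked tuple

    def __len__(self):
        # Fixed-size records, so the count needs no parsing at all.
        return len(self._data) // ENTRY.size

    def __getitem__(self, rev):
        if rev < 0:
            rev += len(self)
        if not 0 <= rev < len(self):
            raise IndexError(rev)
        entry = self._cache.get(rev)
        if entry is None:
            # Seek straight to the record instead of walking the whole file.
            entry = ENTRY.unpack_from(self._data, rev * ENTRY.size)
            self._cache[rev] = entry
        return entry
```

With this shape, looking up only the last revision touches a single record, which is why something like "hg tip" no longer pays for every entry in the file.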
But it doesn't quite work for me: something is screwy with the offset
handling, which causes a seek to fail with an invalid argument. With the
debugger I'm seeing:
(Pdb) self.index[rev]
(-669319168L, 331, 431, 16259, 16260, 16259, -1, '\x02\xb5p\x13?\xfe
\xa8B\xbfZ\xcfr\xb6\xf1;\xdd\x9e\\}+')
and with contrib/debugshell.py against stable, I'm seeing:
>>> repo.changelog.index[16260]
(205489111040L, 331, 431, 16259, 16260, 16259, -1, '\x02\xb5p\x13?\xfe
\xa8B\xbfZ\xcfr\xb6\xf1;\xdd\x9e\\}+')
>>> hex(-669319168L)
'-0x27e50000L'
>>> hex(205489111040L)
'0x2fd81b0000L'
>>> hex(-669319168L & 0xffffffff)
'0xd81b0000L'
I see: ntohl is getting implicitly declared, so its result is treated as
a signed int, which fits the sign-extended value above.
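The hex output above is what you would expect when a 64-bit value is squeezed through a signed 32-bit int. A minimal Python 3 sketch of that arithmetic follows. The helper name to_signed32 is illustrative only, and the two constants are the values from the debugger sessions.

```python
MASK32 = 0xffffffff

good = 205489111040   # first index field as seen on stable
bad = -669319168      # first index field as seen with the patch


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    low = value & MASK32
    return low - (1 << 32) if low & 0x80000000 else low


# Both values share the same low 32 bits...
assert good & MASK32 == bad & MASK32 == 0xd81b0000

# ...and truncating the good value to a signed 32-bit int yields the bad one.
assert to_signed32(good) == bad

# The bits above the low word are the part that went missing.
lost_high = good >> 32
print(hex(lost_high))
```

The high bit of the low word is set, so the truncated value comes out negative, and a negative offset is what makes the seek fail.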
Ok, now tip is still not much faster: it goes from 0.401s to 0.320s on
my test repo. Turns out the bulk of the time here is spent checking that
270 tags point to valid nodes. Doing node->rev mapping is the other
thing that could really stand to be optimized. With the valid-node check
disabled, tip now takes 0.069s (compared to perfstartup of 0.025s).
...and that's enough tinkering for this evening.
--
Mathematics is the supreme nostalgia of our time.