[PATCH 01 of 15] hgext: `speedy` - the history query accelerator
Tomasz Kłeczek
tomasz.kleczek at gmail.com
Tue Dec 11 23:28:23 UTC 2012
I accidentally based my patches on stable; I'll fix that in V2.
On Tue, Dec 11, 2012 at 10:38 AM, Tomasz Kleczek <tkleczek at fb.com> wrote:
> # HG changeset patch
> # User Tomasz Kleczek <tkleczek at fb.com>
> # Date 1355250659 28800
> # Branch stable
> # Node ID 13c6bcb8dd900dc7dbf5e3da9ef68d56fed250b3
> # Parent 8973e7dd92d5afdeb82e91d5b66934d53a74e8da
> hgext: `speedy` - the history query accelerator
>
> This is the first in a series of patches that address the problem
> of hg log being slow on a big repo in most cases.
>
>
> Many history queries have performance linear in the number of commits
> or, even worse, in the size of the manifest.
>
> As a result, almost every hg log query with a non-trivial rev range or
> with a directory specified is painfully slow on big repositories,
> regardless of how little output it actually produces.
>
> Here are some frequently used commands that just print a handful of
> log messages:
> * hg log dir_with_few_commits
> * hg log --rev "user(MyLazyFriend)"
>
> They may take more than 30 seconds on a big repo (with a couple hundred
> thousand commits).
>
> This extension addresses this problem by introducing a server component
> that maintains indices over the history and uses them to respond to queries
> in an efficient manner.
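To make the idea of an index over the history concrete, here is a minimal standalone sketch of an author index. It is not the code from this series: the `AuthorIndex` class and its `query` method are hypothetical names, and the case-insensitive substring match is assumed to mirror what the `author()` revset does. It runs on plain modern Python and uses no Mercurial APIs.

```python
from collections import defaultdict


class AuthorIndex:
    """Hypothetical index: maps each author to the revisions they committed."""

    def __init__(self, history):
        # history: iterable of (rev, user) pairs, e.g. read once from the changelog
        self._revs_by_user = defaultdict(list)
        for rev, user in sorted(history):
            self._revs_by_user[user].append(rev)

    def query(self, pattern, subset):
        """Return the revisions of `subset`, in order, whose author contains `pattern`."""
        needle = pattern.lower()
        matching = set()
        # Only the distinct user names are scanned, not every changeset.
        for user, revs in self._revs_by_user.items():
            if needle in user.lower():
                matching.update(revs)
        return [rev for rev in subset if rev in matching]


# Same authorship as the test repository in this patch.
index = AuthorIndex([(0, "testuser1"), (1, "testuser2"), (2, "testuser1"),
                     (3, "testuser1"), (4, "testuser1"), (5, "testuser1")])
```

With such a structure a query costs roughly the number of distinct users plus the size of the requested subset, instead of reading every changeset.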
>
> What is going on:
>
> * the client component forwards certain history queries to the server
> and waits for a response
> * it takes into account that its history may have diverged from
> the server's, and still gives correct answers
> * if the server doesn't respond fast enough or crashes, the client falls
> back to computing the answer locally using the normal code path
> * extension setup time is negligible and there is no overhead for queries
> that cannot be accelerated by the server
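The fallback behaviour in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: `query_with_fallback`, `remote_query` and `local_query` are hypothetical names, the timeout is handled with the standard `concurrent.futures` module of modern Python, and none of this is the transport code used by the extension.

```python
import concurrent.futures


def query_with_fallback(remote_query, local_query, timeout):
    """Try the (hypothetical) server query; on timeout or error compute locally."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(remote_query)
    try:
        # Raises a timeout error if the server is too slow, or re-raises
        # whatever exception the remote call itself raised.
        return future.result(timeout=timeout)
    except Exception:
        # Slow, crashed or unreachable server: use the normal local code path.
        return local_query()
    finally:
        # Do not block on a server call that is still in flight.
        executor.shutdown(wait=False)
```

The caller always gets an answer; the server only affects how quickly it arrives.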
>
> The server can be run:
> * locally in the same process as client or
> * remotely, using a custom protocol over http to communicate
>
> Sample performance gains (while running a remote history server).
> All commands are run with the -l1 option so that displaying output in the
> terminal doesn't affect the measurements.
>
> All commands take roughly 30 seconds without the extension enabled.
>
> hg log somedir -l1
> -> time reduced to 2.4 sec
>
> hg log --rev "author(someuser)" -l1
> -> time reduced to 1.6 sec
>
> hg log --rev "date(10/1/2012)" -l1
> -> time reduced to 1.7 sec
>
> hg log "relglob:**.html" -l1
> -> time reduced to 2.4 sec
>
> hg log . -l1
> -> time reduced to 11.9 sec
>
> This patch introduces support for the `author` revset query. More queries
> will be added in subsequent patches.
>
> diff --git a/hgext/speedy/__init__.py b/hgext/speedy/__init__.py
> new file mode 100644
> --- /dev/null
> +++ b/hgext/speedy/__init__.py
> @@ -0,0 +1,10 @@
> +# Copyright 2012 Facebook
> +#
> +# This software may be used and distributed according to the terms of the
> +# GNU General Public License version 2 or any later version.
> +
> +import client
> +
> +def uisetup(ui):
> +    if ui.configbool('speedy', 'client', False):
> +        client.uisetup(ui)
> diff --git a/hgext/speedy/client.py b/hgext/speedy/client.py
> new file mode 100644
> --- /dev/null
> +++ b/hgext/speedy/client.py
> @@ -0,0 +1,33 @@
> +# Copyright 2012 Facebook
> +#
> +# This software may be used and distributed according to the terms of the
> +# GNU General Public License version 2 or any later version.
> +
> +from mercurial import extensions, commands
> +from mercurial import revset
> +
> +def patchedauthor(repo, subset, x):
> +    """Return the revisions committed by users whose name matches x
> +
> +    Used to monkey patch revset.author function.
> +    """
> +    # Subsequent patches will forward this query to the server instead of
> +    # computing it locally
> +    return revset.author(repo, subset, x)
> +
> +def _speedysetup(ui, repo):
> +    """Initialize speedy client."""
> +    revset.symbols['author'] = patchedauthor
> +
> +def uisetup(ui):
> +    # Perform patching and most of the initialization inside log wrapper,
> +    # as this is only needed if log command is being used
> +    initialized = [False]
> +    def logwrapper(cmd, *args, **kwargs):
> +        repo = args[1]
> +        if not initialized[0]:
> +            initialized[0] = True
> +            _speedysetup(ui, repo)
> +        return cmd(*args, **kwargs)
> +
> +    extensions.wrapcommand(commands.table, 'log', logwrapper)
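The one-time initialisation done by the log wrapper can be tried outside Mercurial with a small standalone sketch. The names `make_lazy_wrapper`, `setup` and `fake_log` are hypothetical, and the only assumption carried over from the patch is the calling convention in which the wrapper receives the original command first, followed by `ui` and `repo`.

```python
def make_lazy_wrapper(setup):
    """Build a command wrapper that runs setup(ui, repo) on first use only."""
    initialized = [False]  # mutable cell, since the closure must update it

    def wrapper(cmd, *args, **kwargs):
        if not initialized[0]:
            initialized[0] = True
            ui, repo = args[0], args[1]
            setup(ui, repo)
        # Hand the wrapped command's result (its exit code) back to the caller.
        return cmd(*args, **kwargs)

    return wrapper


# Stand-ins for the real command and setup function.
setup_calls = []

def fake_log(ui, repo, *pats, **opts):
    return 0

wrapper = make_lazy_wrapper(lambda ui, repo: setup_calls.append((ui, repo)))
```

Calling `wrapper(fake_log, ui, repo)` repeatedly runs the setup once, so commands other than log pay nothing for the initialisation.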
> diff --git a/tests/test-speedy.t b/tests/test-speedy.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-speedy.t
> @@ -0,0 +1,44 @@
> +Global config file
> +  $ cat >> $HGRCPATH <<EOF_END
> +  > [ui]
> +  > logtemplate = "{desc}\n"
> +  >
> +  > [extensions]
> +  > speedy=
> +  > EOF_END
> +
> +Preparing local repo
> +
> +  $ hg init localrepo
> +  $ cd localrepo
> +
> +  $ mkdir d1
> +  $ echo chg0 > d1/chg0
> +  $ hg commit -Am chg0 -u testuser1
> +  adding d1/chg0
> +  $ echo chg1 > d1/chg1
> +  $ hg commit -Am chg1 -u testuser2 --date "10/20/2012"
> +  adding d1/chg1
> +  $ echo chg2 > d1/chg2
> +  $ hg commit -Am chg2 -u testuser1
> +  adding d1/chg2
> +  $ mkdir d2
> +  $ echo chg3 > d2/chg3.py
> +  $ hg commit -Am chg3 -u testuser1
> +  adding d2/chg3.py
> +  $ echo chg4 > d2/chg4
> +  $ hg commit -Am chg4 -u testuser1
> +  adding d2/chg4
> +  $ echo chg5 > chg5.py
> +  $ hg commit -Am chg5 -u testuser1 --date "10/20/2012"
> +  adding chg5.py
> +
> +  $ hg log -r "reverse(user(testuser1))"
> +  chg5
> +  chg4
> +  chg3
> +  chg2
> +  chg0
> +
> +  $ hg log -r "reverse(author(testuser2))"
> +  chg1
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>