Largefiles slow with ignored files?

Na'Tosha Bard natosha at unity3d.com
Sun Apr 8 07:16:48 UTC 2012


2012/4/7 Michał Sznajder <michalsznajder at gmail.com>

> > I checked the code and I think it has something to do with a comment here
> > http://selenic.com/repo/hg/file/329887a7074c/hgext/largefiles/reposetup.py#l158
> > Why ignored files are so special that we always request for them?
> > They are only used in line 242 to calculate unknown files.
>
> Actually I tried patch below and to my surprise largefiles tests are still
> green
>
>  $ python run-tests.py test-largefiles*
>  ...
>  # Ran 3 tests, 0 skipped, 0 failed.
>
> I will send a full patch unless someone has an explanation for me why
> ignored files are so special for largefiles status operation.
>
> --- a/hgext/largefiles/reposetup.py
> +++ b/hgext/largefiles/reposetup.py
> @@ -158,7 +159,7 @@
>                 # Get ignored files here even if we weren't asked for them; we
>                 # must use the result here for filtering later
>                 result = super(lfilesrepo, self).status(node1, node2, m,
> -                    True, clean, unknown, listsubrepos)
> +                    listignored, clean, unknown, listsubrepos)
>                 if working:
>                     try:
>                         # Any non-largefiles that were explicitly listed must be
>
> Michał Sznajder
>

You can see why ignored files are so special for largefiles status
operation -- we use the result later on as part of the calculation.  This
patch you are proposing will almost certainly result in incorrect
calculation in real-world scenarios.  You can see the commit where this
line you propose changing came from in its entirety here:
http://selenic.com/hg/rev/74e691b141c4?revcount=40

Largefiles is unfortunately known to be slow when a repository has a very
large number of ignored files, but it is only marginally slower than core
Mercurial in the "typical" case (e.g., a repository with a moderate or low
number of ignored files and a moderate or high number of tracked files).
Before the commit linked above, it was very slow in *every* case, which
frustrated users and made GUI tools that rely on status so slow they were
almost impossible to use.

There might be a way to optimize for the case of many ignored files in a
repository, but 1) your proposed change isn't it, and 2) you must take care
to test performance with various types of repositories (many largefiles,
few largefiles, large numbers of ignored files, large numbers of unknown
files, and combinations thereof), as this is an area of code that it's
important to tread carefully around.
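
As a rough starting point for that kind of testing, here is a minimal
sketch (not part of Mercurial or its test suite) that builds throwaway
repositories of different shapes and times "hg status" with largefiles
enabled. The helper names (make_tree, time_command, benchmark), the SHAPES
table, and the file counts are all hypothetical; it assumes Python 3 and an
"hg" binary on PATH that ships the largefiles extension.

```python
import os
import subprocess
import sys
import tempfile
import time

# Run hg with the bundled largefiles extension enabled for this invocation.
HG = ["hg", "--config", "extensions.largefiles="]

# Hypothetical repository shapes: (tracked, large, ignored, unknown) file counts.
SHAPES = {
    "typical": (2000, 20, 50, 10),
    "many-largefiles": (200, 2000, 50, 10),
    "many-ignored": (200, 20, 20000, 10),
    "many-unknown": (200, 20, 50, 20000),
}


def make_tree(root, tracked=0, large=0, ignored=0, unknown=0):
    """Create one directory per category and an .hgignore covering ignored/."""
    counts = {"tracked": tracked, "large": large,
              "ignored": ignored, "unknown": unknown}
    for kind, count in counts.items():
        directory = os.path.join(root, kind)
        os.makedirs(directory, exist_ok=True)
        for i in range(count):
            with open(os.path.join(directory, "f%d.txt" % i), "w") as fh:
                fh.write("x\n")
    with open(os.path.join(root, ".hgignore"), "w") as fh:
        fh.write("syntax: glob\nignored/**\n")
    return counts


def time_command(cmd, cwd, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs of cmd."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run(cmd, cwd=cwd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(shapes=SHAPES):
    """Build each repository shape in a temp dir and time `hg status` on it."""
    results = {}
    for name, (tracked, large, ignored, unknown) in shapes.items():
        with tempfile.TemporaryDirectory() as root:
            make_tree(root, tracked, large, ignored, unknown)
            subprocess.run(HG + ["init"], cwd=root, check=True)
            if tracked:
                subprocess.run(HG + ["add", "-q", "tracked"], cwd=root, check=True)
            if large:
                # --large is provided by the largefiles extension.
                subprocess.run(HG + ["add", "-q", "--large", "large"],
                               cwd=root, check=True)
            subprocess.run(HG + ["commit", "-q", "-m", "init", "-u", "bench"],
                           cwd=root, check=True)
            results[name] = time_command(HG + ["status"], cwd=root)
    return results
```

Calling benchmark() before and after a candidate change, and comparing the
timings per shape, gives a first impression of whether the change helps the
many-ignored case without regressing the others. It only measures speed; it
says nothing about whether status output is still correct.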

Cheers,
Na'Tosha

*Na'Tosha Bard*
Build & Infrastructure Developer | Unity Technologies - Copenhagen

*E-Mail:* natosha at unity3d.com
*Skype:* natosha.bard