[PATCH STABLE] largefiles: check whether specified patterns are related to largefiles strictly
FUJIWARA Katsunori
foozy at lares.dti.ne.jp
Fri Feb 17 16:18:42 UTC 2012
At Thu, 16 Feb 2012 16:48:51 +0100,
Na'Tosha Bard wrote:
>
> 2012/2/16 FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
>
> >
> > At Wed, 15 Feb 2012 15:08:38 +0100,
> > Na'Tosha Bard wrote:
> > > Did you do any performance testing before and after this patch? What is
> > > the difference in performance? What sort of repository did you test it
> > > on?
> > >
> > > Na'Tosha
> >
> > Not yet tested on a real repo; I have only considered the ORDER of
> > processing.
> >
> > before:
> >
> > (b1) lookup in 'lfdirstate' => O(1)
> > (b2) loop by 'match.files()' => O(N_matchfiles) : N of 'match.files()'
> >
> > after:
> >
> > (a1) loop by 'lfdirstate' => O(N_lfiles) : N of lfdirstate
> > (a2) examination by 'match(f)' => O(N_matchfiles)
> >
> > If 'N_matchfiles' can be assumed to be small enough (and I think it can),
> >
>
> See, this is a tough one. People using largefiles usually fall into 1 of 2
> categories:
>
> 1) People with a lower number of extremely large binaries
> 2) People with a huge number of
> not-so-large-but-too-large-to-version-directly binaries
>
> From bug reports and talking to people, I know there are plenty of users in
> both of these categories. Ideally we'd optimize in a way that won't leave
> either side out in the cold, but generally I think group (1) is probably
> bigger than (2).
>
> In any case, I'd like to run some performance tests on both our real
> repository and some generated test repositories of various sizes before
> this patch is applied. I hope to get to this tomorrow.
Thank you for your comments.
I just posted a patch series that fixes bugs around "hg status" with
largefiles, as a basis for discussing the ways to fix them.
That series covers all the bugs (and possible performance improvements)
I know of in "hg status" with largefiles.
By the way, is there any public information (or a pointer to it) about
how to build (or get) repos for largefiles benchmarking?
> > the main performance difference is between (b1) and (a1).
> >
> > 'N_lfiles' is not small in ordinary cases, so the patched code will
> > clearly increase execution cost.
> >
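For reference, the two orders of processing above can be sketched in
Python. This is only a minimal illustration, not the largefiles code:
PathMatcher is a hypothetical stand-in for a Mercurial matcher,
lfdirstate is modelled as a plain set of paths, and it assumes that a
directory name given as a pattern is the case that separates the two
checks.

```python
class PathMatcher:
    """Hypothetical stand-in for a matcher built from plain paths."""

    def __init__(self, paths):
        self._paths = [p.rstrip("/") for p in paths]

    def files(self):
        # The paths exactly as given on the command line.
        return list(self._paths)

    def __call__(self, f):
        # A path matches if it was named itself or lives under a named dir.
        return any(f == p or f.startswith(p + "/") for p in self._paths)


def related_before(match, lfdirstate):
    # (b2) loop over match.files(), (b1) O(1) lookup for each entry.
    return any(f in lfdirstate for f in match.files())


def related_after(match, lfdirstate):
    # (a1) loop over lfdirstate, (a2) examine each entry with match(f).
    return any(match(f) for f in lfdirstate)


# Assumed sample data: two tracked largefiles under "assets".
lfdirstate = {"assets/big.bin", "assets/video/intro.mov"}
by_file = PathMatcher(["assets/big.bin"])
by_dir = PathMatcher(["assets"])
unrelated = PathMatcher(["src"])

print(related_before(by_file, lfdirstate), related_after(by_file, lfdirstate))
print(related_before(by_dir, lfdirstate), related_after(by_dir, lfdirstate))
print(related_before(unrelated, lfdirstate), related_after(unrelated, lfdirstate))
```

In this sketch only the directory case differs: the "before" check
misses the largefiles under "assets", while the "after" check finds
them at the cost of walking every entry of lfdirstate.
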
> > I don't have any other good ideas for fixing this problem (= showing
> > '?' for the largefile itself) under the current policy for choosing
> > the 'performance boost' route, so I posted this patch, even though it
> > increases execution cost.
> >
> > Of course, there are other choices:
> >
> > - fix this problem in some other way, or
> >
> > - change the policy of 'performance boost' route choice itself
> >
> > for example: choose the 'slow' route when a non-file pattern is specified
> >
> >
> > By the way, the current check against lfdirstate does not work as
> > expected (= show the status of the largefile itself, not of the
> > STANDIN) when "hg status" against a rev tracking largefiles is
> > invoked on a working context not tracking largefiles.
> >
> > # I ran into this situation after posting the patch ....
> >
> > Which of these ways should be chosen?
> >
> > (1) check on both contexts whether there are any tracked files:
> > - it is STANDIN, and
> > - non-STANDIN part is matched to specified pattern
> >
> > (2) choose the 'slow' route, if both of the specified revisions are
> > not 'working dir'
> >
> > The latter seems better, because of the scope of its performance impact.
> >
>
> I agree that (2) is more appropriate here.
>
> Cheers,
> Na'Tosha
>
>
> >
> > > > diff -r f7e0d95d0a0b -r c0a0446aaa86 hgext/largefiles/reposetup.py
> > > > --- a/hgext/largefiles/reposetup.py Fri Feb 10 16:52:32 2012 -0600
> > > > +++ b/hgext/largefiles/reposetup.py Wed Feb 15 23:01:09 2012 +0900
> > > > @@ -118,8 +118,10 @@
> > > > # handle it -- thus gaining a big performance boost.
> > > > lfdirstate = lfutil.openlfdirstate(ui, self)
> > > > if match.files() and not match.anypats():
> > > > - matchedfiles = [f for f in match.files() if f in lfdirstate]
> > > > - if not matchedfiles:
> > > > + for f in lfdirstate:
> > > > + if match(f):
> > > > + break
> > > > + else:
> > > > return super(lfiles_repo, self).status(node1, node2,
> > > > match, listignored, listclean,
> > > > listunknown, listsubrepos)
> >
> > ----------------------------------------------------------------------
> > [FUJIWARA Katsunori] foozy at lares.dti.ne.jp
> >
>
>
>
> --
> *Na'Tosha Bard*
> Build & Infrastructure Developer | Unity Technologies - Copenhagen
>
> *E-Mail:* natosha at unity3d.com
> *Skype:* natosha.bard
----------------------------------------------------------------------
[FUJIWARA Katsunori] foozy at lares.dti.ne.jp