hg diff: why is it listing the directories?

Alexis S. L. Carvalho alexis at cecm.usp.br
Sat Jun 2 20:16:58 UTC 2007


Thus spake Guido Ostkamp:
> >> this is most likely the 'lstat' bug. I've documented it two weeks ago 
> >> at <http://www.selenic.com/mercurial/bts/issue567>.
> >> 
> >> I'm waiting for Alexis (who requested a verification which I provided) 
> >> to fix the issue. The version which introduced the bug is already 
> >> known.
> >
> > I still can't see why *any* getdent call is required for "hg diff". It 
> > looks like this bug is about a regression where a full useless dirwalk 
> > is performed, but even in the "fixed" version (b4eaa68dea1b), there are 
> > still 2 getdent calls.
> 
> Alexis made some comments that the full walk had been introduced 
> intentionally. Not belonging to the development team, I can only guess, 
> but I think there is a difference between a 'hg diff' and a 'hg diff 
> <file>'. For the first one (diff on whole repository) I suppose that full 
> walk might possibly be required. Maybe they forgot to disable it for the 
> second case, I don't know.

These are different issues.

As Giovanni noted, many (most?) commands are interested only in tracked
files.  This (usually) doesn't matter too much if you have a .hgignore
that covers all untracked files, but it'd be a nice optimization to
skip the os.listdir() calls and call os.lstat() directly, and only on
the files we're interested in.  This would mostly involve passing a
"list_unknown" argument to the walk functions and using this
optimization in the "not list_unknown and not list_ignored and not
list_dirs" case.
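To make the idea concrete, here is a rough sketch of what such a fast path could look like.  It is not Mercurial's actual walk code: the function name stat_files, its signature, and the returned dict are all made up for illustration.  Only the three list_* flags come from the description above.

```python
import os


def stat_files(root, tracked, list_unknown=False, list_ignored=False,
               list_dirs=False):
    """Hypothetical helper (not Mercurial's API).

    Returns a dict mapping repo-relative paths to os.lstat() results,
    or to None for tracked files that are missing from the working dir.
    """
    if not list_unknown and not list_ignored and not list_dirs:
        # Fast path: the caller only cares about tracked files, so we
        # never list a directory.  One lstat() per tracked file.
        results = {}
        for rel in tracked:
            try:
                results[rel] = os.lstat(os.path.join(root, rel))
            except OSError:
                results[rel] = None  # tracked but not on disk
        return results

    # Slow path: somebody wants unknown/ignored files or directories,
    # so a full directory walk is unavoidable.  (This sketch does not
    # report directories themselves or apply ignore rules.)
    results = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if ".hg" in dirnames:
            dirnames.remove(".hg")  # don't descend into the repo metadata
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            results[rel] = os.lstat(full)
    return results
```

With a working dir that has many untracked files not covered by .hgignore, the fast path touches only the paths in "tracked", so its cost depends on the number of tracked files rather than on the size of the tree.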

The issue with hg diff is that we're walking the working dir twice to
generate a diff: once to detect which files have changed and once to...
err... well, for no particular reason (which is the bug).

Hmm... I think I just saw a simple workaround.  Let me test this a bit.

Alexis


