[PATCH] diff performance: re-establish linear runtime performance
Yuya Nishihara
yuya at tcha.org
Thu Apr 30 14:41:26 UTC 2020
On Thu, 30 Apr 2020 16:25:09 +0200, Pierre-Yves David wrote:
> > The previous method with sum() and list() creates a new list object
> > for every hunk. Then sum() is used to flatten out this sequence of
> > lists. The sum() function is not "lazy", but creates a new list object
> > for every "+" operation and so this code had quadratic runtime behaviour.
> >
> > diff -r 5c1f356b108e -r fd68848f8d55 mercurial/patch.py
> > --- a/mercurial/patch.py Fri Apr 24 12:37:43 2020 -0700
> > +++ b/mercurial/patch.py Thu Apr 30 15:10:05 2020 +0200
> > @@ -2558,7 +2558,7 @@ def diff(
> > fctx2 is not None
> > ), b'fctx2 unexpectly None in diff hunks filtering'
> > hunks = hunksfilterfn(fctx2, hunks)
> > - text = b''.join(sum((list(hlines) for hrange, hlines in hunks), []))
> > + text = b''.join((b''.join(hlines) for hrange, hlines in hunks))
If the number of intermediate objects matters, creating a list of bytes might
be slightly faster.
lines = []
for hr, hl in hunks:
lines.extend(hl)
b''.join(lines)
More information about the Mercurial-devel
mailing list