[PATCH] avoid some false positives for addremove -s
Alexis S. L. Carvalho
alexis at cecm.usp.br
Thu Feb 15 08:31:09 UTC 2007
Thus spake Erling Ellingsen:
> # HG changeset patch
> # User Erling Ellingsen <erlingalf at gmail.com>
> # Date 1171497082 -3600
> # Node ID 475fc420289b50577d59a293e7e24586d0c81ec4
> # Parent 5b1f663ef86d68ce11d70de8e5ab61d93341a18c
> avoid some false positives for addremove -s
This looks mostly ok, but gmail has the annoying tendency to mangle
inline patches. Can you send it again as an attachment?
Some minor things:
> diff -r 5b1f663ef86d -r 475fc420289b mercurial/cmdutil.py
> --- a/mercurial/cmdutil.py Tue Feb 06 16:12:22 2007 -0600
> +++ b/mercurial/cmdutil.py Thu Feb 15 00:51:22 2007 +0100
> @@ -146,20 +145,31 @@ def walk(repo, pats=[], opts={}, node=No
> yield src, fn, util.pathto(repo.getcwd(), fn), fn in exact
>
> def findrenames(repo, added=None, removed=None, threshold=0.5):
> + '''find renamed files -- yields (before, after, score) tuples'''
> if added is None or removed is None:
> added, removed = repo.status()[1:3]
> + import bdiff
Put the import at the top of the file, with the other imports
> ctx = repo.changectx()
> for a in added:
> aa = repo.wread(a)
> - bestscore, bestname = None, None
> + bestname, bestscore = None, threshold
> for r in removed:
> rr = ctx.filectx(r).data()
> - delta = mdiff.textdiff(aa, rr)
> - if len(delta) < len(aa):
> - myscore = 1.0 - (float(len(delta)) / len(aa))
> - if bestscore is None or myscore > bestscore:
> - bestscore, bestname = myscore, r
> - if bestname and bestscore >= threshold:
> +
> + # bdiff.blocks() returns blocks of matching lines
> + # count the number of bytes in each
> + equal = 0
> + alines = mdiff.splitnewlines(aa)
> + matches = bdiff.blocks(aa, rr)[:-1]
> + for x1,x2,y1,y2 in matches:
> + assert x2-x1 == y2-y1
This assert shouldn't be needed. Well, asserts obviously should never
trigger, but if we really want that check, there are probably better
places for it.
> + for line in alines[x1:x2]:
> + equal += len(line)
> +
> + myscore = equal*2.0 / (len(aa)+len(rr))
> + if myscore >= bestscore:
> + bestscore, bestname = myscore, r
You used "bestname, bestscore" a few lines above and "bestscore,
bestname" here. Can you use the same order in both?
> + if bestname:
> yield bestname, a, bestscore
>
> def addremove(repo, pats=[], opts={}, wlock=None, dry_run=None,
> diff -r 5b1f663ef86d -r 475fc420289b tests/test-addremove-similar
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/tests/test-addremove-similar Thu Feb 15 00:51:22 2007 +0100
> diff -r 5b1f663ef86d -r 475fc420289b tests/test-addremove-similar-2
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/tests/test-addremove-similar-2 Thu Feb 15 00:51:22 2007 +0100
I'd rather have both tests in the same test-addremove-similar file...
Thanks
Alexis
More information about the Mercurial-devel
mailing list