Why sortdiff?

Christopher Li hg at chrisli.org
Sat May 21 15:44:22 UTC 2005


I am wondering why sortdiff is needed. difflib should work fine
on sorted lists as well. Is it there for performance reasons?
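
As a quick check of that claim, here is a minimal sketch that runs plain
difflib on two sorted lists and rebuilds the new list from the reported
opcodes. The entries and the helper names changed_opcodes and apply_opcodes
are made up for illustration, and the snippet is written for Python 3. It
only shows correctness on a small input and says nothing about speed.

```python
import difflib

def changed_opcodes(a, b):
    """Return the non-'equal' opcodes difflib reports for a -> b."""
    sm = difflib.SequenceMatcher(None, a, b)
    return [op for op in sm.get_opcodes() if op[0] != "equal"]

def apply_opcodes(a, b, ops):
    """Rebuild b from a using only the changed opcodes."""
    out = []
    pos = 0
    for tag, i1, i2, j1, j2 in ops:
        out.extend(a[pos:i1])   # unchanged lines before this change
        out.extend(b[j1:j2])    # replacement lines (empty for a delete)
        pos = i2                # skip the replaced or deleted lines of a
    out.extend(a[pos:])
    return out

# Hypothetical sorted inputs, one entry per line.
old = sorted(["a.txt 1\n", "b.txt 1\n", "d.txt 1\n"])
new = sorted(["a.txt 1\n", "c.txt 1\n", "d.txt 2\n"])

ops = changed_opcodes(old, new)
rebuilt = apply_opcodes(old, new, ops)
```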

If we don't mind getting rid of the sorted diff, then I can use the
following patch. It generates the binary diff from difflib's matching
blocks, merging "replace+insert" and "replace+delete" into one block.
That gives fewer blocks and makes conflicts easier to detect.
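
To illustrate the idea of one block per gap, here is a small sketch that
walks the result of get_matching_blocks() and reports each gap between
matches as a single (start, end, replacement) triple. The helper name
gap_hunks and the sample lines are assumptions for illustration, as is the
guard "am > la or bm > lb", which keeps a gap only when it is non-empty on
at least one side. Written for Python 3.

```python
import difflib

def gap_hunks(a, b):
    """Return one (a_start, a_end, replacement_lines) triple per gap."""
    hunks = []
    la = lb = 0  # end of the previous matching block in a and in b
    blocks = difflib.SequenceMatcher(None, a, b).get_matching_blocks()
    for am, bm, size in blocks:
        # Everything between the previous match and this one is one gap:
        # a[la:am] gets replaced by b[lb:bm].
        if am > la or bm > lb:
            hunks.append((la, am, b[lb:bm]))
        la, lb = am + size, bm + size
    return hunks

# A line that is both changed and extended comes out as a single hunk.
a = ["one\n", "two\n", "three\n"]
b = ["one\n", "2\n", "2.5\n", "three\n"]
hunks = gap_hunks(a, b)
```

Here hunks holds a single entry that replaces a[1:2] with the two new
lines, rather than a replace followed by a separate insert.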

The output is compatible with the current patch code as well.
Performance-wise it seems about the same.
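
A rough way to check the compatibility is a round trip: encode the hunks,
apply them to the old text, and compare with the new text. The sketch below
is a standalone Python 3 version that works on bytes. The record layout
(start offset, end offset, data length as three big-endian 32-bit integers,
followed by the data) is taken from the struct.pack(">lll", ...) call in
the patch. The names encode_hunks and apply_hunks are hypothetical, and
apply_hunks is only my assumption of how such records are consumed, not the
actual mercurial code.

```python
import difflib
import struct

def encode_hunks(a_text, b_text):
    """Encode a_text -> b_text as (start, end, length) + data records."""
    a = a_text.splitlines(True)
    b = b_text.splitlines(True)
    p = [0]  # p[i] is the byte offset of line i in a_text
    for line in a:
        p.append(p[-1] + len(line))
    out = []
    la = lb = 0
    for am, bm, size in difflib.SequenceMatcher(None, a, b).get_matching_blocks():
        if am > la or bm > lb:  # emit only non-empty gaps
            s = b"".join(b[lb:bm])
            out.append(struct.pack(">lll", p[la], p[am], len(s)) + s)
        la, lb = am + size, bm + size
    return b"".join(out)

def apply_hunks(a_text, bin_diff):
    """Apply the records produced by encode_hunks to a_text."""
    out = []
    last = 0  # end of the previously replaced byte range in a_text
    pos = 0   # read position in bin_diff
    while pos < len(bin_diff):
        start, end, length = struct.unpack(">lll", bin_diff[pos:pos + 12])
        pos += 12
        out.append(a_text[last:start])           # untouched bytes
        out.append(bin_diff[pos:pos + length])   # replacement bytes
        pos += length
        last = end
    out.append(a_text[last:])
    return b"".join(out)

old = b"one\ntwo\nthree\n"
new = b"zero\none\n2\n2.5\nthree\n"
roundtrip = apply_hunks(old, encode_hunks(old, new))
```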

Chris

Index: hg/mercurial/mdiff.py
===================================================================
--- hg.orig/mercurial/mdiff.py	2005-05-21 09:08:32.000000000 -0400
+++ hg/mercurial/mdiff.py	2005-05-21 10:34:01.000000000 -0400
@@ -40,16 +40,17 @@
     p = [0]
     for i in a: p.append(p[-1] + len(i))
 
-    if sorted:
-        d = sortdiff(a, b)
-    else:
-        d = difflib.SequenceMatcher(None, a, b).get_opcodes()
-
-    for o, m, n, s, t in d:
-        if o == 'equal': continue
-        s = "".join(b[s:t])
-        bin.append(struct.pack(">lll", p[m], p[n], len(s)) + s)
-
+    if a == b:
+        return ""
+    d = difflib.SequenceMatcher(None, a, b).get_matching_blocks()
+    la = 0
+    lb = 0
+    for am, bm, size in d:
+        s = "".join(b[lb:bm])
+        if am > la or bm > lb:  # skip empty gaps, keep inserts at offset 0
+            bin.append(struct.pack(">lll", p[la], p[am], len(s)) + s)
+        la = am + size
+        lb = bm + size
     return "".join(bin)
 
 def patchtext(bin):


