hg convert from SVN repo getting stuck
Matt Mackall
mpm at selenic.com
Tue Apr 29 19:19:58 UTC 2014
On Tue, 2014-04-29 at 08:26 +0200, Malte Helmert wrote:
> On 26.04.2014 19:36, Malte Helmert wrote:
> > Dear group,
> >
> > I'm having some problems with hg convert:
> [...]
> > Any suggestions?
>
> Hi again,
>
> while the above email was stuck in moderation (or gmane?), I've done
> some tests to reproduce the same behaviour outside of hg convert. This
> time, I used the most recent hg from the stable branch:
>
> $ ./hg version
> Mercurial Distributed SCM (version 3.0-rc+28-d36440d84328)
> [...]
>
> I produced smaller versions of the file that was modified in the
> changeset that caused hg convert to get stuck. For this I truncated the
> file at $SIZE bytes for various values of SIZE. Then I created a
> pristine repository and committed first the truncated before-changeset
> version of the file and then the truncated after-changeset version:
>
> $ hg init testrepo
> $ cd testrepo
> $ cp ../before file.txt
> $ hg add file.txt
> $ hg commit -m "first commit"
> $ cp ../after file.txt
> $ hg diff file.txt | wc -l
> $ hg commit -m "second commit"
>
> For various values of SIZE, here is how long the "hg diff" and second
> commit took, along with the size of the diff:
>
> SIZE=1M: diff 0.29s (22895 lines), commit 0.20s
> SIZE=2M: diff 0.81s (47439 lines), commit 0.61s
> SIZE=3M: diff 1.76s (72830 lines), commit 1.47s
> SIZE=4M: diff 7.43s (97965 lines), commit 6.99s
> SIZE=5M: diff 4.43s (122899 lines), commit 3.91s
> SIZE=6M: diff 88.36s (147787 lines), commit 90.56s
> SIZE=7M: diff 10.51s (172759 lines), commit 9.74s
> SIZE=8M: diff 202.15s (198395 lines), commit 202.04s
> SIZE=9M: diff 25.33s (223536 lines), commit 24.55s
> SIZE=10M: diff 162.89s (248567 lines), commit 159.23s
> SIZE=12M: diff 132.42s (299271 lines), commit 130.41s
> SIZE=14M: diff 53.40s (350046 lines), commit 51.73s
> SIZE=16M: diff 2822.02s (400085 lines), commit 2802.49s
> SIZE=18M: diff 16687.10s (450685 lines), commit 16856.23s
> SIZE=20M: diff 20673.23s (501958 lines), commit 20639.25s
> SIZE=22M: diff 25507.32s (552989 lines), commit 25758.22s
> SIZE=24M: diff 30875.81s (603709 lines), commit 31276.82s
> ...
> (Larger sizes have not terminated yet.)
>
> So there is some nasty scaling here. As a reference point, for the full
> file (35 MB), "diff -u" (GNU diff) takes 0.9 seconds.

Very interesting.

> So I seem to have hit a bad case for Mercurial's diff algorithm. Would
> there be interest in uncovering the reason for this, and possibly
> modifying the diff algorithm to address this?

If you can construct a public test case, yes. Please create an issue on
the bug tracker.
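
For anyone who wants to put such a test case together, the quoted steps
could be scripted roughly as follows. This is only a sketch under stated
assumptions: the function names truncate_copy and run_case and the work
directory layout are made up for illustration, "head -c" is assumed to be
available (it is in GNU and BSD userlands), and a Mercurial username is
assumed to be configured so that "hg commit" succeeds. Timing uses
whole-second wall-clock differences, which is coarser than the figures in
the quoted table.

```shell
#!/bin/sh
# Sketch of the quoted experiment for a single truncation size.
# truncate_copy and run_case are hypothetical helper names.

# Copy the first $2 bytes of file $1 into file $3.
truncate_copy() {
    head -c "$2" "$1" > "$3"
}

# Create a pristine repository from two truncated versions of a file
# and report how long "hg diff" and the second commit take.
# Arguments: before-file after-file size-in-bytes work-directory
run_case() {
    before=$1; after=$2; size=$3; work=$4
    mkdir -p "$work" || return 1
    truncate_copy "$before" "$size" "$work/before.trunc"
    truncate_copy "$after" "$size" "$work/after.trunc"
    hg init "$work/testrepo" || return 1
    (
        cd "$work/testrepo" || exit 1
        cp ../before.trunc file.txt
        hg add file.txt
        hg commit -m "first commit"
        cp ../after.trunc file.txt

        # Time the diff and print the number of diff lines.
        start=$(date +%s)
        hg diff file.txt | wc -l
        end=$(date +%s)
        echo "diff: $((end - start))s"

        # Time the second commit.
        start=$(date +%s)
        hg commit -m "second commit"
        end=$(date +%s)
        echo "commit: $((end - start))s"
    )
}
```

A call such as "run_case ../before ../after 1048576 work-1M" would then
correspond to one row of the table, assuming "1M" there means 2^20 bytes;
the original message does not say which unit was used.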
--
Mathematics is the supreme nostalgia of our time.