How to get rid of big binary files buried in my repository's history
John Mulligan
phlogistonjohn at asynchrono.us
Wed Feb 25 17:39:19 UTC 2009
On Wed, Feb 25, 2009 at 03:39:03PM +0100, Martin Geisler wrote:
> Matt Mackall <mpm at selenic.com> writes:
>
> > On Wed, 2009-02-25 at 09:44 +0100, Martin Geisler wrote:
>
> >> * convert the repository hg -> hg and use a filemap.
> >>
> >> I don't know why, but this will change revision hashes on all
> >> changesets, including those from before the file was added.
> >
> > Given you're part of crew now, you ought to know why:
> >
> > The changeset id is a hash of every byte in the given changeset AND
> > its parents. You can't change any byte anywhere in history without
> > breaking it.
>
> I do know how a hash chain works, I study cryptography for a living :-)
>
> But that doesn't really answer the question: the convert extension ought
> to be able to regenerate the first (unaffected) changesets without
> changing their hash values.
>
> It is a session like this that surprises me:
>
> % hg init repo1
> % touch repo1/a
> % hg add repo1/a
> % hg commit -d '0 0' -u 'u' -m 'm' repo1/a
> % hg log repo1
> changeset: 0:c8d4479642ea
> tag: tip
> user: u
> date: Thu Jan 01 00:00:00 1970 +0000
> summary: m
>
> % hg convert repo1 repo2
> initializing destination repo2 repository
> scanning source...
> sorting...
> converting...
> 0 m
> % hg log repo2
> changeset: 0:46c292dd1443
> tag: tip
> user: u
> date: Thu Jan 01 00:00:00 1970 +0000
> summary: m
>
> My guess is that the convert extension adds some extra metadata to
> the changeset when converting it.
>
> --
> Martin Geisler
>
There seems to be an under-documented option, 'convert.hg.saverev', that
can be used to preserve hashes when converting. The help for convert
says that it allows "target to preserve source revision ID", and it is
enabled by default. Turning it off keeps the hashes intact, but I'm sure
the side effect is that once the conversion hits a change such as
removing a file, the hashes diverge and you lose the ability to figure
out which changeset in the original repo maps to which one in the new
repo.
% hg convert --config convert.hg.saverev=false foo foo2
initializing destination foo2 repository
scanning source...
sorting...
converting...
2 foo
1 bar
0 baz
% hg tip -R foo
changeset: 2:51218615ede7
tag: tip
user: test
date: Wed Feb 25 08:25:36 2009 -0500
summary: baz
% hg tip -R foo2
changeset: 2:51218615ede7
tag: tip
user: test
date: Wed Feb 25 08:25:36 2009 -0500
summary: baz
Maybe the help should state that disabling saverev preserves hashes up
to the first changeset that the conversion actually alters? Or is it
more tricky than that?
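
To illustrate why one extra piece of metadata is enough to change every
hash, here is a rough Python sketch. It assumes the node id is a SHA-1
over the two sorted parent ids followed by the entry text, which is my
understanding of how Mercurial computes it. The changeset_text() layout
and the way the extra field is appended are simplified stand-ins of my
own, not the real changelog format, and the all-zero manifest is just a
placeholder.

```python
import hashlib

NULL_ID = b"\0" * 20  # parent id used when a changeset has no parent


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """SHA-1 over the two parent ids (sorted) followed by the entry text."""
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()


def changeset_text(manifest_hex, user, date, desc, extra=""):
    """Hypothetical, simplified changelog entry (not the real layout)."""
    date_line = f"{date} {extra}" if extra else date
    return "\n".join([manifest_hex, user, date_line, "", desc]).encode("utf-8")


manifest = "0" * 40  # placeholder manifest id, for illustration only

# Same changeset, without and with a recorded source revision.
plain = node_id(changeset_text(manifest, "u", "0 0", "m"))
tagged = node_id(
    changeset_text(manifest, "u", "0 0", "m",
                   extra="convert_revision:c8d4479642ea")
)

# A child with identical content, hashed on top of each parent.
child_text = changeset_text(manifest, "u", "0 0", "child")
child_of_plain = node_id(child_text, plain)
child_of_tagged = node_id(child_text, tagged)

print(plain != tagged)                    # the extra field changes the id
print(child_of_plain != child_of_tagged)  # and so every descendant changes
```

If that model is right, it explains both sessions: with saverev on, the
very first changeset already differs, so nothing downstream can match;
with it off, the ids stay identical until the first changeset whose
content really changes.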
[Reply to list this time...]