[PATCH] Add script to rewrite manifest to workaround lack of parent deltas
Benoit Boissinot
benoit.boissinot at ens-lyon.org
Thu Aug 20 23:04:03 UTC 2009
On Thu, Aug 20, 2009 at 05:20:24PM -0400, Greg Ward wrote:
> # HG changeset patch
> # User Greg Ward <greg-hg at gerg.ca>
> # Date 1233047576 0
> # Node ID 7e0bbea3935b1044d3c5acfecae8005941dfc8ec
> # Parent 2484868cffde3893e3fafb8e515d396346b87e17
> Add script to rewrite manifest to workaround lack of parent deltas.
Thanks for cleaning it up. Some comments below.
> +
> +def good_sort(rl):
Maybe call it toposort()?
> +    write = sys.stdout.write
> +
> +    children = {}
> +    root = []
> +    # build children and roots
> +    write('reading %d revs ' % len(rl))
> +    #for i in revs:
> +    i = 0
> +    while i < len(rl):
You can iterate directly over the revs:
    for i in rl:
> +        children[i] = []
> +        parents = [p for p in rl.parentrevs(i) if p != -1]
> +        for p in parents:
> +            assert p in children
> +        if len(parents) == 0:
> +            root.append(i)
> +        else:
> +            for p in parents:
> +                children[p].append(i)
The following is simpler:
        children[i] = []
        parents = [p for p in rl.parentrevs(i) if p != -1]
        for p in parents:
            assert p in children
            children[p].append(i)
        if len(parents) == 0:
            root.append(i)
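To make that concrete, here is a minimal runnable sketch of the simplified loop. FakeRevlog and build_graph are names made up for the example; the fake only provides iteration and parentrevs(), which is all this loop needs, and is not the real revlog class.

```python
class FakeRevlog(object):
    """Made-up stand-in for revlog.revlog: iteration plus parentrevs() only."""

    def __init__(self, parentrevs):
        # parentrevs[i] is the (p1, p2) pair of rev i; -1 means "no parent"
        self._parentrevs = parentrevs

    def __iter__(self):
        return iter(range(len(self._parentrevs)))

    def parentrevs(self, rev):
        return self._parentrevs[rev]


def build_graph(rl):
    """Return (children, root) using the single-loop form suggested above."""
    children = {}
    root = []
    for i in rl:
        children[i] = []
        parents = [p for p in rl.parentrevs(i) if p != -1]
        for p in parents:
            # a parent always has a lower rev number, so it was seen already
            assert p in children
            children[p].append(i)
        if len(parents) == 0:
            root.append(i)
    return children, root


# rev 0 is a root, revs 1 and 2 branch off it, rev 3 merges them
rl = FakeRevlog([(-1, -1), (0, -1), (0, -1), (1, 2)])
children, root = build_graph(rl)
```

With no parents the inner loop simply does nothing, which is why the else branch is not needed.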
> +    # XXX this is a reimplementation of the 'branchsort' topo sort
> +    # algorithm in hgext.convert.convcmd... would be nice not to duplicate
> +    # the algorithm
> +    write('sorting ...')
> +    visit = root
> +    ret = []
> +    while visit:
> +        i = visit.pop(0)
Maybe it's cleaner to pop from the end:
        i = visit.pop()
> +        ret.append(i)
> +        if i not in children:
> +            # this only happens if some node's p1 == p2, which can happen in the
> +            # manifest in certain circumstances
> +            break
break or continue? (break ends the whole sort and drops whatever is still
queued in visit.)
> +        next = []
> +        for c in children.pop(i):
> +            parents_with_child = [p for p in rl.parentrevs(c) if p != -1 and p in children]
parents_unseen is maybe a better name: we don't care whether they have
children, only whether we have already visited them.
> +            if len(parents_with_child) == 0:
> +                next.append(c)
> +        visit = next + visit
If you pop from the end, then you can do:
        visit += next
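Putting the comments on the sort loop together, a rough sketch could look like the following. It is not the patch's code: sort_revs is a made-up name, it takes plain dicts instead of a revlog so that it runs standalone, and it assumes that continue is what is wanted for the p1 == p2 case. The membership check is also moved before ret.append(i), which is my own change so that a rev queued twice is not emitted twice.

```python
def sort_revs(parentrevs, children, root):
    """Return revs in an order where every rev comes after its parents.

    parentrevs: rev -> (p1, p2), with -1 meaning "no parent"
    children:   rev -> list of child revs (consumed as revs are visited)
    root:       revs without parents
    """
    visit = list(root)
    ret = []
    while visit:
        i = visit.pop()              # pop from the end
        if i not in children:
            # queued twice because some rev has p1 == p2; already emitted
            continue
        ret.append(i)
        next = []
        for c in children.pop(i):
            # parents still present in children have not been visited yet
            parents_unseen = [p for p in parentrevs[c]
                              if p != -1 and p in children]
            if len(parents_unseen) == 0:
                next.append(c)
        visit += next                # instead of: visit = next + visit
    return ret


# revs 1 and 2 branch off 0, rev 3 merges them, rev 4 has p1 == p2 == 3
parentrevs = {0: (-1, -1), 1: (0, -1), 2: (0, -1), 3: (1, 2), 4: (3, 3)}
children = {0: [1, 2], 1: [3], 2: [3], 3: [4, 4], 4: []}
order = sort_revs(parentrevs, children, [0])
```

Popping from the end visits the last ready child first, so the order differs from the pop(0) version, but every rev still comes after both of its parents.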
> +def main():
> +
> +    # unbuffer stdout for nice progress output
> +    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
> +    write = sys.stdout.write
> +
> +    # Open the local repository
> +    ui = ui_.ui()
> +    repo = hg.repository(ui)
> +
> +    indexfn = repo.join('store/00manifest.i')
> +    datafn = indexfn[:-2] + '.d'
> +    if not os.path.exists(datafn):
> +        sys.exit('error: %s does not exist: manifest not big enough '
> +                 'to be worth shrinking' % datafn)
> +
> +    (tmpfd, tmpindexfn) = tempfile.mkstemp(
> +        dir=repo.join('store'), prefix='00manifest.', suffix='.i')
I find it a bit cleaner to break the line after at least one argument.
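That is, something along these lines. This is only a formatting sketch: storedir is a made-up stand-in for repo.join('store') and points at a throwaway directory so the snippet runs outside a repository.

```python
import os
import tempfile

# Stand-in for repo.join('store'): a throwaway directory for the example.
storedir = tempfile.mkdtemp()

# The first argument stays on the opening line, the rest line up under it.
(tmpfd, tmpindexfn) = tempfile.mkstemp(dir=storedir,
                                       prefix='00manifest.',
                                       suffix='.i')
os.close(tmpfd)
tmpdatafn = tmpindexfn[:-2] + '.d'
```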
> +    tmpdatafn = tmpindexfn[:-2] + '.d'
> +    os.close(tmpfd)
> +
> +    r1 = revlog.revlog(util.opener(os.getcwd(), audit=False), indexfn)
> +    r2 = revlog.revlog(util.opener(os.getcwd(), audit=False), tmpindexfn)
> +
> +    # Don't use repo.transaction(), because then things get hairy with paths:
> +    # some need to be relative to .hg, and some need to be absolute. Doing it
> +    # this way keeps things simple: everything is an absolute path.
> +    lock = repo.lock(wait=False)
> +    tr = transaction.transaction(
> +        sys.stderr.write, open, repo.join('store/journal'))
Ditto here.
> +
> +    try:
> +        order = good_sort(r1)
> +        write_revs(r1, r2, order, tr)
> +        report_shrinkage(datafn, tmpdatafn)
> +        tr.close()
> +    except:
> +        # abort transaction first, so we truncate the files before deleting them
> +        tr.abort()
> +        if os.path.exists(tmpindexfn):
> +            os.unlink(tmpindexfn)
> +        if os.path.exists(tmpdatafn):
> +            os.unlink(tmpdatafn)
> +        raise
> +    finally:
> +        lock.release()
Is the non-nested try/except/finally possible with Python 2.4?
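As far as I know, the combined form only arrived with Python 2.5 (PEP 341), so for 2.4 the except part would have to be nested inside a try/finally. A minimal sketch of that shape, with made-up names (run_with_cleanup, work, release) standing in for the real transaction and lock calls:

```python
import os
import tempfile


def run_with_cleanup(work, tmpfiles, release):
    """Nested try/except inside try/finally, the form Python 2.4 accepts."""
    try:
        try:
            work()
        except:
            # bare except as in the patch: clean up, then re-raise unchanged
            for fn in tmpfiles:
                if os.path.exists(fn):
                    os.unlink(fn)
            raise
    finally:
        # runs on success and on failure, like lock.release() in the patch
        release()


# Small demonstration: the work fails, the temp file goes away, and the
# release callback still runs.
events = []
fd, leftover = tempfile.mkstemp()
os.close(fd)


def failing_work():
    raise ValueError('simulated failure')


try:
    run_with_cleanup(failing_work, [leftover], lambda: events.append('released'))
    raised = False
except ValueError:
    raised = True
```

In main() the inner except block would also call tr.abort() before unlinking the files, as the patch already does.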
regards,
Benoit
--
:wq