D10715: revlog: make transaction handle revlog splits better (issue6423)
valentin.gatienbaron (Valentin Gatien-Baron)
phabricator at mercurial-scm.org
Sun May 16 19:32:44 UTC 2021
valentin.gatienbaron created this revision.
Herald added a reviewer: indygreg.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.
REVISION SUMMARY
Currently, any transaction that includes a revlog split and does not
successfully commit prevents any further write to the repository
(short of performing manual surgery), because it leaves behind a
transaction that hg doesn't know how to rollback.
This teaches hg how to handle such transactions. Specifically, any
interruption outside of a tight window (rename + two writes) results
in transactions that can be rolled back correctly. An interruption in
that window inside that window leaves a broken repository, similarly
to before this change.
One awkward thing is that if an old hg version leaves behind one of
these transactions it can't rollback, an hg with this change would
manage to roll it back, but wrongly (it would end up with non inline
revlog and .d file of size 0). That seems acceptable to me, because
this is a strange scenario (and either way, the solution is to
reclone).
Another awkward thing is that the fncache fails to be updated. This
seems acceptable, because it's not used much, and I'll take the
occcasional call to `hg debugrebuildfncache` over a broken repository.
REPOSITORY
rHG Mercurial
BRANCH
default
REVISION DETAIL
https://phab.mercurial-scm.org/D10715
AFFECTED FILES
mercurial/revlog.py
mercurial/transaction.py
tests/test-transaction-rollback-on-revlog-split.t
CHANGE DETAILS
diff --git a/tests/test-transaction-rollback-on-revlog-split.t b/tests/test-transaction-rollback-on-revlog-split.t
--- a/tests/test-transaction-rollback-on-revlog-split.t
+++ b/tests/test-transaction-rollback-on-revlog-split.t
@@ -23,19 +23,18 @@
$ hg commit -m_
transaction abort!
- rollback failed - please run hg recover
- (failure reason: attempted to truncate data/file.i to 1065 bytes, but it was already 128 bytes)
+ rollback completed
abort: pretxncommit hook exited with status 1
[40]
- $ cat .hg/store/journal | tr -s '\000' ' '
- data/file.i 1065
- data/file.d 0
- data/file.i 64
- 00manifest.i 111
- 00changelog.i 122
+Verify that rollback worked properly (the fncache being stale is a
+bug):
- $ hg recover
- rolling back interrupted transaction
- abort: attempted to truncate data/file.i to 1065 bytes, but it was already 128 bytes
- [255]
+ $ hg verify -q
+ warning: revlog 'data/file.d' not in fncache!
+ 1 warnings encountered!
+ hint: run "hg debugrebuildfncache" to recover from corrupt fncache
+ $ hg debugrebuildfncache
+ adding data/file.d
+ 1 items added, 0 removed from fncache
+ $ hg verify -q
diff --git a/mercurial/transaction.py b/mercurial/transaction.py
--- a/mercurial/transaction.py
+++ b/mercurial/transaction.py
@@ -46,6 +46,26 @@
return _active
+def _entries_to_playback(entries):
+ # On revlog split, we end up with two journal entries for the .i
+ # and .d. In these cases we want the second entries to override
+ # the first ones. This filters out the first entries.
+ usable_entries = []
+ seen_f = {}
+ for f, o in reversed(entries):
+ if f in seen_f:
+ if seen_f[f] != 1:
+ raise error.Abort(
+ _(b"unexpectedly many rollback journal entries for %s") % f
+ )
+ seen_f[f] = 2
+ else:
+ usable_entries.append((f, o))
+ seen_f[f] = 1
+ usable_entries.reverse()
+ return usable_entries
+
+
def _playback(
journal,
report,
@@ -56,7 +76,7 @@
unlink=True,
checkambigfiles=None,
):
- for f, o in entries:
+ for f, o in _entries_to_playback(entries):
if o or not unlink:
checkambig = checkambigfiles and (f, b'') in checkambigfiles
try:
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1973,7 +1973,7 @@
_(b"%s not found in the transaction") % self._indexfile
)
trindex = 0
- tr.add(self._datafile, 0)
+ trdataoffset = 0
if fp:
fp.flush()
@@ -1983,10 +1983,22 @@
self._writinghandles = None
with self._indexfp(b'r') as ifh, self._datafp(b'w') as dfh:
+ if dfh.tell() != 0:
+ # Should never happen, but may reduce confusion if something
+ # went wrong in a rollback or manual fix.
+ raise error.RevlogError(
+ _(b"%s is an inline revlog but %s is nonempty")
+ % (self._indexfile, self._datafile)
+ )
+ # write this after the dfh.tell check, otherwise the rollback caused
+ # by a failure of the check would truncate whatever is the datafile
+ tr.add(self._datafile, 0)
for r in self:
dfh.write(self._getsegmentforrevs(r, r, df=ifh)[1])
if troffset <= self.start(r) + r * self.index.entry_size:
+ # trindex is the first revision to be excluded on rollback
trindex = r
+ trdataoffset = self.start(r)
with self._indexfp(b'w') as fp:
self._format_flags &= ~FLAG_INLINE_DATA
@@ -1999,9 +2011,24 @@
e = header + e
fp.write(e)
- # the temp file replace the real index when we exit the context
- # manager
-
+ # The temp file replace the real index when we exit the context
+ # manager. When that happens, transaction rollback is temporarily
+ # broken, because the rollback journal entry for the old .i is
+ # larger than the size of the new .i (because they include the same
+ # revisions, but offset for the old .i accounts for the inline
+ # data. Except when the data is of size 0, but that should be both
+ # impossible and harmless). This gets corrected very immediately by
+ # the tr.replace below. Note that we *must* write the offset for the
+ # (new) datafile before the offset for the new .i. In this order, if
+ # we get interrupted after the first tr.replace, the rollback
+ # fails. In the other order, if we got interrupted after the
+ # tr.replace for the index, rollback would succeed but leave the
+ # repository broken (non-inline index + datafile of size 0). One
+ # problem we still have is that since this change is done outside
+ # the transaction, the fncache should get updated even on
+ # transaction rollback, but doesn't.
+
+ tr.replace(self._datafile, trdataoffset)
tr.replace(self._indexfile, trindex * self.index.entry_size)
nodemaputil.setup_persistent_nodemap(tr, self)
self._chunkclear()
To: valentin.gatienbaron, indygreg, #hg-reviewers
Cc: mercurial-patches, mercurial-devel
More information about the Mercurial-devel
mailing list