Troubleshooting SHA1 Failures with Mercurial Repositories
Paul Boddie
paul at boddie.org.uk
Sat Jun 13 13:24:00 UTC 2020
Hello,
I have very recently retired a rather old computer that has been my main
development machine for a very long time, but in the last few months it has
exhibited some unreliable behaviour in various respects. There is probably an
interesting detective story here somewhere, and I welcome insights into the
underlying system issues, but my main motivation for sending this message is
to assess the impact on my Mercurial repositories.
(To skip the background, jump ahead three paragraphs!)
The cause of this unreliable behaviour became more apparent when obtaining DVD
images to use with the new computer that will now become my development
machine. Upon running md5sum, sha1sum, sha256sum or sha512sum on the
downloaded DVD image files, it was almost impossible to generate the correct
digests. Moreover, the digests were typically different on each invocation of
the chosen program on the same file, producing something new each time. And
yet, two separately downloaded copies of the same file would compare (using
the cmp program) and be shown to be identical!
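(The symptom above can be checked with something like the following minimal Python sketch, which hashes the same file several times and counts the distinct digests seen. The function names and the default of five runs are illustrative assumptions, not the tools actually used. On a healthy machine exactly one digest should appear.)

```python
import hashlib
from collections import Counter


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so that large DVD images need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def repeated_digests(path, runs=5, algorithm="sha1"):
    """Hash the same file several times and count each distinct digest."""
    return Counter(file_digest(path, algorithm) for _ in range(runs))
```

More than one key in the returned Counter would reproduce the behaviour described: the file content is stable (as cmp indicates) but the computation over it is not.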
Diagnosis of the situation involved writing fairly simple programs to generate
large files with predictable but varying content and then reading them back,
which seemed to yield the expected content each time. I also investigated
other message digest tools and found that the Java-based jacksum tool did
function more reliably with MD5 digests but not all of the time. OpenSSL-based
tools did not fare any better than those which presumably use the C library
digest functions. I ran memtest86+ for some time without any indication of
memory failure, and there was no obvious indication of disk failure, although
I shall aim to run more extensive smartctl tests to be sure.
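(As an illustration of the kind of test program described above, here is a minimal Python sketch that writes a file of predictable but varying blocks and then reports which blocks read back incorrectly. The block layout, block size and function names are assumptions made for this example; the original programs are not shown in this message.)

```python
import struct

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def pattern_block(index):
    """Build a block that repeats its own 8-byte big-endian index."""
    return struct.pack(">Q", index) * (BLOCK_SIZE // 8)


def write_pattern(path, blocks):
    """Write `blocks` consecutive pattern blocks to the file at `path`."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(i))


def find_bad_blocks(path, blocks):
    """Return the indices of blocks whose content differs from the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(BLOCK_SIZE) != pattern_block(i):
                bad.append(i)
    return bad
```

Note that a read shortly after a write may be served from the operating system's cache rather than from the disk, so an empty result from such a test does not rule out a storage fault by itself.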
Generally, I have not experienced obvious problems with my data, but I have
experienced frustration with distribution updates (Debian's apt complaining
about hash sum mismatches) and it has been largely impossible to clone large
Git repositories ("index pack failed"), although I assumed that this was just
Git making increasing demands on system capabilities (and being typically
unhelpful). I doubt that anyone else runs hardware this old - Pentium 4, 3.0
GHz, "Prescott" generation - and support for 32-bit x86 is gradually
disappearing, so I don't know what level of experience other people are likely
to have with these issues (other than remarks about the system being old and
needing replacement).
(Here comes the bit specifically related to Mercurial.)
Anyway, I find myself with Mercurial repositories that I have been updating
during periods of unreliability. I have had practically no problems updating
or accessing these repositories (perhaps one, and not recently),
but I wondered what kind of effects this unreliability might have had on
repository integrity. The Mercurial Wiki and other documentation do not
readily explain the implications of faulty digests, although I found the
following interesting remarks:
"The repository owner may continue committing to the heads of the repository,
but attempts to view the repository at any changeset containing the sensitive
file data will fail due to the hash mismatch (examples: hg update, hg diff, hg
annotate). "hg verify" will fail due to the hash mismatch as well."
https://www.mercurial-scm.org/wiki/CensorPlan
Now, having copied repositories to my new machine, I have successfully
verified the repositories using hg verify. However, using hg convert reveals
differing nodeids between the original and converted repositories. I have
tried hg convert with both --branchsort and --sourcesort options. Then, I have
generated readily comparable logs as follows:
    hg log --template '{node}\n' > logfile
Running diff on the logs for the original and converted repositories reveals
considerable differences in nodeids for some repositories, even ones which
haven't been touched in years, but no differences for others. It appears that
--sourcesort replicates history more accurately (as suggested by the
documentation). For validation, converting the converted repositories again
(using --sourcesort) produces identical histories, as one might expect.
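(The log comparison can also be done with a short Python sketch instead of diff. It assumes that each logfile holds one nodeid per line, newest changeset first, which is the default hg log order, and that both repositories list their changesets in the same order. The function names are hypothetical.)

```python
def read_nodes(path):
    """Read one nodeid per line from a logfile, ignoring blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def first_divergence(original, converted):
    """Return the position, counting from the oldest changeset, at which
    the two histories first differ, or None if they are identical."""
    # The logs list the newest changeset first, so reverse them to walk
    # forwards from the oldest changeset.
    a = list(reversed(original))
    b = list(reversed(converted))
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

Under those assumptions, the returned position points at the oldest changeset whose nodeid differs, which is a more useful starting point for investigation than the full diff output.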
I suppose I am left wondering about a few things. Are such simple comparisons
of repository histories useful in assessing the prevalence of faulty nodeids?
How may faulty nodeids affect the integrity of repositories (considering the
quote about censored changesets above)? Are there any compelling practical
arguments for converting these faulty repositories if they otherwise function
apparently normally? (I realise that combining faulty and converted
repositories will result in divergence in the graph at inappropriate places.)
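(For context on that divergence: as far as I understand it, Mercurial derives a nodeid as the SHA-1 of the two parent nodeids, sorted, followed by the revision text. The Python sketch below illustrates that scheme under this assumption and is not a copy of Mercurial's own code; node_id and NULL_ID are names chosen for the example.)

```python
import hashlib

NULL_ID = b"\0" * 20  # stands in for a missing parent


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """Hash the sorted parent ids followed by the revision text."""
    a, b = sorted((p1, p2))
    return hashlib.sha1(a + b + text).digest()


# One wrongly computed nodeid changes the nodeid of every descendant,
# because each child hashes its parent's id along with its own text.
root = node_id(b"first revision")
bad_root = bytes([root[0] ^ 0x01]) + root[1:]  # simulate a faulty digest
child = node_id(b"second revision", p1=root)
child_of_bad = node_id(b"second revision", p1=bad_root)
assert child != child_of_bad
```

If this is right, it would explain why a single faulty nodeid early in a history makes every later nodeid differ after conversion, even where the content of the later changesets is intact.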
Sorry for the long message, but any insights would be much appreciated!
Paul