SHA1 replacement steps

Joerg Sonnenberger joerg at bec.de
Mon Feb 15 14:18:11 UTC 2021


Hello all,
To help the review process along, here are the rough steps I see in
preparation for supporting 256-bit hashes:

(1) Move the current 160-bit constants from mercurial.node into a
subclass. Instead of a global constant, derive the correct constant from
the repo/revlog/... instance and pass it down as necessary. The API
change itself is in D9750. The expectation for this step is that a
repository has one hash size and one set of magic values, but it doesn't
change anything regarding the hash function itself. A follow-up change
is necessary to replace the global constants (approximately D9465 minus
D9750).

(2) Adjust various on-disk formats to switch between the current 160-bit
and 256-bit encoding based on the node constants in use. This would be a
non-functional change for existing repositories.

(3) Introduce the tagged 256-bit(*) hash function. My plan here is to use
Blake2b configured for 248-bit output and a suffix of b'\x01'. It is a
bit wasteful to reserve 8 bits for the tag, but it simplifies the code.
The biggest downside is that full Blake2b support is not available in
Python 2.

The tag would allow different hash functions to co-exist and existing
SHA1 hashes to be embedded by zero padding.

(4) Adjust hash verification logic to derive the hash function from the
tag of a node instead of hard-coding it.
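
As a rough illustration of steps (3) and (4), the sketch below uses
hashlib.blake2b with digest_size=31 (248 bits) and appends a one-byte
tag. The names TAG_SHA1, TAG_BLAKE2B, hash_blake2b_tagged, embed_sha1
and verify_node are hypothetical, and treating the trailing zero byte of
a padded SHA1 node as its tag is an assumption, not something the plan
spells out. Real node hashing also mixes in the parent nodes, which is
left out here to keep the example short.

```python
import hashlib

NODE_LEN = 32          # 256-bit nodes
TAG_SHA1 = b"\x00"     # assumption: the zero padding doubles as the SHA1 tag
TAG_BLAKE2B = b"\x01"  # suffix proposed in step (3)


def hash_blake2b_tagged(data):
    """Return a 32-byte node: 248-bit Blake2b digest plus a 1-byte tag."""
    return hashlib.blake2b(data, digest_size=31).digest() + TAG_BLAKE2B


def embed_sha1(data):
    """Return a 32-byte node holding a SHA1 digest padded with zero bytes."""
    return hashlib.sha1(data).digest().ljust(NODE_LEN, b"\x00")


def verify_node(node, data):
    """Pick the hash function from the tag byte and check data against node."""
    if len(node) != NODE_LEN:
        raise ValueError("expected a %d-byte node" % NODE_LEN)
    tag = node[-1:]
    if tag == TAG_BLAKE2B:
        expected = hash_blake2b_tagged(data)
    elif tag == TAG_SHA1:
        expected = embed_sha1(data)
    else:
        raise ValueError("unknown hash tag %r" % tag)
    return node == expected
```

With this layout a verifier only needs the last byte of a node to decide
which function to run, so old and new nodes can sit in the same
repository.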

At the end of step 4, most repositories can be converted in a mostly
transparent way. Some additional changes might be necessary for allowing
"short" node ids for things like .hgtags, but overall, existing hashes
should just continue to work as before.
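
One way the "short" node ids could keep working is sketched below,
assuming that a 40-digit hex id read from a file such as .hgtags is
treated as a SHA1 node and zero-padded to the 256-bit form. The function
name normalize_hex_node is hypothetical and not part of Mercurial.

```python
import binascii

SHA1_HEX_LEN = 40   # 160-bit node written as hex
FULL_HEX_LEN = 64   # 256-bit node written as hex
FULL_NODE_LEN = 32  # 256-bit node in binary form


def normalize_hex_node(hexnode):
    """Convert a 40- or 64-digit hex node id into a 32-byte binary node."""
    if len(hexnode) == SHA1_HEX_LEN:
        # Legacy SHA1 id: embed it by padding with zero bytes on the right.
        return binascii.unhexlify(hexnode).ljust(FULL_NODE_LEN, b"\x00")
    if len(hexnode) == FULL_HEX_LEN:
        return binascii.unhexlify(hexnode)
    raise ValueError("unexpected node id length: %d" % len(hexnode))
```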

Joerg

