What happens if we have a hash collision
Isaac Jurado
diptongo at gmail.com
Sat May 8 00:21:00 UTC 2010
Replying to Jesus Cea:
>
> With regular commands (like -r parameter, or "hg log", etc), Mercurial
> only shows 48 bits of the hash.
>
> According to birthday paradox, having a few thousands of changesets
> will have a pretty high (statistically) probability of collision, if
> we only use 48 bits from the hash:
> <http://en.wikipedia.org/wiki/Birthday_problem>.
>
> I know internally mercurial uses 160 bits (for instance, in tags), but
> what it could happen if I do a "hg log" or a "hg pull -r" with a
> truncated hash with a collision?.
>
> Does Mercurial recognize the fact and force you to use the 160 bits in
> that case?.
Quick answer: yes. If you do programming for a living (or study), you
may want to keep reading.
I know from my own experience, as a lazy ass, that asking is much easier
and more comfortable than researching. I think I have almost learnt the
lesson by now, so I would like to nudge you towards the same practice.
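As an aside, the birthday-paradox estimate in the quoted message is easy
to check by hand. The sketch below uses the usual approximation
1 - exp(-n(n-1) / (2 * 2^bits)) and assumes the 48 visible bits behave
like uniformly random values. The function name collision_probability is
made up for this example and is not part of Mercurial.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that at least two of n_items share a value.

    Assumes every item gets an independent, uniformly random value of
    the given bit width (the standard birthday-problem approximation).
    """
    space = 2 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


if __name__ == "__main__":
    for n in (5_000, 1_000_000, 20_000_000):
        print(n, collision_probability(n, 48))
```

Under that assumption, a few thousand changesets give a collision chance
on the order of 1e-8 for 48-bit prefixes, and it takes roughly twenty
million changesets to get close to even odds.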
If you follow the Mercurial code starting from commands.py (look for the
log function), a careful read will bring you to the _partialmatch method
in revlog.py. There you can see how a nodeid given with fewer than 40
characters is looked up in the revlog index. If multiple matching
entries are found, an exception is raised with the message "ambiguous
identifier".
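To make the idea concrete, here is a minimal sketch of that kind of
prefix lookup. It is not the actual code from revlog.py: the names
partial_match and AmbiguousIdentifier, and the plain list standing in
for the revlog index, are assumptions made for illustration only.

```python
class AmbiguousIdentifier(Exception):
    """Raised when a short hex prefix matches more than one node id."""


def partial_match(node_ids, prefix):
    """Return the single 40-character hex id starting with prefix.

    Returns None when nothing matches and raises AmbiguousIdentifier
    when several ids share the prefix. node_ids is a plain list here,
    a stand-in for the real revlog index (an assumption of this sketch).
    """
    if len(prefix) > 40:
        return None  # longer than a full hex node id, cannot match
    matches = [node for node in node_ids if node.startswith(prefix)]
    if len(matches) > 1:
        raise AmbiguousIdentifier("ambiguous identifier: %s" % prefix)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hand-made ids so the shared prefix is obvious.
    ids = ["ab12" + "0" * 36, "ab13" + "0" * 36, "cd00" + "0" * 36]
    print(partial_match(ids, "ab12"))  # unique prefix
    print(partial_match(ids, "ff"))    # no match
    try:
        partial_match(ids, "ab1")      # shared by two ids
    except AmbiguousIdentifier as exc:
        print(exc)
```

With a scheme like this, a truncated hash that collides never silently
picks one changeset. The caller gets an error and has to supply more
digits until the prefix is unique.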
Cheers.
--
Isaac Jurado
"The noblest pleasure is the joy of understanding."
Leonardo da Vinci