Fixing .hg file open ordering

Durham Goode durham at fb.com
Wed Nov 2 22:30:58 UTC 2016



On 11/2/16 3:20 PM, Bryan O'Sullivan wrote:
> There's long been a well-defined order for accessing historical data: 
> changelog first, then manifest, then revlog, and the reverse for writes.
Yea, I guess I'm proposing formalizing these dependencies into a single 
location in code so we cannot mess it up on accident.
>
> I think that what has happened with bookmarks is that we literally 
> forgot about the next necessary ordering constraint: you must read 
> bookmarks before the changelog, otherwise you run into the race we see 
> here. This same constraint should apply to any other file that 
> contains commit hashes.
>
> It's unfortunate that the bmstore API is "backwards" in this way, but 
> what's the sequence of operations by which bookmarks seem to 
> disappear? There's something missing from your description to help 
> piece the sequence of events together, because a bookmark that points 
> to a commit that is not yet known will be explicitly ignored in the 
> bmstore constructor.
>
> Here's what I *think* must be the sequence:
>
> I read the changelog.
> I'm asked to look up the foo bookmark.
> Someone else writes out some commits and a new bookmarks file.
> I open the bookmarks file, which was written *after* the last 
> changelog write.
> The newly written bookmarks file contains a version of foo that points 
> at a commit that I haven't yet seen, and because the bmstore 
> constructor ignores it, it appears to have vanished.
Yes, I tried to say this in my first paragraph, though not as clearly as 
I could have.
>
> I think this will be a little harder to fix than you may be anticipating.
>
> You can fix the read path by reading the bookmarks before the 
> changelog, but the write path is subtle. Here's why.
>
> Bookmarks are stored in memory. If I'm being asked to write out a new 
> bookmarks file as part of a transaction, what happens in this 
> interleaving?
>
> I read bookmarks
> You write bookmarks
> I write bookmarks, based off my stale understanding
>
> In other words, I'll potentially cause your write to go missing unless 
> I validate during the transaction that the bookmarks haven't changed 
> since I read them before I write them.
Bookmark writes are now within the wlock, and any inmemory read you did 
prior to taking the lock will be invalidated (during the next 
repo._bookmarks access) and thrown away when you take the lock. So once 
you're in the lock you are guaranteed to get the latest version off 
disk, and guaranteed that no one else will modify it until you release 
the lock.

So your sequence becomes:

I read bookmarks
You write bookmarks
You release the lock
I take the lock
I read bookmarks again
I write bookmarks
I release the lock



More information about the Mercurial-devel mailing list