append-onlyness enforced with

Adrian Buehlmann adrian at cadifra.com
Fri Oct 19 09:05:48 UTC 2012


On 2012-10-19 01:44, Benito van der Zander wrote:
> Hi,
> on the ext* file systems you can set the extended "a"-attribute on files 
> to mark them as "append-only".
> Then you can append data just as normally, but no one can delete or 
> modify the existing content, not even root.
> 
> Since the Mercurial file format is (said to?) also append-only, it 
> seems to be a good idea to enable that attribute
> for all files in the .hg/store directory.
> 
> Or are there any problems that this would cause?
> (except the fact that you need root-permissions to enable that attribute)

Since I am a die-hard Windows user I may not be the perfect person to
answer a question about ext file systems and append-only.

But I happen to know the Mercurial store quite well. So let me add some
notes here.

First, the "append-only" nature of Mercurial is IMHO just a first-order
approximation. As usual, the devil is in the details.

Why? Because Mercurial "inlines" small files in the store.

What is inlining?

Mercurial uses a data structure called revlog [1]. revlog uses a file
pair in the store for each tracked file foo: foo.d and foo.i. The foo.i
file is an index into foo.d.

The revlog format was revised [2] for the 0.9 release to fold
("inline") the contents of the *.d file into the *.i file for smaller
revlogs, so there is no *.d file in that case. One motivation for this is
that it reduces the number of files and thus speeds up operations that
need to access lots of files (like copying a repo).

However, this change introduced a state transition: from inlined to
non-inlined. On that transition, the *.i file is _rewritten_, splitting
out the inlined data into a separate *.d file. This is done by writing
the new index to a temporary file, deleting the old *.i file and
moving (aka renaming) the temporary file to the old name. Neither the
delete nor the rename is an append, so both will fail if the
append-only flag is set for the directory.
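
As a rough sketch of that rewrite sequence (this is not Mercurial's
actual code; the function name rewrite_index and its parameters are made
up for illustration), the three steps look like this in Python:

```python
import os
import tempfile


def rewrite_index(index_path, new_index_bytes):
    """Replace index_path via temp file + delete + rename (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(index_path))
    # Step 1: write the new index into a temporary file in the same
    # directory, so the final rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(new_index_bytes)
    # Step 2: delete the old index file. This is not an append.
    os.unlink(index_path)
    # Step 3: move the temporary file to the old name. Not an append either.
    os.rename(tmp_path, index_path)
```

Steps 2 and 3 are the ones I would expect to be refused once the
append-only attribute is in place.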

What's more, Mercurial uses hardlinks on file systems that support them
(which includes NTFS on Windows) when cloning [3]. When you modify a
clone by committing changes to it, Mercurial checks the hardlink count
of each file before appending to it. If the count is two or higher, it
breaks the hardlink before writing to the file. Breaking the hardlink is
done by copying the contents of the file to a temporary file, deleting
the original file (which drops one hardlink) and renaming the temporary
file to the old name of the file. Deleting and renaming will fail again
if the append-only flag is set on the directory.
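
Again only as a sketch of the idea, not Mercurial's real implementation
(break_hardlink and append are names I made up here), the check and the
copy/delete/rename sequence could be written like this in Python:

```python
import os
import shutil
import tempfile


def break_hardlink(path):
    """Give path its own private copy if it shares data with other names."""
    # A link count of two or more means another name (for example the
    # same file in the clone source) points at the same data.
    if os.stat(path).st_nlink < 2:
        return False
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    os.close(fd)
    # Copy the contents to a temporary file ...
    shutil.copyfile(path, tmp_path)
    # ... delete the original name (drops one hardlink) ...
    os.unlink(path)
    # ... and rename the temporary file to the old name.
    os.rename(tmp_path, path)
    return True


def append(path, data):
    """Append data to path, breaking a hardlink first if there is one."""
    break_hardlink(path)
    with open(path, "ab") as f:
        f.write(data)
```

Only the final write in append() is a real append. The unlink and the
rename before it are the operations that would get in the way.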

[1] http://mercurial.selenic.com/wiki/Revlog
[2] http://mercurial.selenic.com/wiki/RevlogNG
[3] http://mercurial.selenic.com/wiki/HardlinkedClones



