Fw: Betr: Re: Many missing revlogs, do we have a problem?
Simon King
simon at simonking.org.uk
Fri Jun 19 10:49:00 UTC 2015
On Fri, Jun 19, 2015 at 10:21 AM, Alban Hertroys
<alban.hertroys at apollovredestein.com> wrote:
> Matt Mackall <mpm at selenic.com> wrote on 18/06/2015 20:40:23:
> On Thu, 2015-06-18 at 10:02 +0200, Alban Hertroys wrote:
>> > Alban Hertroys/NL/AVBV wrote on 18/06/2015 09:57:15:
>> >
>> > > Simon King <simon at simonking.org.uk> wrote on 17/06/2015 18:11:06:
>> >
>> > > > In your (backup of the) master repository, do you have a file called
>> > > > .hg/store/data/fncache?
>> >
>> > > Nope.
>> >
>> > Scratch that, now that I've had coffee I can see that file. However, it's
>> > in the store/ directory instead of data/. The same goes for my local copy
>> > that does verify OK-ish.
>>
>> If Mercurial says your problem is that thousands of revlogs (aka files)
>> have gone missing, it's probably not lying. Just take one of the
>> filenames, check its existence on the client and server. They live
>> in .hg/store/data. Since Mercurial never deletes these files.. the cause
>> is probably elsewhere.
>
> That doesn't appear to be true.
>
> With the first random one I picked, it appears that the entire directory
> is missing from the rev-log:
> data/CustomerRelatedRep/prodlin.mas.i at 0: missing revlog!
>
> There is a file:
> "T:\ibi\apps\.hg\store\data\_customer_related_rep\prodlin.mas.i"
>
The fncache repository format encodes filenames to cope with various
filesystem oddities, such as reserved names (COM1 and LPT1 aren't valid
filenames on Windows) and case insensitivity (a repository might
contain 2 files, FILE1 and file1, but Windows filesystems treat those
as the same filename). That is why CustomerRelatedRep shows up on disk
as _customer_related_rep. Details are at
https://mercurial.selenic.com/wiki/fncacheRepoFormat
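
The two paths quoted in this thread are enough to illustrate the basic
rules: an uppercase letter becomes "_" plus its lowercase form, and a
literal "_" is doubled. Here is a minimal sketch of just those two rules.
hg_encode_basic is a hypothetical helper name, and the sketch
deliberately ignores reserved names, special characters and over-long
paths, which the real encoding also handles.

```shell
#!/bin/bash
# hg_encode_basic (hypothetical name): apply only the two encoding rules
# visible in this thread to a store path.
#   "_"       -> "__"
#   "A".."Z"  -> "_" followed by the lowercase letter
# Reserved names, special characters and long paths are NOT handled.
hg_encode_basic() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "_")
                out = out "__"
            else if (index("ABCDEFGHIJKLMNOPQRSTUVWXYZ", c) > 0)
                out = out "_" tolower(c)
            else
                out = out c
        }
        print out
    }'
}

hg_encode_basic 'data/CustomerRelatedRep/prodlin.mas.i'
# prints data/_customer_related_rep/prodlin.mas.i
hg_encode_basic 'data/1_aandp/yfabbb.acx.i'
# prints data/1__aandp/yfabbb.acx.i
```

Running the names from the "hg verify" output through something like this
gives the names to look for under .hg/store/data.
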
> The second random file however, does exist:
> data/1_aandp/yfabbb.acx.i at 0: missing revlog!
>
> There is a file:
> "T:\ibi\apps\.hg\store\data\1__aandp\yfabbh.acx.i"
>
> In fact, it turns out that ALL the files that hg verify complains about
> are in fact there!
>
Somehow, your fncache file was rewritten with all those revlogs
missing. I was doing some testing and I discovered that old versions
of Mercurial (I tested with 2.5.2) will rewrite the fncache file
during "hg verify" if the revlogs are missing. This test script shows
"hg verify" failing even after a missing revlog has been restored:
#!/bin/bash
export HGRCPATH=/dev/null
rm -rf verifytest
hg -q --version
echo
echo "### Setting up test repository"
hg init verifytest
cd verifytest
touch somefile
hg add somefile
hg ci -u test -m commit
# pretend a revlog gets lost
echo
echo "### Verifying corrupted repository"
mv .hg/store/data/somefile.i .hg/store/data/somefile.i.bak
hg verify
# restore the revlog
echo
echo "### Verifying fixed repository"
mv .hg/store/data/somefile.i.bak .hg/store/data/somefile.i
hg verify
Output:
Mercurial Distributed SCM (version 2.5.2)
### Setting up test repository
### Verifying corrupted repository
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
data/somefile.i at 0: missing revlog!
0: empty or missing somefile
somefile at 0: b80de5d13875 in manifests not found
1 files, 1 changesets, 0 total revisions
3 integrity errors encountered!
(first damaged changeset appears to be 0)
### Verifying fixed repository
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
data/somefile.i at 0: missing revlog!
1 files, 1 changesets, 1 total revisions
1 integrity errors encountered!
(first damaged changeset appears to be 0)
More recent versions of Mercurial don't have the same problem:
Mercurial Distributed SCM (version 3.3.2)
### Setting up test repository
### Verifying corrupted repository
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
data/somefile.i at 0: missing revlog!
0: empty or missing somefile
somefile at 0: b80de5d13875 in manifests not found
1 files, 1 changesets, 0 total revisions
3 integrity errors encountered!
(first damaged changeset appears to be 0)
### Verifying fixed repository
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
1 files, 1 changesets, 1 total revisions
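
To see whether a verify run has dropped a revlog from the fncache, the
file can be inspected directly before and after. This is a minimal
sketch, assuming the fncache is a plain text file under .hg/store that
lists one store path per line in unencoded form (for example
data/somefile.i). fncache_lists is a hypothetical helper name.

```shell
#!/bin/bash
# fncache_lists (hypothetical name): succeed if the repository's fncache
# contains the given store path as a whole line.
# Assumption: entries are one path per line, before filename encoding.
# usage: fncache_lists /path/to/repo data/somefile.i
fncache_lists() {
    local repo=$1 path=$2
    # -F: fixed string, -x: match the whole line, -q: exit status only
    grep -Fxq -- "$path" "$repo/.hg/store/fncache"
}
```

For the test repository above this could be called as
fncache_lists verifytest data/somefile.i && echo listed || echo "not listed"
once before the first "hg verify" and once after it.
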
> How I figured that out:
> I applied character replacements for [A-Z_] and for '/.' to a store/data
> file-listing that I wrote to a file and sorted the results.
> Then I stripped the hg verify output down to the file-names.
> A diff between the two files showed that there are no differences!
> This was a step-by-step process (done in ViM) I saw the number of
> differences reduce during the process of replacing and sorting.
>
> So it looks like Mercurial actually _can_ be wrong about missing revlogs,
> at least if it does indeed mean that the file isn't there.
>
>
> So far it looks like the fncache got truncated and apparently that can
> cause these symptoms. I'm just echoing what I've been told here, my
> knowledge on this topic is fairly limited - you're the expert.
>
> There is a journal backup file (that's the file that we were having
> locking issues with! Related?), but that one's even smaller than the
> fncache file itself. Is that normal?
>
> I'm starting to suspect that something happened to the fncache file and
> that it got "repaired" using a damaged journal or something...
>
> Is there some way to recreate the fncache in an existing clone?
>
Adrian's answer is the official one - clone the repository using "hg
clone --pull". It is *probably* safe to take the fncache from the
clone and put it back in the original repo.
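
That recovery could be sketched as a small shell function. This is an
illustration, not a tested procedure: restore_fncache_from_clone and the
fncache.damaged backup name are made up here, the -U option (skip the
working copy checkout) is my addition, and it assumes nobody is pushing
to the original repository while the file is swapped. Work on a backup
first.

```shell
#!/bin/bash
# restore_fncache_from_clone (hypothetical name)
# usage: restore_fncache_from_clone /path/to/original /path/to/new-clone
restore_fncache_from_clone() {
    local orig=$1 clone=$2

    # --pull makes hg transfer changesets instead of copying store files,
    # so the clone writes its own fncache. -U skips the working copy.
    hg clone --pull -U "$orig" "$clone" || return 1

    # Keep the suspect file around so it can be compared later.
    cp -p "$orig/.hg/store/fncache" "$orig/.hg/store/fncache.damaged" || return 1

    # Put the freshly written fncache into the original repository.
    cp "$clone/.hg/store/fncache" "$orig/.hg/store/fncache" || return 1

    # Check the original repository again with the replaced fncache.
    hg -R "$orig" verify
}
```

If the final "hg verify" still reports missing revlogs, the saved
fncache.damaged can be copied back and the fresh clone used instead.
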
From this and your previous emails about file locking, it seems that
you have a *very unreliable* setup. Mercurial is pretty robust, but it
does generally assume that the underlying filesystem is sane. If I
were you I would be looking for ways to fix that. For example, it
seems dangerous that your central repository is also your deployment
repository. Could you separate those two tasks?
Simon