Corruption issue from filesystem exception.

Matt Mackall mpm at selenic.com
Tue Feb 3 17:26:06 UTC 2009


On Mon, 2009-02-02 at 23:05 +0100, Sune Foldager wrote:
> Greetings
> 
> I have been using Mercurial in a large-scale test at my company for
> some weeks now, and mostly things work great :-). We do, however, have
> a few issues with unhandled exceptions which, depending on where they
> are raised, can lead to local repository corruption.
> 
> The exception message originates from Windows, I think, and goes
> something like "Cannot create a file when that file already exists.".
> This is rendered by hg with "abort: " in front, of course. It happens
> irregularly in the following situations: commit, pull and,
> unfortunately, rebase. The latter corrupts the repository.

It would of course be very helpful to actually see the command output.
Do you get a traceback?

Do you have copies of any of the damaged repos still?

> Commit: The new changeset _is_ actually created, but the working  
> directory is unchanged, the modified files are still marked modified  
> and the wdir parent isn't moved forward. Repeating the commit usually  
> works (and of course creates a new almost identical commit).  
> Alternatively, an update -C also usually works.
> 
> Pull: Using --verbose I can see that the problem occurs while writing  
> the revision data to the individual .hg/store files. It seems pretty  
> random. It almost always works to just pull again.
> 
> Rebase: Not sure where it happens, but I am guessing during a strip,
> leading to a failed update of some of the involved files. Afterwards,
> there is often corruption. Usually, the file revisions are off by one
> compared to the changelog. It can be fixed with a hex editor, but
> that's somewhat uncool of course :-p.
> 
> The last problem has so far only been seen on two machines, both very  
> slow laptops. The other problems, while also rare, occur on several  
> machines. It would seem to be a race condition somewhere, but I am not  
> quite sure where, and I have not been able to reproduce it with more  
> debug-information so far.
> 
> Since this is a bit of a show-stopper for us in using Mercurial
> full-scale (at least with rebase), I hope very much we can hunt this
> bug down. Any advice is very welcome, and I will try to reproduce the
> errors as well. We use an almost-crew-tip Mercurial + an almost-tip
> TortoiseHg, mainly on Windows Vista.

It's hard for race conditions to occur in Mercurial. It's
single-threaded, and in normal operation only one program is ever trying
to write in .hg at a given time. And there are of course locks to
prevent multiple writers.
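
As a rough illustration of the single-writer idea (a minimal sketch, not
Mercurial's actual lock code; acquire, release and LockHeld are
hypothetical names), an exclusive-create lock file could look like this:

```python
import os


class LockHeld(Exception):
    """Raised when another writer already holds the lock (hypothetical)."""


def acquire(lock_path):
    # O_CREAT | O_EXCL makes creation atomic: the call fails if the
    # file already exists, so only one process can take the lock.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockHeld(lock_path)
    # Record the owner's pid so a stale lock can be identified by hand.
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))


def release(lock_path):
    # Removing the file lets the next writer acquire the lock.
    os.unlink(lock_path)
```

A second acquire() on the same path raises LockHeld until release() has
been called, which is the property that keeps two writers out of the
same directory at once.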

Is there anything interesting about your setup? Are you using a network
filesystem? Do you have virus scanners?

-- 
http://selenic.com : development and support for Mercurial and Linux