out of memory, rollback not good?
Matt Mackall
mpm at selenic.com
Mon Jul 18 04:42:59 UTC 2011
On Sun, 2011-07-17 at 18:51 -0700, rupert.thurner wrote:
>
> On Jul 13, 8:55 am, Dan Villiom Podlaski Christiansen
> <dan... at gmail.com> wrote:
> > On 13/07/2011, at 08.05, Steve Borho wrote:
> >
> > > On Wed, Jul 13, 2011 at 12:49 AM, rupert.thurner
> > > <rupert.thur... at gmail.com> wrote:
> > >> because somebody posted on the users mailing list that cloning the gcc
> > >> ssh repository is impossible as it gets an error, i was curios and i
> > >> tried.
> >
> > >> hg clone svn://gcc.gnu.org/svn/gcc/trunk gcc
> > >> ...
> > >> [r143562] dodji: Reverted commit 143546 related to PR c++/26693
> > >> transaction abort!
> > >> rollback completed
> > >> abort: out of memory
> > [snip]
> >
> > > The Python-SVN bindings have bad memory leaks in them. Never try to
> > > convert a large SVN repository in one command.
> >
> > > % hg clone -r 1 svn://gcc.gnu.org/svn/gcc/trunk gcc
> > > % cd gcc ; hg pull
> >
> > First of all, the bug you're seeing is in hgsubversion, not Mercurial as such. I've added our list to the Cc.
> >
> > Second, it's not really a bug in hgsubversion as such, but a bug in the Subversion SWIG bindings. One way to work around it is the clone/pull dance suggested by Steve. Another is to install Subvertpy[1] — although our wrapper for it isn't quite as stable as the SWIG wrapper, at least it doesn't leak. (Unless you access repositories directly using ‘file://…’ URLs, unfortunately.)
> >
> > [1] <http://pypi.python.org/pypi/subvertpy>
>
> are you sure its the wrapper, or the usage of it? i tried to start
> cloning with valgrind (http://valgrind.org/) and, while the memory
> consumption seems huge, it does not complain.

Valgrind is probably useless for detecting leaks here. A "leak" in a
garbage-collecting language like Python is a very different beast from a
typical leak in a C program.

In C, a leak generally occurs (and Valgrind detects it) when a piece of
memory goes out of use without free() being called. This often happens
by simply overwriting a pointer to allocated memory with another pointer
so that no pointers to it exist anymore.

But this can't happen in Python. When an object ceases to have pointers
to it, it's automatically freed. Instead, a leak occurs when a reference
to an object is placed in a data structure (e.g. appended to a list)
that keeps it alive, and is never removed. This 'leak' thus happens on a
conceptually higher level than the typical C leak. Python's garbage
collector will continue to regularly visit the memory to see if it can
be freed (thus fooling Valgrind), but will never be able to. When Python
exits, it will delete its reference to the leaking data structure and
everything will be completely cleaned up, so the only way to know
anything's wrong is to run out of memory.
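
> ==15203== still reachable: 522,807,894 bytes in 10,361 blocks

...and there you go. Roughly 500MB of data just hanging around in data
structures that never get cleaned out. ("Still reachable" is Valgrind's
category for memory that still had pointers to it at exit.)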
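
One way to hunt for this kind of retention from inside Python, rather
than with Valgrind, is to compare counts of live objects by type before
and after a chunk of work. The following is only a sketch of the idea
using the standard `gc` module; `Record`, `retained` and `live_counts`
are hypothetical names, and it only sees objects the collector tracks,
so memory held purely on the C side of a binding would not show up.

```python
import gc
from collections import Counter


def live_counts():
    """Count objects tracked by the collector, keyed by type name."""
    gc.collect()
    return Counter(type(o).__name__ for o in gc.get_objects())


class Record(object):
    """Hypothetical stand-in for leaked per-revision state."""
    pass


retained = []  # hypothetical structure that only ever grows

before = live_counts()
for _ in range(1000):
    retained.append(Record())
after = live_counts()

# Counter subtraction keeps only the types whose count went up.
growth = after - before
record_growth = growth["Record"]
suspects = growth.most_common(5)  # types that grew the most
```

A type whose count keeps climbing across repeated snapshots is a
candidate for whatever is being held onto.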
--
Mathematics is the supreme nostalgia of our time.
More information about the Mercurial-devel
mailing list