[Reviewers] [PATCH rfc] rfc: call gc at exit of mercurial
Gregory Szorc
gregory.szorc at gmail.com
Wed May 4 06:51:20 UTC 2016
> On Apr 4, 2016, at 23:37, Maciej Fijalkowski <fijall at gmail.com> wrote:
>
> On Tue, Apr 5, 2016 at 8:36 AM, Pierre-Yves David
> <pierre-yves.david at ens-lyon.org> wrote:
>>
>>
>>> On 04/04/2016 10:31 PM, Maciej Fijalkowski wrote:
>>>
>>> class A(object):
>>>     def __del__(self):
>>>         print "del"
>>>
>>> class B(object):
>>>     pass
>>>
>>> b = B()
>>> b.b = b
>>> b.a = A()
>>>
>>>
>>> This example does not call __del__ in CPython either.
>>>
>>> The __del__ is not guaranteed to be called - that's why there is a
>>> painful module finalization procedure where CPython is trying to call
>>> "as much as possible", but there are still no guarantees. If you add
>>> del b; gc.collect() you will see "del" printed. Of course this
>>> involves a cycle, but cycles can come in ways that you don't expect
>>> them and PyPy simply says "everything is GCed". I think it's very much
>>> in line with what python-dev thinks.
>>
>>
>> Which is why we have __del__ in very few objects, and we deploy massive effort
>> to ensure they don't get caught in cycles, mostly succeeding at this.
>> (In the same way, we put a lot of effort into making sure __del__ methods are
>> never really called, but keep them as a double safety.)
>>
>> So in the case we care about (no cycle), CPython would call our __del__,
>> right?
>>
>> --
>> Pierre-Yves David
>
> Yes, but I would argue you can create cycles without knowing. E.g.
>
> def f():
>     try:
>         some_stuff
>     except:
>         x = sys.exc_info()
>
> creates a cycle. There are also ways to create cycles with passing
> global functions around etc.
This.
We have plenty of cycles in our code. We just don't notice them very often because "hg" processes are short-lived. And what's worse is we don't know we're introducing them unless we go looking for them, often after someone complains about a leak on a large repo.
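For anyone who wants to go looking, here is a minimal sketch of how one might check whether a call leaves cyclic garbage behind, using the exc_info() case quoted above. It is written for Python 3, and the helper names (make_cycle, count_unreachable) are made up for illustration; nothing like this exists in the Mercurial tree as far as this sketch assumes.

```python
import gc
import sys


def make_cycle():
    """Keep sys.exc_info() in a local, as in the quoted example."""
    try:
        raise ValueError("boom")
    except ValueError:
        # The traceback refers to this frame, and the frame refers to x,
        # so frame -> x -> traceback -> frame forms a reference cycle.
        x = sys.exc_info()


def count_unreachable(func):
    """Return how many unreachable objects the cycle collector finds
    after calling func(). Hypothetical helper, for illustration only."""
    gc.collect()   # clear out any garbage that already exists
    gc.disable()   # keep automatic collection from hiding the cycle
    try:
        func()
        return gc.collect()  # gc.collect() returns the unreachable count
    finally:
        gc.enable()


found = count_unreachable(make_cycle)
print("unreachable objects after make_cycle():", found)
```

A nonzero count means reference counting alone did not free everything the call created. Setting gc.set_debug(gc.DEBUG_SAVEALL) before the final collection would keep the collected objects in gc.garbage for inspection, at the cost of not freeing them.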
If you want to create cycles and leak memory, I recommend "hg convert" on thousands of revisions with extensions and hooks installed. Or start a WSGI server.
One of the reasons I want to get Python 3 support is so we can use its tracemalloc module to help debug leaks. The Python 2 tools for finding cycles and leaks (such as guppy and heapy) are a bit harder to use and to integrate into our testing harness.
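As a rough sketch of the kind of tracemalloc usage I have in mind: take a snapshot before and after some operation and compare them by source line. The function names here (top_growth, leaky) and the deliberately leaky workload are assumptions for illustration, not anything in our code base.

```python
import tracemalloc

_leaked = []  # module-level list standing in for an accidental long-lived reference


def leaky():
    """Hypothetical workload that keeps about 1 MB alive after it returns."""
    _leaked.append([bytearray(1024) for _ in range(1000)])


def top_growth(func, limit=5):
    """Run func() and return the source lines whose allocations changed most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    func()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to() returns StatisticDiff entries, biggest change first.
    return after.compare_to(before, "lineno")[:limit]


stats = top_growth(leaky)
for stat in stats:
    print(stat)
```

Each entry carries the file and line number of the allocation plus its size_diff, which is the sort of output that could be asserted on from a test harness.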