[PATCH 02 of 11 V2] procutil: ensure that all stdio file objects are flushed at interpreter exit

Manuel Jacob me at manueljacob.de
Mon Jul 13 15:14:31 UTC 2020


On 2020-07-13 13:27, Yuya Nishihara wrote:
> On Mon, 13 Jul 2020 00:41:27 +0200, Manuel Jacob wrote:
>> # HG changeset patch
>> # User Manuel Jacob <me at manueljacob.de>
>> # Date 1594346556 -7200
>> #      Fri Jul 10 04:02:36 2020 +0200
>> # Node ID a0ddc1349dde0f5849d196236c3fa31d49934511
>> # Parent  2668345a92975b843cab2e766b8fa59181003772
>> # EXP-Topic stdio
>> procutil: ensure that all stdio file objects are flushed at 
>> interpreter exit
>> 
>> On Python 2 on Unix, we open a second file object referring to the 
>> stdout file
>> descriptor. If this file object gets garbage collected before 
>> sys.stdout (which
>> is very likely), the file descriptor gets closed and sys.stdout has no 
>> chance
>> to flush buffered data. Therefore, we flush them explicitly before the 
>> file
>> objects get garbage collected and the file descriptor gets closed.
>> 
>> diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
>> --- a/mercurial/utils/procutil.py
>> +++ b/mercurial/utils/procutil.py
>> @@ -50,6 +50,27 @@
>>          return False
>> 
>> 
>> +def reopen_stream(orig, mode, buffering):
>> +    import atexit
>> +
>> +    assert not pycompat.ispy3
>> +    # On Python 2, there's no way to create a file such that the file
>> +    # descriptor doesn't get closed if the file object gets garbage 
>> collected.
>> +    # Therefore, when reopening an already open stream, the file 
>> descriptor
>> +    # will be closed twice. The error message for that will be 
>> swallowed during
>> +    # interpreter shutdown.
>> +    new = os.fdopen(orig.fileno(), mode, buffering)
>> +
>> +    @atexit.register
>> +    def flush_streams():
>> +        # Ensure that one stream gets flushed before the other closes 
>> the file
>> +        # descriptor.
>> +        new.flush()
>> +        orig.flush()
>> +
>> +    return new
> 
> Didn't we agree that the use of atexit wouldn't buy and we'll instead 
> add
> py3-only solution later?
> 
> As I said, atexit will increase the complexity of the control flow. All 
> data
> should be flushed inside dispatch.run().

We talked past each other, then. What I meant is that for stderr it’s 
not worth it because 1) we have other reasons to back it out, 2) stderr 
may be used for tracebacks etc. even after the atexit hook. For stdout 
it’s not as important because stdout is mostly written to through the ui 
object, but still there are some contrib scripts etc. using it directly. 
I don’t see any problem using atexit for this particular use case, but 
since I didn’t invest further time trying to find a case where we need 
it for stdout (and the stderr problem was solved otherwise), I’d be fine 
dropping the patch. On Python 3, we don’t need to do anything. What do 
you think?



More information about the Mercurial-devel mailing list