[PATCH 4 of 4 stable] py3: stop using block-buffered stdout on Python 3

Manuel Jacob me at manueljacob.de
Fri Jun 5 09:44:51 UTC 2020


# HG changeset patch
# User Manuel Jacob <me at manueljacob.de>
# Date 1591333053 -7200
#      Fri Jun 05 06:57:33 2020 +0200
# Branch stable
# Node ID 92af1f5877d02bd79420c04b1177accce27a61ce
# Parent  618173e6b7115a93588b7f1a62f9b9b8460ca84b
# EXP-Topic streams
py3: stop using block-buffered stdout on Python 3

The previous code (added in 227ba1afcb65) incorrectly assumed that
sys.stdout.buffer is line-buffered. However the interpreter initializes it
with a block-buffered stream or an unbuffered stream (when the -u option or
the PYTHONUNBUFFERED environment variable is set), never with a line-buffered
stream.

To see the difference with the hg command, you need to do something slow, e.g.
`hg pull https://www.mercurial-scm.org/repo/hg`. On Python 3 on Linux, without
this patch, all output is printed right before the command ends. With this
patch, output is printed almost immediately.

I did some quick and dirty benchmarking, printing on my terminal.


Unfavorable case:
* for 100 times:
  * for 1000 times:
    * call `ui.write(b'x')`
  * call `ui.write(b'\n')`
* ui.flush()

Results:
CPython 2.7 is unaffected (line-buffered): wall 0.231795 comb 0.230000 user 0.230000 sys 0.000000
CPython 3.8 before patch (block-buffered): wall 0.190882 comb 0.190000 user 0.190000 sys 0.000000
CPython 3.8 after patch (unbuffered):      wall 0.654466 comb 0.650000 user 0.480000 sys 0.170000
CPython 3.8 hand-rolled line-buffered [*]: wall 0.218919 comb 0.220000 user 0.220000 sys 0.000000


Favorable case:
* for 100000 times:
  * call `ui.write(b'Testing write performance\n')`

Results:
CPython 2.7 is unaffected (line-buffered): wall 0.717060 comb 0.710000 user 0.510000 sys 0.200000
CPython 3.8 before patch (block-buffered): wall 0.255527 comb 0.260000 user 0.180000 sys 0.080000
CPython 3.8 after patch (unbuffered):      wall 0.616500 comb 0.620000 user 0.450000 sys 0.170000
CPython 3.8 hand-rolled line-buffered [*]: wall 0.811805 comb 0.810000 user 0.570000 sys 0.240000


[*] For this, I implemented a simplified version of what is suggested in the
TODO: stdout was set to block-buffered and ui.write() flushed the stream if
the output contains a newline character.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -55,12 +55,19 @@ def isatty(fp):
 # on Windows).
 # Python 3 rolls its own standard I/O streams.
 if isatty(stdout):
+    # TODO: Consider mitigating the performance impact of using unbuffered
+    # writes on Python 3 and Windows, e.g. by using fully buffered streams and
+    # ensuring on a higher level that lines are flushed.
     if pycompat.iswindows:
         # Windows doesn't support line buffering
         stdout = os.fdopen(stdout.fileno(), 'wb', 0)
-    elif not pycompat.ispy3:
-        # on Python 3, stdout (sys.stdout.buffer) is already line buffered and
-        # buffering=1 is not handled in binary mode
+    elif pycompat.ispy3:
+        # Python 3 doesn't support line buffering on binary streams. Use the
+        # underlying unbuffered raw stream.
+        # TODO: After Mercurial 5.4, use this code path on Python 3 on Windows.
+        # We shouldn't risk changing working behavior in the stable branch.
+        stdout = stdout.raw
+    else:
         stdout = os.fdopen(stdout.fileno(), 'wb', 1)
 
 if pycompat.iswindows:




More information about the Mercurial-devel mailing list