[PATCH 1 of 2 V2] log: flush stdout/err for the first log entries
Yuya Nishihara
yuya at tcha.org
Sat Sep 26 08:03:25 UTC 2015
On Fri, 25 Sep 2015 13:08:09 -0500, Matt Mackall wrote:
> On Thu, 2015-09-24 at 16:11 -0700, Durham Goode wrote:
> > On 9/24/15 3:51 PM, Matt Mackall wrote:
> > > On Thu, 2015-09-24 at 13:02 -0700, Durham Goode wrote:
> > >> On 9/24/15 12:49 PM, Matt Mackall wrote:
> > >>> On Thu, 2015-09-24 at 12:20 -0700, Durham Goode wrote:
> > >>>> # HG changeset patch
> > >>>> # User Durham Goode <durham at fb.com>
> > >>>> # Date 1443118388 25200
> > >>>> # Thu Sep 24 11:13:08 2015 -0700
> > >>>> # Node ID 7a9aaaadaed9887a956cb6a1e79be527e6f4a20f
> > >>>> # Parent a0eff7ebc2aebb32cf4c8da276104102ff37d786
> > >>>> log: flush stdout/err for the first log entries
> > >>>>
> > >>>> There have been problems with the pager where we get no results until the python
> > >>>> buffer has been filled. Let's fix that by manually flushing the buffer for the
> > >>>> first 50 commits from log. 50 was chosen because it will likely fill the users
> > >>>> screen and doesn't introduce a significant overhead.
> > >>>>
> > >>>> The overhead of this is negiligble. I see no perf difference when running log on
> > >>>> 100,000 commits.
> > >>> Still confused as to why this isn't in the ui code.
> > >> This is in addition to the ui code. I have a patch that adjusts the ui
> > >> code to flush every N seconds, but it causes some tests to fail (since
> > >> they didn't on the timing of the progress bar output) so I sent these
> > >> patches first.
> > >>
> > >> We still need these 2 patches though. Imagine a scenario where you run
> > >> hg log on a directory that has only had one commit total (and it was
> > >> recent). Immediately one commit is printed into the buffer (before the
> > >> flush time limit is hit). Then half an hour goes buy while we scan the
> > >> rest of the repository for another commit in that directory. We end up
> > >> not showing the user that commit until the command is completely
> > >> finished. Printing the first N units prevents this experience.
> > > Right. You can't make the timer thing really work without a thread or
> > > something. So given that's known to be somewhat broken and that this
> > > flushing problem isn't in any way log-specific, why don't we just put a
> > > counter and a flush in ui.write()?
> > >
> > > I just did this, saw a pretty repeatable 7.2->7.3s change for hg log on
> > > the hg repo:
> > >
> > > diff -r b80b2ee71a08 mercurial/ui.py
> > > --- a/mercurial/ui.py Thu Sep 24 00:34:15 2015 -0700
> > > +++ b/mercurial/ui.py Thu Sep 24 17:45:42 2015 -0500
> > > @@ -104,6 +104,7 @@
> > > self._ucfg = config.config() # untrusted
> > > self._trustusers = set()
> > > self._trustgroups = set()
> > > + self._outcount = 0
> > > self.callhooks = True
> > >
> > > if src:
> > > @@ -617,6 +618,9 @@
> > > else:
> > > for a in args:
> > > self.fout.write(str(a))
> > > + if self._outcount < 50:
> > > + self.fout.flush()
> > > + self._outcount += 1
> > >
> > > def write_err(self, *args, **opts):
> > > self._progclear()
> > >
> > > ("hg log -k sullivan -T." is a pretty good behavior test here.)
So we want line-buffering for first 50 lines?
I've timed it with line-buffered stdout. It's slightly slower than fully-
buffered io, but perhaps we don't care it if we use the pager.
% time env LANG=C HGRCPATH=/dev/null ./hg log --config extensions.pager= \
--config pager.pager=cat --pager=always --config ui.formatted=True \
-r0:10000 > /dev/null
fully-buffered: 1.88s user 0.06s system 100% cpu 1.940 total
line-buffered: 2.09s user 0.10s system 101% cpu 2.150 total
diff --git a/hgext/pager.py b/hgext/pager.py
--- a/hgext/pager.py
+++ b/hgext/pager.py
@@ -66,6 +66,14 @@ from mercurial.i18n import _
testedwith = 'internal'
def _pagersubprocess(ui, p):
+ # XXX hack to initialize buffering mode of stdout to the default of tty
+ # (i.e. IOLBF) before dup2()
+ sys.stdout.write('\r')
+ sys.stdout.flush()
+ # or call setvbuf() explicitly (needs this if stdout is /dev/null)
+ #import _chgutil
+ #_chgutil.setfilebufmode(sys.stdout, _chgutil.IOLBF)
+
pager = subprocess.Popen(p, shell=True, bufsize=-1,
close_fds=util.closefds, stdin=subprocess.PIPE,
stdout=sys.stdout, stderr=sys.stderr)
More information about the Mercurial-devel
mailing list