D12413: stringutil: try to avoid running `splitlines()` only to get first line

martinvonz (Martin von Zweigbergk) phabricator at mercurial-scm.org
Fri Mar 25 16:55:11 UTC 2022


martinvonz created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  It's wasteful to call `splitlines()` and only get the first line from
  it. However, Python doesn't seem to provide a built-in way of doing
  just one split based on the set of bytes used by `splitlines()`. As a
  workaround, we do an initial split on just LF and then call
  `splitlines()` on the result. Thanks to Joerg for this suggestion. I
  didn't bother to also split on CR, so text that uses bare CR line
  endings (as written by classic Mac OS editors) will not get this
  performance improvement. CRLF text still benefits, since the initial
  search finds the LF.
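
  As a rough, standalone illustration of the workaround described above
  (not the exact Mercurial code), the following sketch truncates at the
  first LF before calling `splitlines()`. The `return b''` fallback for
  empty input is an assumption, since the hunk below ends at the
  `except` line.

```python
def firstline(text: bytes) -> bytes:
    """Return the first line of the input (illustrative sketch)."""
    # Cut the input at the first LF so that splitlines() only has to
    # scan the first LF-delimited chunk instead of the whole string.
    i = text.find(b'\n')
    if i != -1:
        text = text[:i]
    try:
        # splitlines() still handles a CR (or the CR of a CRLF pair)
        # that may remain inside the truncated chunk.
        return text.splitlines()[0]
    except IndexError:
        # Assumed fallback: empty input has no lines at all.
        return b''
```

  For input with bare CR line endings and no LF at all, `find()` returns
  -1 and `splitlines()` still runs over the whole input, so the result
  is the same but nothing is saved.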

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D12413

AFFECTED FILES
  mercurial/utils/stringutil.py

CHANGE DETAILS

diff --git a/mercurial/utils/stringutil.py b/mercurial/utils/stringutil.py
--- a/mercurial/utils/stringutil.py
+++ b/mercurial/utils/stringutil.py
@@ -687,6 +687,10 @@
 
 def firstline(text):
     """Return the first line of the input"""
+    # Try to avoid running splitlines() on the whole string
+    i = text.find(b'\n')
+    if i != -1:
+        text = text[:i]
     try:
         return text.splitlines()[0]
     except IndexError:



To: martinvonz, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

