[PATCH 3 of 3] streamclone: use backgroundfilecloser (issue4889)
Gregory Szorc
gregory.szorc at gmail.com
Thu Jan 14 21:45:30 UTC 2016
# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1452807841 28800
# Thu Jan 14 13:44:01 2016 -0800
# Node ID 3cc9fde87e3284c2653a25254b22f03339dbd075
# Parent d73af0bac3890fd4b7b43d888ff61718a166cbdd
streamclone: use backgroundfilecloser (issue4889)

Closing files that have been appended to is slow on Windows/NTFS.
CloseHandle() calls on this platform often take 1-10ms, and that's
on my i7-6700K Skylake processor with a modern and fast SSD. Contrast
with other I/O operations, such as writing data, which take <100us.

This means that creating/appending thousands of files can add
significant overhead. For example, cloning mozilla-central creates
~232,000 revlog files. Assuming 1ms per CloseHandle(), that yields
232s (3:52) of wall time waiting for file closes!
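
The close-versus-write cost described above can be checked with a small
timing harness. The following is only a rough sketch, not part of the
patch: the function names are hypothetical, it uses Python 3 rather than
the Python 2 that Mercurial targeted at the time, and the numbers it
reports depend entirely on the platform and filesystem.

```python
import os
import tempfile
import time


def time_write_and_close(count=1000, payload=b"x" * 64):
    """Return (total_write_seconds, total_close_seconds) over `count` files.

    Each file is opened in append mode, written and flushed, then closed,
    with the write+flush and the close timed separately.
    """
    write_total = 0.0
    close_total = 0.0
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(count):
            fh = open(os.path.join(tmp, str(i)), "ab")
            t0 = time.perf_counter()
            fh.write(payload)
            fh.flush()
            t1 = time.perf_counter()
            fh.close()
            t2 = time.perf_counter()
            write_total += t1 - t0
            close_total += t2 - t1
    return write_total, close_total


def projected_close_overhead(files, seconds_per_close):
    """Wall time spent purely in close calls if each one costs the same."""
    return files * seconds_per_close
```

Calling projected_close_overhead(232000, 0.001) reproduces the 232s
estimate, and time_write_and_close() gives per-platform totals that can
be compared between Windows and Linux.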

The impact of this overhead can be measured most directly when applying
stream clone bundles. Applying these files is effectively uncompressing
a tar archive (read: it's very fast).

Using a RAM disk (read: no I/O wait), the difference in wall time for a
`hg debugapplystreamclonebundle` for a ~1731 MB mozilla-central bundle
between Windows and Linux on the same machine is drastic:

  Linux:   ~12.8s (128MB/s)
  Windows: ~352.0s (4.7MB/s)

Windows is ~27.5x slower. Yikes!

After this patch:

  Linux:   ~12.8s (128MB/s)
  Windows: ~102.1s (16.1MB/s)

Windows is now ~3.4x faster than before. Unfortunately, it is still ~8x
slower than Linux. Profiling reveals a few hot code paths that could
likely be improved. But those are for other patches.

This patch introduces test-clone-uncompressed.t because existing tests
of `clone --uncompressed` are scattered about and adding a variation for
background thread closing to e.g. test-http.t doesn't feel correct.
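
For readers unfamiliar with the technique, the idea of closing files on
background threads can be sketched roughly as follows. This is an
illustration only and is not Mercurial's backgroundfilecloser: the class
name, the queue size, and the write_files helper are assumptions made for
the example, and it is written for Python 3. The default of 4 worker
threads mirrors the "starting 4 threads" line in the test output below.

```python
import os
import queue
import threading


class BackgroundCloser:
    """Close file objects on worker threads (hypothetical sketch)."""

    def __init__(self, threads=4, maxqueue=384):
        # A bounded queue keeps the number of open handles in check.
        self._queue = queue.Queue(maxsize=maxqueue)
        self._errors = []
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(threads)
        ]
        for t in self._workers:
            t.start()

    def _run(self):
        while True:
            fh = self._queue.get()
            if fh is None:  # sentinel: this worker should exit
                return
            try:
                fh.close()
            except Exception as exc:  # remember the error, report it later
                self._errors.append(exc)

    def close(self, fh):
        """Hand `fh` to a worker; blocks while the queue is full."""
        self._queue.put(fh)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # One sentinel per worker; the queue is FIFO, so every file queued
        # earlier is closed before the workers exit.
        for _ in self._workers:
            self._queue.put(None)
        for t in self._workers:
            t.join()
        if self._errors and exc_type is None:
            raise self._errors[0]
        return False


def write_files(directory, count):
    """Write `count` small files, deferring each close to the workers."""
    with BackgroundCloser() as closer:
        for i in range(count):
            fh = open(os.path.join(directory, str(i)), "wb")
            fh.write(str(i).encode("ascii"))
            closer.close(fh)
```

The main thread only pays for the queue insertion, while the slow close
calls overlap with subsequent writes. Leaving the `with` block waits for
every queued file to be closed, so callers can rely on the data being
fully written once the block exits.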
diff --git a/mercurial/streamclone.py b/mercurial/streamclone.py
--- a/mercurial/streamclone.py
+++ b/mercurial/streamclone.py
@@ -305,9 +305,9 @@ def consumev1(repo, fp, filecount, bytec
         start = time.time()
         tr = repo.transaction(_('clone'))
         try:
-            if True:
+            with repo.svfs.backgroundclosing(repo.ui, expectedcount=filecount):
                 for i in xrange(filecount):
                     # XXX doesn't support '\n' or '\r' in filenames
                     l = fp.readline()
                     try:
@@ -319,9 +319,10 @@ def consumev1(repo, fp, filecount, bytec
                     if repo.ui.debugflag:
                         repo.ui.debug('adding %s (%s)\n' %
                                       (name, util.bytecount(size)))
                     # for backwards compat, name was partially encoded
-                    with repo.svfs(store.decodedir(name), 'w') as ofp:
+                    path = store.decodedir(name)
+                    with repo.svfs(path, 'w', backgroundclose=True) as ofp:
                         for chunk in util.filechunkiter(fp, limit=size):
                             handled_bytes += len(chunk)
                             repo.ui.progress(_('clone'), handled_bytes,
                                              total=bytecount)
diff --git a/tests/test-clone-uncompressed.t b/tests/test-clone-uncompressed.t
new file mode 100644
--- /dev/null
+++ b/tests/test-clone-uncompressed.t
@@ -0,0 +1,48 @@
+#require serve
+
+  $ hg init server
+  $ cd server
+  $ touch foo
+  $ hg -q commit -A -m initial
+  >>> for i in range(1024):
+  ...     with open(str(i), 'wb') as fh:
+  ...         fh.write(str(i))
+  $ hg -q commit -A -m 'add a lot of files'
+  $ hg serve -p $HGPORT -d --pid-file=hg.pid
+  $ cat hg.pid >> $DAEMON_PIDS
+  $ cd ..
+
+Basic clone
+
+  $ hg clone --uncompressed -U http://localhost:$HGPORT clone1
+  streaming all changes
+  1027 files to transfer, 96.3 KB of data
+  transferred 96.3 KB in * seconds (*/sec) (glob)
+  searching for changes
+  no changes found
+
+Clone with background file closing enabled
+
+  $ hg --debug --config worker.backgroundclose=true --config worker.backgroundcloseminfilecount=1 clone --uncompressed -U http://localhost:$HGPORT clone-background | grep -v adding
+  using http://localhost:$HGPORT/
+  sending capabilities command
+  sending branchmap command
+  streaming all changes
+  sending stream_out command
+  1027 files to transfer, 96.3 KB of data
+  starting 4 threads for background file closing
+  transferred 96.3 KB in * seconds (*/sec) (glob)
+  query 1; heads
+  sending batch command
+  searching for changes
+  all remote heads known locally
+  no changes found
+  sending getbundle command
+  bundle2-input-bundle: with-transaction
+  bundle2-input-part: "listkeys" (params: 1 mandatory) supported
+  bundle2-input-part: "listkeys" (params: 1 mandatory) supported
+  bundle2-input-bundle: 1 parts total
+  checking for updated bookmarks
+  preparing listkeys for "phases"
+  sending listkeys command
+  received listkey for "phases": 58 bytes