[Updated] D10366: rust: Refactor parse_dirstate() to take a callback

SimonSapin phabricator at mercurial-scm.org
Mon Apr 19 14:54:20 UTC 2021


SimonSapin added a comment.
SimonSapin abandoned this revision.


  In https://phab.mercurial-scm.org/D10362#159004 I was worried about potential perf impact, measured, and found none.
  
  However when measuring the entire stack instead of just the first patch I found some regressions (7~12% more wall clock time for `hg status` or `hg status -mard`) and bisected them to this patch. This patch changes `parse_dirstate` from returning a pair of `Vec` that are then turned into `HashMap`s with `Iterator::collect`, to taking a callback that is called for each entry to directly insert into `HashMap`s. I expected that skipping the intermediate `Vec`s would be a slight improvement (or negligible), but this turns out slower because this starts with small maps and reallocate several times as they grow. Reallocating a `HashMap` is more expensive than reallocating a `Vec` because it involves re-hashing. Profiling one case shows ~10% time spend in `RawTable::reserve_rehash` in the hashbrown crate. `Iterator::collect` however can rely on `Iterator::size_hint` (which is accurate when starting from a `Vec`) and make a correctly-sized `HashMap` from the start, without re-hashing.
  
  Therefore I’m rescinding most of this patch, in order to keep the intermediate `Vec` for the current `DirstateMap` and avoid the perf regression. The experimental tree-based `DirstateMap` still uses a callback, but that is small patch so I’ve folded it into D10367 <https://phab.mercurial-scm.org/D10367>.
  
  It might be possible to skip the intermediate `Vec` without adding re-hashing if we can pre-allocate a correctly-sized HashMap. Finding the accurate size would require parsing, but maybe an estimate would be good enough. The equivalent Python code does this: https://www.mercurial-scm.org/repo/hg-committed/file/5.7.1/mercurial/dirstate.py#l1660. However this unrelated to the initial goal of this patch and can be pursued separately.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D10366/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D10366

To: SimonSapin, #hg-reviewers
Cc: mercurial-patches
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mercurial-scm.org/pipermail/mercurial-patches/attachments/20210419/c743a2ae/attachment-0002.html>


More information about the Mercurial-patches mailing list