[Updated] D8958: hg-core: Add a limited read only `revlog` implementation

Alphare (Raphaël Gomès) phabricator at mercurial-scm.org
Fri Aug 28 10:11:10 UTC 2020


This revision now requires changes to proceed.
Alphare added a comment.
Alphare requested changes to this revision.


  This is my partial review, as indicated by my last comment, I'll be resuming it later today.

INLINE COMMENTS

> Cargo.toml:27
> +memmap = "0.7.0"
> +flate2 = { version = "1.0.16", features = ["zlib"], default-features = false }
> +zstd = "0.5.3"

You could add a comment saying that we don't use the `miniz-oxide` backend because its minimum Rust version is `1.36`. However, this PR (https://github.com/Frommi/miniz_oxide/pull/84/files) introduces a flag `no_extern_crate_alloc` to bring the requirement back down to `1.34`, so maybe leave a comment and we'll benchmark this later? (There also is the `zlib` implem from Cloudflare IIRC)

> index.rs:10
> +    bytes: &'a [u8],
> +    /// Offsets of index block start.
> +    /// Only needed when the index is interleaved with data.

nit: "Offsets of starts of index blocks"

> index.rs:46
> +
> +    /// Return then index entry corresponding to the given revision if it
> +    /// exists.

nit: s/then/the/

> index.rs:105
> +    /// For interleaved index and data, the offset stored in the index
> +    /// correspond to the separated data offset.
> +    /// It has to be overrided with the actual offset in the interleaved index

nit: corresponds

> index.rs:106
> +    /// correspond to the separated data offset.
> +    /// It has to be overrided with the actual offset in the interleaved index
> +    /// which is just afert the index block.

nit: "overridden", "after"

> index.rs:111
> +    /// entry is mixed with the index headers.
> +    /// It has to be overrided with 0.
> +    offset_override: Option<usize>,

nit: overridden

> index.rs:112
> +    /// It has to be overrided with 0.
> +    offset_override: Option<usize>,
> +}

Is it right to call it `offset_override` if it's always overridden? Not that I care *that* much, but `offset` would be shorter.

> index.rs:116
> +impl<'a> IndexEntry<'a> {
> +    /// Return the offset of the data if not overrided by offset_override.
> +    pub fn offset(&self) -> usize {

nit: overridden

> index.rs:121
> +        } else {
> +            let bytes = [&[0; 2][..], &self.bytes[0..=5]].concat();
> +            BigEndian::read_u64(&bytes[..]) as usize

This needlessly allocates a `Vec<u8>`, this version doesn't:

  let mut arr = [0;8];
  arr.copy_from_slice(&self.bytes[0..=5]);

You can compare the assembly output here: https://godbolt.org/z/rdbb4P. (I've also tried a very dumb version with manual indexing and no ByteOrder, but index checking doesn't kick in and is less readable anyway.

> patch.rs:6
> +/// A chunk is:
> +/// - an insertion when !data.is_empty() && start == end
> +/// - an deletion when data.is_empty() && start < end

Please use backticks (`) with inline code

> patch.rs:26
> +    ///
> +    /// TODO This method should not exist.
> +    ///      It its PatchList::combine responsibility.

If it's the responsibility of an already-present method, why not delete it now?

> patch.rs:54
> +
> +/// The delta between two revision data.
> +#[derive(Debug, Clone)]

nit: revisions'

> patch.rs:69
> +    pub fn new(data: &'a [u8]) -> Self {
> +        let mut frags = vec![];
> +        let mut data = data;

We could potentially have a heuristic of approximately how many entries there are given a the length of `data`, but that's probably for later.

> patch.rs:89
> +    fn size(&self, initial_size: i32) -> i32 {
> +        let mut res = initial_size;
> +        for m in self.frags.iter() {

This should probably be `self.frags.fold(initial_size, |acc, m| acc + m.len_diff())`, since iterators usually allow for better optimization (also it's less code, but the readability is debatable, I like it personally).

> patch.rs:114
> +    /// as the changes introduced by one patch can be overridden by the next.
> +    /// Combining patches optimizes the whole patching sequences.
> +    fn combine(&mut self, other: &mut Self) -> Self {

nit: s/sequences/sequence/

> patch.rs:148
> +
> +                // TODO explain the math //
> +                let (data_left, data_right) = first.data.split_at(

Should this have been done?

> patch.rs:171
> +
> +            // start_offset will be used to adjust the start of the current
> +            // chunk of `other`.

nit: missing backticks

> patch.rs:240
> +
> +#[cfg(test)]
> +mod tests {

I'm stopping my review here, will resume later, this was a long read so far.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8958/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8958

To: acezar, #hg-reviewers, Alphare
Cc: Alphare, mercurial-patches
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mercurial-scm.org/pipermail/mercurial-patches/attachments/20200828/96e9ea17/attachment-0002.html>


More information about the Mercurial-patches mailing list