[Updated] D8958: hg-core: Add a limited read only `revlog` implementation
Alphare (Raphaël Gomès)
phabricator at mercurial-scm.org
Fri Aug 28 10:11:10 UTC 2020
This revision now requires changes to proceed.
Alphare added a comment.
Alphare requested changes to this revision.
This is my partial review, as indicated by my last comment, I'll be resuming it later today.
INLINE COMMENTS
> Cargo.toml:27
> +memmap = "0.7.0"
> +flate2 = { version = "1.0.16", features = ["zlib"], default-features = false }
> +zstd = "0.5.3"
You could add a comment saying that we don't use the `miniz-oxide` backend because its minimum Rust version is `1.36`. However, this PR (https://github.com/Frommi/miniz_oxide/pull/84/files) introduces a flag `no_extern_crate_alloc` to bring the requirement back down to `1.34`, so maybe leave a comment and we'll benchmark this later? (There also is the `zlib` implem from Cloudflare IIRC)
> index.rs:10
> + bytes: &'a [u8],
> + /// Offsets of index block start.
> + /// Only needed when the index is interleaved with data.
nit: "Offsets of starts of index blocks"
> index.rs:46
> +
> + /// Return then index entry corresponding to the given revision if it
> + /// exists.
nit: s/then/the/
> index.rs:105
> + /// For interleaved index and data, the offset stored in the index
> + /// correspond to the separated data offset.
> + /// It has to be overrided with the actual offset in the interleaved index
nit: corresponds
> index.rs:106
> + /// correspond to the separated data offset.
> + /// It has to be overrided with the actual offset in the interleaved index
> + /// which is just afert the index block.
nit: "overridden", "after"
> index.rs:111
> + /// entry is mixed with the index headers.
> + /// It has to be overrided with 0.
> + offset_override: Option<usize>,
nit: overridden
> index.rs:112
> + /// It has to be overrided with 0.
> + offset_override: Option<usize>,
> +}
Is it right to call it `offset_override` if it's always overridden? Not that I care *that* much, but `offset` would be shorter.
> index.rs:116
> +impl<'a> IndexEntry<'a> {
> + /// Return the offset of the data if not overrided by offset_override.
> + pub fn offset(&self) -> usize {
nit: overridden
> index.rs:121
> + } else {
> + let bytes = [&[0; 2][..], &self.bytes[0..=5]].concat();
> + BigEndian::read_u64(&bytes[..]) as usize
This needlessly allocates a `Vec<u8>`, this version doesn't:
let mut arr = [0;8];
arr.copy_from_slice(&self.bytes[0..=5]);
You can compare the assembly output here: https://godbolt.org/z/rdbb4P. (I've also tried a very dumb version with manual indexing and no ByteOrder, but index checking doesn't kick in and is less readable anyway.
> patch.rs:6
> +/// A chunk is:
> +/// - an insertion when !data.is_empty() && start == end
> +/// - an deletion when data.is_empty() && start < end
Please use backticks (`) with inline code
> patch.rs:26
> + ///
> + /// TODO This method should not exist.
> + /// It its PatchList::combine responsibility.
If it's the responsibility of an already-present method, why not delete it now?
> patch.rs:54
> +
> +/// The delta between two revision data.
> +#[derive(Debug, Clone)]
nit: revisions'
> patch.rs:69
> + pub fn new(data: &'a [u8]) -> Self {
> + let mut frags = vec![];
> + let mut data = data;
We could potentially have a heuristic of approximately how many entries there are given a the length of `data`, but that's probably for later.
> patch.rs:89
> + fn size(&self, initial_size: i32) -> i32 {
> + let mut res = initial_size;
> + for m in self.frags.iter() {
This should probably be `self.frags.fold(initial_size, |acc, m| acc + m.len_diff())`, since iterators usually allow for better optimization (also it's less code, but the readability is debatable, I like it personally).
> patch.rs:114
> + /// as the changes introduced by one patch can be overridden by the next.
> + /// Combining patches optimizes the whole patching sequences.
> + fn combine(&mut self, other: &mut Self) -> Self {
nit: s/sequences/sequence/
> patch.rs:148
> +
> + // TODO explain the math //
> + let (data_left, data_right) = first.data.split_at(
Should this have been done?
> patch.rs:171
> +
> + // start_offset will be used to adjust the start of the current
> + // chunk of `other`.
nit: missing backticks
> patch.rs:240
> +
> +#[cfg(test)]
> +mod tests {
I'm stopping my review here, will resume later, this was a long read so far.
REPOSITORY
rHG Mercurial
CHANGES SINCE LAST ACTION
https://phab.mercurial-scm.org/D8958/new/
REVISION DETAIL
https://phab.mercurial-scm.org/D8958
To: acezar, #hg-reviewers, Alphare
Cc: Alphare, mercurial-patches
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mercurial-scm.org/pipermail/mercurial-patches/attachments/20200828/96e9ea17/attachment-0002.html>
More information about the Mercurial-patches
mailing list