D4412: internals: document CBOR utilization
indygreg (Gregory Szorc)
phabricator at mercurial-scm.org
Mon Sep 3 13:38:15 UTC 2018
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG2fe21c65777e: internals: document CBOR utilization (authored by indygreg, committed by ).
CHANGED PRIOR TO COMMIT
https://phab.mercurial-scm.org/D4412?vs=10624&id=10726#toc
REPOSITORY
rHG Mercurial
CHANGES SINCE LAST UPDATE
https://phab.mercurial-scm.org/D4412?vs=10624&id=10726
REVISION DETAIL
https://phab.mercurial-scm.org/D4412
AFFECTED FILES
contrib/wix/help.wxs
mercurial/help.py
mercurial/help/internals/cbor.txt
tests/test-help.t
CHANGE DETAILS
diff --git a/tests/test-help.t b/tests/test-help.t
--- a/tests/test-help.t
+++ b/tests/test-help.t
@@ -1010,6 +1010,7 @@
bundle2 Bundle2
bundles Bundles
+ cbor CBOR
censor Censor
changegroups Changegroups
config Config Registrar
@@ -3294,6 +3295,13 @@
Bundles
</td></tr>
<tr><td>
+ <a href="/help/internals.cbor">
+ cbor
+ </a>
+ </td><td>
+ CBOR
+ </td></tr>
+ <tr><td>
<a href="/help/internals.censor">
censor
</a>
diff --git a/mercurial/help/internals/cbor.txt b/mercurial/help/internals/cbor.txt
new file mode 100644
--- /dev/null
+++ b/mercurial/help/internals/cbor.txt
@@ -0,0 +1,130 @@
+Mercurial uses Concise Binary Object Representation (CBOR)
+(RFC 7049) for various data formats.
+
+This document describes the subset of CBOR that Mercurial uses and
+gives recommendations for appropriate use of CBOR within Mercurial.
+
+Type Limitations
+================
+
+Major types 0 and 1 (unsigned integers and negative integers) MUST be
+fully supported.
+
+Major type 2 (byte strings) MUST be fully supported. However, there
+are limitations around the use of indefinite-length byte strings.
+(See below.)
+
+Major type 3 (text strings) is NOT supported.
+
+Major type 4 (arrays) MUST be supported. However, values are limited
+to the set of types described in the "Container Types" section below.
+And indefinite-length arrays are NOT supported.
+
+Major type 5 (maps) MUST be supported. However, key values are limited
+to the set of types described in the "Container Types" section below.
+And indefinite-length maps are NOT supported.
+
+Major type 6 (semantic tagging of major types) can be used with the
+following semantic tag values:
+
+258
+ Mathematical finite set. Suitable for representing Python's
+ ``set`` type.
+
+All other semantic tag values are not allowed.
+
+Major type 7 (simple data types) can be used with the following
+type values:
+
+20
+ False
+21
+ True
+22
+ Null
+31
+ Break stop code (for indefinite-length items).
+
+All other simple data type values (including every value requiring the
+1 byte extension) are disallowed.
+
+Indefinite-Length Byte Strings
+==============================
+
+Indefinite-length byte strings (major type 2) are allowed. However,
+they MUST NOT occur inside a container type (such as an array or map).
+That is, they can only occur as the "top-most" element in a stream of
+values.
+
+Encoders and decoders SHOULD *stream* indefinite-length byte strings.
+That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
+byte string value when indefinite-length byte strings are being used
+if it can be avoided. Mercurial MAY use extremely long indefinite-length
+byte strings, and buffering the source or destination value could lead to
+memory exhaustion.
+
+Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
+bytes.
+
+Container Types
+===============
+
+Mercurial may use the array (major type 4), map (major type 5), and
+set (semantic tag 258 plus major type 4 array) container types.
+
+An array may contain any supported type as values.
+
+A map MUST only use the following types as keys:
+
+* unsigned integers (major type 0)
+* negative integers (major type 1)
+* byte strings (major type 2) (but not indefinite-length byte strings)
+* false (simple type 20)
+* true (simple type 21)
+* null (simple type 22)
+
+A map MUST only use the following types as values:
+
+* all types supported as map keys
+* arrays
+* maps
+* sets
+
+A set may only use the following types as values:
+
+* all types supported as map keys
+
+It is recommended that keys in maps and values in sets and arrays all
+be of a uniform type.
+
+Avoiding Large Byte Strings
+===========================
+
+The use of large byte strings is discouraged, especially in scenarios where
+the total size of the byte string may be unbounded for some inputs (e.g. when
+representing the content of a tracked file). It is highly recommended to use
+indefinite-length byte strings for these purposes.
+
+Since indefinite-length byte strings cannot be nested within an outer
+container (such as an array or map), to associate a large byte string
+with another data structure, it is recommended to use an array or
+map followed immediately by an indefinite-length byte string. For example,
+instead of the following map::
+
+ {
+ "key1": "value1",
+ "key2": "value2",
+ "long_value": "some very large value...",
+ }
+
+Use a map followed by a byte string::
+
+ {
+ "key1": "value1",
+ "key2": "value2",
+ "value_follows": True,
+ }
+ <BEGIN INDEFINITE-LENGTH BYTE STRING>
+ "some very large value"
+ "..."
+ <END INDEFINITE-LENGTH BYTE STRING>
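
The "map followed by an indefinite-length byte string" layout that closes cbor.txt can be sketched in Python as follows. This is an illustration only, not Mercurial's encoder: the function names and the `MAX_CHUNK` constant are hypothetical, the wire bytes follow RFC 7049, and the map keys are encoded as byte strings because the document disallows text strings.

```python
import struct

MAX_CHUNK = 2 ** 20  # assumed constant: the chunk ceiling recommended in cbor.txt


def encode_head(major, value):
    """Encode a CBOR initial byte plus any extension bytes for `value`."""
    if value < 24:
        return struct.pack(">B", major << 5 | value)
    if value < 2 ** 8:
        return struct.pack(">BB", major << 5 | 24, value)
    if value < 2 ** 16:
        return struct.pack(">BH", major << 5 | 25, value)
    if value < 2 ** 32:
        return struct.pack(">BL", major << 5 | 26, value)
    return struct.pack(">BQ", major << 5 | 27, value)


def encode_bytes(data):
    """Definite-length byte string (major type 2)."""
    return encode_head(2, len(data)) + data


def encode_header_map(items):
    """Definite-length map whose keys are bytes and values are bytes or True."""
    out = [encode_head(5, len(items))]
    for key, value in items:
        out.append(encode_bytes(key))
        # 0xf5 is major type 7, simple value 21 (True).
        out.append(b"\xf5" if value is True else encode_bytes(value))
    return b"".join(out)


def stream_indefinite_bytes(chunks):
    """Yield an indefinite-length byte string piece by piece, never buffering it."""
    yield b"\x5f"  # start of indefinite-length byte string
    for chunk in chunks:
        # Split oversized input so no emitted chunk exceeds MAX_CHUNK bytes.
        for offset in range(0, len(chunk), MAX_CHUNK):
            yield encode_bytes(chunk[offset:offset + MAX_CHUNK])
    yield b"\xff"  # break stop code (major type 7, value 31)


header = encode_header_map([
    (b"key1", b"value1"),
    (b"key2", b"value2"),
    (b"value_follows", True),
])
payload = stream_indefinite_bytes([b"some very large value", b"..."])
```

A writer would emit `header` first and then each piece yielded by `payload`, so the large value sits at the top level of the stream rather than inside the map.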
diff --git a/mercurial/help.py b/mercurial/help.py
--- a/mercurial/help.py
+++ b/mercurial/help.py
@@ -205,6 +205,8 @@
loaddoc('bundle2', subdir='internals')),
(['bundles'], _('Bundles'),
loaddoc('bundles', subdir='internals')),
+ (['cbor'], _('CBOR'),
+ loaddoc('cbor', subdir='internals')),
(['censor'], _('Censor'),
loaddoc('censor', subdir='internals')),
(['changegroups'], _('Changegroups'),
diff --git a/contrib/wix/help.wxs b/contrib/wix/help.wxs
--- a/contrib/wix/help.wxs
+++ b/contrib/wix/help.wxs
@@ -43,6 +43,7 @@
<Component Id="help.internals" Guid="$(var.help.internals.guid)" Win64='$(var.IsX64)'>
<File Id="internals.bundle2.txt" Name="bundle2.txt" />
<File Id="internals.bundles.txt" Name="bundles.txt" KeyPath="yes" />
+ <File Id="internals.cbor.txt" Name="cbor.txt" />
<File Id="internals.censor.txt" Name="censor.txt" />
<File Id="internals.changegroups.txt" Name="changegroups.txt" />
<File Id="internals.config.txt" Name="config.txt" />
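
The "Type Limitations" section of cbor.txt in the diff above can be read as a set of checks on each item head. The sketch below is one hypothetical way to express those checks and is not taken from Mercurial's decoder; `check_head`, `ALLOWED_TAGS`, and `ALLOWED_SIMPLE` are names invented here. It inspects only a single head and does not enforce the container nesting rules.

```python
ALLOWED_TAGS = {258}                # mathematical finite set
ALLOWED_SIMPLE = {20, 21, 22, 31}   # false, true, null, break


def check_head(data):
    """Parse one CBOR item head and reject anything outside the subset.

    Returns (major type, argument, head length in bytes). The argument is
    None for an indefinite-length byte string.
    """
    if not data:
        raise ValueError("empty input")
    major, info = data[0] >> 5, data[0] & 0x1F

    if major == 7:
        # Only four additional-information values are allowed; this also
        # rejects the 1 byte extension (24) and floats (25 through 27).
        if info not in ALLOWED_SIMPLE:
            raise ValueError("simple value %d is not allowed" % info)
        return major, info, 1

    if info < 24:
        value, size = info, 1
    elif info <= 27:
        width = 1 << (info - 24)  # 1, 2, 4, or 8 extension bytes
        size = 1 + width
        if len(data) < size:
            raise ValueError("truncated head")
        value = int.from_bytes(data[1:size], "big")
    elif info == 31:
        value, size = None, 1     # indefinite length
    else:
        raise ValueError("reserved additional information %d" % info)

    if major == 3:
        raise ValueError("text strings are not supported")
    if major in (4, 5) and value is None:
        raise ValueError("indefinite-length arrays and maps are not supported")
    if major in (0, 1) and value is None:
        raise ValueError("malformed integer head")
    if major == 6 and value not in ALLOWED_TAGS:
        raise ValueError("semantic tag %r is not allowed" % value)
    return major, value, size
```

For example, `check_head(b"\xd9\x01\x02")` accepts the tag 258 head, while a text string head such as `b"\x63abc"` raises `ValueError`.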
To: indygreg, #hg-reviewers
Cc: mercurial-devel