[PATCH 06 of 21 V2] speedy: custom wire protocol
Tomasz Kleczek
tkleczek at fb.com
Fri Dec 14 02:52:18 UTC 2012
# HG changeset patch
# User Tomasz Kleczek <tkleczek at fb.com>
# Date 1355383371 28800
# Node ID f18b4628ae086a3b4474d0c2a4fe5b3bb9151edc
# Parent 859126a36e9173e5f6370ddb7b8bc14fc82ef682
speedy: custom wire protocol
History queries take many kinds of Python objects as parameters. We have
to (de)serialize them when transporting them through the network layer.
Here are some requirements that a candidate protocol should satisfy:
* it shouldn't use pickle/marshal modules internally as they are not
secure against maliciously constructed data
* it should handle binary data without a significant overhead as
most of the data sent over the network will be lists of node ids
* compatibility with Python 2.4 is a plus
Considering this, a custom protocol for object serialization is introduced.
Currently supported value types:
* int
* string
* list of supported elements
* dict with supported key/values
* tuple of supported elements (serialized as list)
but can be easily extended to support more types.
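As an illustration of the resulting byte layout, here is a minimal Python 3
sketch of the encoding implemented by this patch: a one-byte type tag
followed by a 4-byte big-endian unsigned integer that holds either the value
(for ints) or the length/element count. The `encode` helper is hypothetical
and not part of the patch; it returns bytes instead of writing through a
callback, and it uses `bytes` where the patch (Python 2) uses `str`.

```python
import struct

def encode(val):
    """Return the wire encoding of val as bytes (illustrative only)."""
    if isinstance(val, int):
        # 'i' tag, then the value as a 4-byte big-endian unsigned int
        return b'i' + struct.pack('>L', val)
    elif isinstance(val, bytes):
        # 's' tag, then the length, then the raw bytes
        return b's' + struct.pack('>L', len(val)) + val
    elif isinstance(val, (tuple, list)):
        # 'l' tag, then the element count, then each encoded element
        return (b'l' + struct.pack('>L', len(val)) +
                b''.join(encode(e) for e in val))
    elif isinstance(val, dict):
        # 'd' tag, then the pair count, then key/value encodings in turn
        out = [b'd', struct.pack('>L', len(val))]
        for k, e in val.items():
            out.append(encode(k))
            out.append(encode(e))
        return b''.join(out)
    raise TypeError("unsupported value type: %s" % val.__class__.__name__)

print(repr(encode([1, b'ab'])))
# prints b'l\x00\x00\x00\x02i\x00\x00\x00\x01s\x00\x00\x00\x02ab'
```

Because the integer format is '>L', this layout only covers ints in the range
0..2**32-1; `struct.pack` raises `struct.error` for anything else.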
The protocol is stream-oriented which means that a (de)serialization of a
single object may result in many read/writes to the underlying transport
layer.
To achieve good performance the transport layer should provide some
buffering mechanism.
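One possible shape for such buffering on the write side is sketched below.
This is an assumption about how a transport could batch the many small writes
issued during serialization, not code from this series; the `bufferedwriter`
name and the `limit` parameter are hypothetical.

```python
class bufferedwriter(object):
    """Collect small writes and pass them on in larger chunks."""

    def __init__(self, write, limit=4096):
        self._rawwrite = write   # callback of the underlying transport
        self._limit = limit      # flush once this many bytes are pending
        self._chunks = []
        self._size = 0

    def write(self, data):
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._limit:
            self.flush()

    def flush(self):
        # Hand all pending data to the transport in a single call.
        if self._chunks:
            self._rawwrite(b''.join(self._chunks))
            self._chunks = []
            self._size = 0
```

The `write` method of such an object could be passed as the write callback;
the caller then has to call `flush()` once a whole request is serialized.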
To change the query into the request we perform a simple transformation:
query_name(*args) -> [query_name, args]
and serialize the resulting list with the wire protocol. At the server
end the reverse operation is performed.
This change only introduces the wireprotocol class; it will be integrated
into the existing code in subsequent patches.
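A rough sketch of that transformation and its reverse follows. The helper
names (`torequest`, `fromrequest`), the `handlers` table and the `nodecount`
query are all hypothetical and only illustrate the idea; the actual
integration is left to later patches.

```python
def torequest(name, *args):
    # query_name(*args) -> [query_name, args]
    return [name, list(args)]

def fromrequest(request, handlers):
    # Reverse operation at the server end: look up the query by name
    # and call it with the transported arguments.
    name, args = request
    return handlers[name](*args)

# Hypothetical query that takes a list of node ids.
handlers = {'nodecount': lambda nodes: len(nodes)}

request = torequest('nodecount', [b'n1', b'n2'])
result = fromrequest(request, handlers)
```

Here `request` is the list that would be handed to the serializer, and
`result` is what the server-side handler returns.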
diff --git a/hgext/speedy/protocol.py b/hgext/speedy/protocol.py
new file mode 100644
--- /dev/null
+++ b/hgext/speedy/protocol.py
@@ -0,0 +1,99 @@
+# Copyright 2012 Facebook
+#
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+
+"""Custom wire protocol."""
+
+import struct
+
+class wireprotocol(object):
+    """Defines a mechanism to map in-memory data structures to a wire-format.
+
+    Raw data is read from/written to the underlying transport using callbacks
+    provided on initialization of the wireprotocol instance.
+
+    Currently supported value types:
+    * int
+    * string
+    * list of supported elements
+    * dict with supported key/values
+    * tuple of supported elements (serialized as list)
+
+    The protocol is stream-oriented which means that a (de)serialization of a
+    single object may result in many read/writes to the underlying transport.
+
+    To achieve good performance the transport layer should provide some
+    buffering mechanism.
+
+    No versioning or message framing is provided.
+    """
+    def __init__(self, read, write):
+        self._read = read
+        self._write = write
+
+    def _writeint(self, v):
+        self._write(struct.pack('>L', v))
+
+    def _readint(self):
+        return int(struct.unpack('>L', self._read(4))[0])
+
+    def serialize(self, val):
+        """Serialize a given value.
+
+        Writes data in a series of calls to the `self._write` callback.
+
+        Raises `TypeError` if the type of `val` is not supported.
+        NOTE: Some data might already have been written to the transport
+        when the exception is raised.
+        """
+        if isinstance(val, int):
+            self._write('i')
+            self._writeint(val)
+        elif isinstance(val, str):
+            self._write('s')
+            self._writeint(len(val))
+            self._write(val)
+        elif isinstance(val, (tuple, list)):
+            # tuples are serialized as lists
+            self._write('l')
+            self._writeint(len(val))
+            serialize = self.serialize
+            for e in val:
+                serialize(e)
+        elif isinstance(val, dict):
+            self._write('d')
+            self._writeint(len(val))
+            serialize = self.serialize
+            for k, e in val.iteritems():
+                serialize(k)
+                serialize(e)
+        else:
+            raise TypeError("wireprotocol serialization: unsupported"
+                            " value type: %s" % val.__class__.__name__)
+
+    def deserialize(self):
+        """Deserialize a single value.
+
+        Reads data in a series of calls to the `self._read` callback.
+
+        Raises `TypeError` if an unknown type description is encountered.
+        """
+        type = self._read(1)
+        if type == 'i':
+            return self._readint()
+        elif type == 's':
+            size = self._readint()
+            return self._read(size)
+        elif type == 'l':
+            size = self._readint()
+            deserialize = self.deserialize
+            return [ deserialize() for x in xrange(0, size) ]
+        elif type == 'd':
+            size = self._readint()
+            deserialize = self.deserialize
+            return dict([ (deserialize(), deserialize()) for x in xrange(0,
+                    size)])
+        else:
+            raise TypeError("wireprotocol deserialization: unknown"
+                            " value type descriptor: %r" % type)