Updated serialization library (diagnostics part 1)

Now that I have the qdd100 servo in beta phase, the IMU working at full rate, and the quad A1 moving around, I’m getting closer to actually working on improving the gaits that the machine can execute.  To date, the gaits I have used completely ignore the IMU and only use the feedback from the joints in order to maintain force in 3D.  With tuning and on controlled surfaces this can work well, but outside that happy regime the machine can undergo significant pitch and roll movements during the leg swing phase, which at best results in a janky walk, and at worst results in oscillation or outright instability.

There are also a number of as-yet-unidentified problems that seemingly cause the feet to not track the ground position properly, resulting in the feet slipping on the floor despite being nearly fully loaded.

Tackling all these new domains requires some improvements to my diagnostics infrastructure and tools.  I’ll cover the improvements I’ve made in a few posts, since the work that has gone into them has covered a fair amount of ground.  I’ll start with something that I mostly completed back in the summer of 2019 and that has the least direct impact, but that gives at least a background for some of the other upcoming changes.

Telemetry format

Since its inception in 2014, Super Mega Microbot has used a self-describing serialization and telemetry format that was loosely based on work I had previously done professionally at Bluefin Robotics and then Jaybridge Robotics.  This format was then the basis for later work at Jaybridge and Toyota Research Institute.  The basic idea breaks down like this:

  • The schema which describes the data and the data are separate entities
  • The schema is recorded alongside the data whenever it is written to persistent storage
  • The schema contains sufficient information to reconstruct a CSV- or JSON-like representation of the data with no additional metadata
  • Structure tools can map a given on-disk schema to a possibly different in-memory one using a schema evolution algorithm
  • The data is serialized and stored in a manner which is very efficient to write at high rates from realtime processes
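As a rough illustration of the ideas above, here is a minimal Python sketch.  It is not the actual format used by this library: the schema layout (a JSON list of field names and `struct` format codes), the helper names, and the name-matching `remap` function are all assumptions made up for the example.  It shows the schema being written once next to the data, the data holding only raw values, and a JSON-like view being rebuilt from the two.

```python
import io
import json
import struct

# Hypothetical schema layout: an ordered list of (field name, struct code).
# This is NOT the real wire format of the library discussed in this post.
SCHEMA = [("timestamp_s", "d"), ("pitch_deg", "f"), ("roll_deg", "f")]


def record_format(schema):
    """Build a little-endian, unpadded struct format string from a schema."""
    return "<" + "".join(code for _, code in schema)


def write_log(stream, schema, records):
    """Write the schema once as a length-prefixed header, then raw records."""
    header = json.dumps(schema).encode("utf-8")
    stream.write(struct.pack("<I", len(header)))
    stream.write(header)
    fmt = record_format(schema)
    for record in records:
        # Each record carries only field values, no names or type tags.
        stream.write(struct.pack(fmt, *(record[name] for name, _ in schema)))


def read_log(stream):
    """Recover the schema from the stream, then decode records to dicts."""
    (header_len,) = struct.unpack("<I", stream.read(4))
    schema = [tuple(item) for item in json.loads(stream.read(header_len))]
    fmt = record_format(schema)
    size = struct.calcsize(fmt)
    names = [name for name, _ in schema]
    rows = []
    while True:
        chunk = stream.read(size)
        if len(chunk) < size:
            break
        rows.append(dict(zip(names, struct.unpack(fmt, chunk))))
    return schema, rows


def remap(row, in_memory_schema, defaults):
    """Naive stand-in for schema evolution: match by name, default the rest."""
    return {name: row.get(name, defaults[name]) for name, _ in in_memory_schema}


buffer = io.BytesIO()
write_log(buffer, SCHEMA, [{"timestamp_s": 1.0, "pitch_deg": 0.5, "roll_deg": -0.25}])
buffer.seek(0)
schema, rows = read_log(buffer)
print(json.dumps(rows))
```

The reader needs nothing but the file itself to produce the JSON-like output, and `remap` hints at how an older on-disk schema could be mapped onto a newer in-memory one (fields matched by name, missing ones filled from defaults, unknown ones dropped).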

Compared to other serialization mechanisms, this has different trade-offs.

  • Formats like JSON and XML either completely include the schema in each data instance, or include a large amount of self-describing information in each data instance that is not strictly necessary to represent it
  • Formats like protobuf, capnproto, flatbuffers, and SBE have a different tradeoff.  They are geared towards performance, but largely also assume a single canonical source of schema data that is shared through an independent side channel and has a single linear revision history.  This makes sense for server RPC, where client and server are each distributed (possibly different) versions of the schema and want to communicate without having to exchange it.  They also include more metadata in the data stream than is strictly required, and many of them are more expensive to serialize or deserialize.
  • The closest to this work is Apache Avro.  It uses the same principle of separate schema and data, and expects the schema to be stored alongside the data.  It also requires no code generation, which many of the above tools do require.
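To make the per-instance overhead of a format like JSON concrete, the following Python sketch encodes one small sample both ways.  The field names and the 16-byte layout are assumptions chosen for illustration, not taken from the actual telemetry format.

```python
import json
import struct

# Hypothetical telemetry sample; the field names are illustrative only.
sample = {"timestamp_s": 12.5, "pitch_deg": 1.5, "roll_deg": -0.75}

# JSON repeats every field name (plus punctuation) in every instance.
json_bytes = json.dumps(sample).encode("utf-8")

# With the schema stored once elsewhere, an instance is just its values:
# one float64 and two float32 fields, little-endian, no padding.
packed_bytes = struct.pack(
    "<dff", sample["timestamp_s"], sample["pitch_deg"], sample["roll_deg"]
)

print(len(json_bytes), len(packed_bytes))
```

The packed form is 16 bytes, while the JSON form is noticeably larger because the field names travel with every record.  At high logging rates that difference is paid on every write.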

The unique pieces in this work compared to Avro are that:

  • The data format is such that many common in-memory structures can simply be bit copied as serialized data with no further effort.  Those that do require some manipulation still require no additional in-memory structures associated with serialization.  This combines a property of protobuf, that the serialization objects can be used as mutable state, with a property of capnproto, zero cost serialization.
  • No recursion or pointers are supported, which renders the necessary code very simple.  The entirety of the C++ serialization and deserialization library is only a few hundred lines of code and took less than a week overall to write, unit test, and debug over the 6 years I’ve been using it.  It also functions perfectly fine in microcontroller-based embedded environments like the moteus controller.
  • The on-disk format is designed for rapid random seek access in time, assuming that small-ish records are written regularly.
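The bit-copy and seek properties can be sketched in Python with `ctypes`.  This is only an illustration under stated assumptions: the `ImuSample` record, its fields, and the `seek` helper are hypothetical, and the sketch assumes fixed-size records stored in increasing time order, which may not match the real on-disk layout.

```python
import ctypes


class ImuSample(ctypes.Structure):
    """Hypothetical fixed-layout record; field names are illustrative only."""

    _fields_ = [
        ("timestamp_us", ctypes.c_uint64),
        ("pitch_deg", ctypes.c_float),
        ("roll_deg", ctypes.c_float),
    ]


RECORD_SIZE = ctypes.sizeof(ImuSample)


def serialize(sample):
    # The serialized form is a byte-for-byte copy of the in-memory structure.
    return bytes(sample)


def deserialize(raw):
    # Copy the raw bytes straight back into a structure, with no parsing step.
    return ImuSample.from_buffer_copy(raw)


def seek(log, timestamp_us):
    """Binary search for the first record at or after timestamp_us.

    Assumes `log` holds back-to-back fixed-size records sorted by time.
    """
    lo, hi = 0, len(log) // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        record = ImuSample.from_buffer_copy(log, mid * RECORD_SIZE)
        if record.timestamp_us < timestamp_us:
            lo = mid + 1
        else:
            hi = mid
    return lo


log = b"".join(serialize(ImuSample(t, 0.0, 0.0)) for t in (100, 200, 300, 400))
print(seek(log, 250))
```

Because the structure has no pointers or nested allocations, `serialize` is nothing more than a memory copy, and regular small records make it cheap to jump to an arbitrary point in time without scanning the whole log.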

The downsides are that it isn’t widely supported, isn’t optimized to handle single structures which have very large serialized representations, and the only language bindings aside from C++ are read-only ones for Python and TypeScript.

In future articles, I’ll describe a bit of the detail of the recently revised design, then go into the tools that use it.