
Video and telemetry synchronization (diagnostics part 8)

This is part of a continuing series on updated diagnostic tools for the mjbots quad A1 robot.  Previous editions are in 1, 2, 3, 4, 5, 6, and 7.  Here I’ll be looking at one of the last pieces of the puzzle, synchronizing the video with the rest of the telemetry.

As mentioned previously, recording video of a robot running is an easy, cheap, and fast way to provide ground truth information on all of the sensors and actuators.  However, it is only truly useful if it can be accurately synchronized in time to the other telemetry streams for the robot.

Options

This was a part of the puzzle I spent a long time thinking about before getting started, as there were several options that seemed like they could plausibly work:

Visual

The concept here would be to put an LED beacon on the robot that is visible from all angles.  It could strobe a synchronizing pattern, like the output of an LFSR, which could then be identified in the subsequent video frames.

Pros: This should be able to give frame-accurate synchronization, and works even for my 1000 fps camera, which can’t record audio.

Cons: It is hard to find a good place to mount a light which could be observed from all angles.  The top is the best bet, but I have plans to attach further things there, which would then render synchronization infeasible.
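
Although I didn’t pursue this option, the strobe pattern itself is straightforward.  A maximal-length LFSR never repeats within its period, so any sufficiently long run of video frames identifies a unique point in the sequence.  A minimal sketch of the idea (not code that ever ran on the robot):

#include <cstdint>
#include <cstdio>

// 16 bit maximal-length Galois LFSR (taps at 16, 15, 13, 4).  Its
// period is 65535 states, so any 16 consecutive output bits uniquely
// identify a position in the sequence, i.e. a point in time.
uint16_t LfsrNext(uint16_t state) {
  const uint16_t lsb = state & 1u;
  state >>= 1;
  if (lsb) { state ^= 0xB400u; }
  return state;
}

int main() {
  uint16_t state = 0xACE1u;  // any nonzero seed works
  for (int frame = 0; frame < 32; frame++) {
    std::printf("frame %2d: LED %s\n", frame, (state & 1u) ? "on" : "off");
    state = LfsrNext(state);
  }
  return 0;
}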

Audio

In this concept, I put a microphone on the robot and have it record audio of the environment during its run.  Then standard audio synchronization algorithms can be used to align the two streams.  I actually included a microphone on the most recent version of the pi3 hat to potentially use this approach.

Pros: This has no visibility requirements, and should be able to give synchronization accuracy well under a single frame of video.

Cons: Getting the microphone data off the pi3 hat looked to be moderately annoying, as the STM32 it is connected to is already streaming IMU and RF data back to the robot over its single SPI bus.  When I brought up the board, I verified I could get 1kHz audio off of it, but that isn’t a high enough sample rate to be useful.

IMU

This was the idea I had last, and what I am using now.  Here, I slap the side of the robot in a semi-random pattern during the video.  That results in an audio signature in the video, as well as lateral accelerometer readings.

Pros: No additional hardware or software is required anywhere on the robot.

Cons: This has worse accuracy than pure audio, as the IMU is only sampled at 400Hz, and its signal doesn’t perfectly correspond to the audio captured in the video.

Implementation

I took a stab at the IMU version, since it looked to be the easiest and still gave decent performance.  I made a simple python tool which reads in the robot telemetry data and the audio stream of a video file, then lets the user select rough ranges of each to work from.

It then uses scipy.signal.correlate to find the time offset that best aligns the two data streams, producing a plot of the alignment.
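
Under the hood, this is just discrete cross-correlation.  The tool itself is python, but the core computation it hands off to scipy.signal.correlate conceptually looks like this naive sketch:

#include <cstddef>
#include <vector>

// Naive cross-correlation: find the lag of 'needle' within 'haystack'
// that maximizes their dot product.  Both signals are assumed to have
// been resampled to a common rate and mean-subtracted beforehand.
// scipy.signal.correlate computes the same thing, just much faster
// via the FFT.
std::ptrdiff_t BestLag(const std::vector<double>& haystack,
                       const std::vector<double>& needle) {
  std::ptrdiff_t best_lag = 0;
  double best_score = -1e300;
  for (std::size_t lag = 0; lag + needle.size() <= haystack.size(); lag++) {
    double score = 0.0;
    for (std::size_t i = 0; i < needle.size(); i++) {
      score += haystack[lag + i] * needle[i];
    }
    if (score > best_score) {
      best_score = score;
      best_lag = static_cast<std::ptrdiff_t>(lag);
    }
  }
  return best_lag;
}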

[Plot: the aligned IMU and video audio signals (20200515-video_aligner)]

As you can see, the audio rings out for some time after the IMU stops its high frequency response, largely due to the mechanical damping of the robot.  However, it is enough for the correlation to work with and give frame-accurate results.

Log file format (diagnostics part 4)

In parts 1, 2, and 3 I covered some motivation for the updated mjlib diagnostics system and the serialization of individual structures.  In this post, I’ll cover how those structures are written into a file from an embedded system like a robot and how diagnostic tools can access them efficiently.

Goals

The top level goals are:

  • Efficient to write live from an embedded system: The quad A1 currently generates log data at 400Hz, consisting of hundreds to thousands of telemetry data points in every update.  It does this on a relatively low-end Raspberry Pi 3B+.  The format should be able to support writing data at high rates without a significant CPU burden.
  • Efficient seeking by time and record: Readers of the file should be able to efficiently seek by time in the stream, as well as extract all of a single record without having to process unnecessary data from the log.
  • Self contained: While this property of the log comes from the underlying mjlib serialization format, it is worth re-iterating here.  All information necessary to return a JSON- or CSV-like structure for each instance should be present within the log.

Design

The detailed design of the log format is documented in README.md; here I will give a brief summary.

The log consists of a header followed by a series of “Blocks” concatenated together.  The two primary block types are one that contains the schema for an individual record and one that contains the data.  For a given record, the schema will only be present once in the log, typically near the beginning.  A data block contains a single serialized instance of the record, along with some optional flags and data.  The optional flags include a timestamp, a checksum, whether the data is compressed, and a pointer back to the most recent previous data block for this record.
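
To make that concrete, a data block carries roughly the following information.  This is only an illustrative sketch with invented field names; the README is the actual bit-level specification:

#include <cstdint>
#include <vector>

// Illustrative only; the field names, widths, and ordering here are
// invented, not the real format.
struct DataBlockSketch {
  uint64_t record_id;        // which record's schema describes the payload
  uint16_t flags;            // which optional fields below are present
  uint64_t timestamp;        // optional: when this instance was recorded
  uint64_t previous_offset;  // optional: file offset of this record's
                             //   previous data block
  std::vector<uint8_t> payload;  // one serialized instance, possibly
                                 //   compressed per the flags
  uint32_t checksum;         // optional: validates the block
};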

Another block type is the SeekMarker block, which contains a timestamp, a 64 bit unique-ish byte code, and a checksum.  When readers need to perform random seeks in the log, they can binary search to an arbitrary byte offset, then scan to find an instance of this unique code.  If one is found in conjunction with the necessary header and a validated checksum, it can be assumed that the framing has been recovered, along with the time for that point in the log.
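
A sketch of that recovery procedure as a reader might implement it (the marker value and function here are invented for illustration; the real value and block layout are in the README):

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical marker value; the format defines the real unique-ish
// 64 bit code.
constexpr uint64_t kSeekMarker = 0x123456789abcdef0ull;

// After binary searching to an arbitrary byte offset, scan forward
// for the marker.  The caller then validates the surrounding block
// header and checksum before trusting the recovered framing and
// timestamp.
std::size_t FindNextSeekMarker(const std::string& log, std::size_t offset) {
  for (std::size_t i = offset; i + sizeof(kSeekMarker) <= log.size(); i++) {
    uint64_t candidate;
    std::memcpy(&candidate, log.data() + i, sizeof(candidate));
    if (candidate == kSeekMarker) { return i; }
  }
  return std::string::npos;
}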

Finally, there is an Index block, written at the very end of the log.  This includes pointers to the schema entries for all records in the file, as well as to the most recent data block for each record.  That allows readers to find the set of records in a log, and to extract a single record (albeit backwards) while reading no extra data.

Future extensions

Most of the entities in the log have flag bitmasks to control additional future features or extensions.  Current readers throw errors when unknown bits are discovered, which makes it safe to almost arbitrarily modify the log structure at the expense of forward compatibility.

The most likely extensions are related to compression.  The current per-data compression format is snappy, from google.  It is fast, but has a relatively poor compression ratio.  At some point, I’d like to switch to Zstandard, which has even better runtime performance, much better compression performance, and supports incremental dictionary manipulation.  I have actually integrated it experimentally into the C++ writer and reader, and the effort was trivial; however, the other languages that I support, python and TypeScript, are more challenging.  With snappy, there are operating system provided packages that work just fine in Debian and Ubuntu, but not so for Zstandard.  Bazel has rules that support pulling in pip packages for python and npm packages for TypeScript, but neither of those mechanisms has very straightforward support for the recursive WORKSPACE workarounds I am using now.  For now, it is easiest just to stick with snappy.
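
For reference, the snappy usage itself is about as simple as an API can get.  A minimal sketch of what the writer and reader do with each data block when the compression flag is set:

#include <stdexcept>
#include <string>

#include <snappy.h>

// Compress one serialized data block before writing it out.
std::string CompressBlock(const std::string& raw) {
  std::string compressed;
  snappy::Compress(raw.data(), raw.size(), &compressed);
  return compressed;
}

// Reverse the compression when reading a block back.
std::string DecompressBlock(const std::string& compressed) {
  std::string raw;
  if (!snappy::Uncompress(compressed.data(), compressed.size(), &raw)) {
    throw std::runtime_error("corrupt snappy block");
  }
  return raw;
}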

Next

Now that we have the data structures out of the way, I’ll move on to the tools that use them!

C++ serialization API (diagnostics part 3)

In the previous issue in this series, I described the schema and data elements of the mjlib serialization format.  Here, I’ll describe the API used to convert between C++ structures and the corresponding schema and data serializations.

First, I’ll start by saying this API is far from perfect.  It hits a certain tradeoff in the design space that may not be appropriate for every system.  I have developed and used similar APIs professionally both at Jaybridge and TRI, so it has seen use in millions of lines of code, but not billions by any stretch.  It is also mostly orthogonal to the rest of the design, and alternate serialization APIs could be built while still maintaining the performance and schema evolution properties described in parts 1 and 2.  Now with that out of the way, the library API:

Structure annotation

Structures are annotated for serialization in one of two ways, either intrusively or externally.  Intrusive serialization is the easiest if the structures are under your control, while external serialization can be used for structures from libraries or other systems.

The intrusive interface requires defining a templated visitor method, in the same vein as boost serialization.  This is a single method template, which accepts an unknown “archive” and calls the “Visit” method on the archive for all children of the structure.  It looks like:

#include <cstdint>
#include <string>
#include <vector>

struct MyStruct {
  int32_t field1 = 0;
  std::string field2;
  std::vector<double> field3;

  template <typename Archive>
  void Serialize(Archive* a) {
    a->Visit(MJ_NVP(field1));
    a->Visit(MJ_NVP(field2));
    a->Visit(MJ_NVP(field3));
  }
};

There is a helper macro named MJ_NVP which is just used to capture the textual name of the field as well as its address without duplication.  It can be equivalently written as:

  a->Visit(mjlib::base::MakeNameValuePair("field1", &field1));

with more verbosity.

Serialization and Deserialization

Once a structure has been annotated, then binary schema and data blobs can be generated through various writing classes:

namespace tl = mjlib::telemetry;

// Generate a binary schema
std::string binary_schema = 
  tl::BinarySchemaArchive::Write<MyStruct>();

// Generate binary data for one instance
MyStruct my_struct;
std::string binary_data = 
  tl::BinaryWriteArchive::Write(my_struct);

When reading data, there is one class which parses the schema, and another which allows reading of the data back into a C++ structure while accounting for schema evolution rules.

tl::BinarySchemaParser parsed_schema{binary_schema};
tl::MappedBinaryReader reader{&parsed_schema};
MyStruct reconstituted_my_struct = reader.Read(binary_data);

These quick examples used the std::string value interface, but there exist interfaces for reading into existing structures as well as operating on streams of data instead of std::string.
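
As a hypothetical illustration of what the schema evolution rules buy you (MyStructV2 is my own example, not anything from mjlib): suppose a newer version of the program drops field2 and adds field4.  Reading old binary_data through the same parsed-schema/reader pattern shown above would simply skip the serialized field2 and leave field4 at its default.

#include <cstdint>
#include <vector>

struct MyStructV2 {
  int32_t field1 = 0;
  std::vector<double> field3;
  double field4 = 1.0;  // new field, left at its default when reading
                        // data written by the original MyStruct

  template <typename Archive>
  void Serialize(Archive* a) {
    a->Visit(MJ_NVP(field1));
    a->Visit(MJ_NVP(field3));
    a->Visit(MJ_NVP(field4));
  }
};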

Comparison to other systems

While some systems, notably boost serialization, use this templated visitor pattern, many other C++ serialization schemes use a separate code generation step.  That includes most of the modern ones like protobuf, flatbuffers, capnproto, etc.  Here, C++ was chosen instead to minimize build complexity and to permit the natural use of existing C++ structures.  For instance, mjlib defines an external visitor for Eigen matrices (Eigen being a C++ linear algebra library).  That allows one to write:

struct MyStruct {
  Eigen::Vector3d point;
  Eigen::Matrix4f matrix;

  template <typename Archive>
  // ...
};

And have it “just work”.

The API is also sufficiently general to implement memcpy optimization for structures that are suitable candidates.
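
Roughly, that means something like the following becomes possible (my sketch of the idea, not the mjlib implementation):

#include <cstring>
#include <string>
#include <type_traits>

// If a visited structure is trivially copyable and its serialized
// layout matches its in-memory layout (all fixed size little endian
// fields, no padding), serialization can degenerate to a single
// memcpy.
template <typename T>
std::string FastWrite(const T& value) {
  static_assert(std::is_trivially_copyable_v<T>,
                "memcpy fast path requires a trivially copyable type");
  std::string result(sizeof(T), '\0');
  std::memcpy(&result[0], &value, sizeof(T));
  return result;
}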

Structures annotated with the templated visitor pattern can also be used to implement many other types of transformations, such as JSON serialization and deserialization or command line parsing.

Next

Next in this series I’ll talk about the file format used to record the binary schema and data elements over time from an embedded system.

Revised mjlib serialization design (diagnostics part 2)

As discussed previously, I recently significantly revised the serialization format used by the mjbots quad A1 based on experience in previous professional domains, and from studying newer external projects like Apache AVRO.  Here I’ll describe the design of the serialized representation, which is more completely defined at: mjlib/telemetry/README.md

Refresher and definitions

As a brief refresher, this serialization format is intended to be used primarily to record telemetry from embedded systems, where that telemetry data may be persisted on disk for a long time.  Secondarily, it can be used to inspect the results of a live system.  The primitive it operates on is a “record”, which is logically a structure of elements which is emitted at some intervals over time.  For any given record, it logically breaks it up into a “schema” and a “data” portion.  The schema describes what types of elements are present in the structure, their names and relationships.  The “data” portion contains the minimum amount of information necessary to communicate one instance of the structure, assuming that the receiver already has a copy of the schema.

Schemas

A schema consists of a single “type”.  There are a number of “primitive” types which map directly, or nearly directly, to machine storage.  For instance, an abbreviated subset:

  • boolean can be true or false
  • float64 is a 64 bit floating point value
  • fixeduint is an unsigned integer of size 1, 2, 4, or 8 bytes
  • varuint is an unsigned integer of dynamic encoding length (see the sketch after this list)
  • string is a sequence of UTF-8 characters
  • bytes is a sequence of arbitrary bytes
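
The varuint deserves a quick sketch, since it shows up wherever sizes and counts do.  The following assumes an LEB128-style encoding with 7 data bits per byte and the high bit as a continuation flag; the README is the normative reference:

#include <cstdint>
#include <string>

// LEB128-style varuint sketch: small values (the common case for
// sizes and counts) take a single byte, while full 64 bit values
// remain representable.
std::string EncodeVaruint(uint64_t value) {
  std::string out;
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) { byte |= 0x80; }  // more bytes follow
    out.push_back(static_cast<char>(byte));
  } while (value != 0);
  return out;
}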

After that, there are “complex” types, which consist of:

  • object is a list of fields, each with its own type
  • enum is an unsigned integer, along with a mapping from those integers to strings
  • array is a variable length array of some other type
  • fixedarray is a fixed length array of some other type
  • map is a mapping from strings to another type
  • union is an index discriminated union between multiple types

Data

For the primitive types, the data is a direct mapping of the value.  For the “complex” types, the associated data is as follows:

  • object: the data consists of the data from each field in order
  • enum: the data consists of a single unsigned integer
  • array: the data consists of a size, followed by that many instances of the type’s data
  • fixedarray: the data consists of the type’s data repeated the number of times given in the schema
  • map: the data just consists of the keys and values from the map
  • union: the data contains a single unsigned integer index, followed by the selected type’s data

Encoding

For both the schema and the data there are two encodings defined: a JSON* one and a binary one.  The JSON data encoding is what would traditionally be exchanged in Javascript applications.  It is not completely minimal, since field names and object and list delimiters are present.  For example, a simple object type consisting of a boolean, a string, and a list of fixedint might have a JSON data representation like:

{
  "field1" : true,
  "field2" : "my string data",
  "field3" : [4, 5, 6],
}

The JSON schema encoding contains the entirety of the information from the schema.  For the above record it might look like:

{
  "type" : "object",
  "name" : "MyObject",
  "aliases" : ["AnOldName"],
  "fields" : [
    { "name" : "field1", "type" : "boolean" },
    { "name" : "field2", "type" : "string" },
    { "name" : "field3", "type" : "array", "items" : "fixedint32" }
  ],
}

A binary encoding for both the schema and the data is defined as well.  The schema encoding is straightforward, if uninteresting, and can be found in the README.  For the primitive types which have direct machine analogs, the data encoding is the little endian machine representation.  The object data binary representation is merely the concatenation of all the fields’ data.  This makes it possible to construct record definitions that exactly match a useful set of in-memory structures, making serialization of those structures a no-op.
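
For instance (my sketch, assuming a boolean encodes as a single 0/1 byte), a record declared with only fixed-size primitives can be laid out so that the wire format and the in-memory struct coincide on a little endian machine:

#include <cstdint>

// With only fixed-size little endian primitives and no padding, one
// serialized instance is byte-for-byte identical to the struct in
// memory, so writing it is a single bit copy.
struct __attribute__((packed)) ImuSample {
  uint8_t valid;    // boolean: one 0/1 byte (assumed)
  uint32_t count;   // fixeduint of size 4: little endian
  double accel_x;   // float64: IEEE 754 little endian
  double accel_y;
  double accel_z;
};
static_assert(sizeof(ImuSample) == 29, "packed: wire size == struct size");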

Next steps

In the next issue of this series, I’ll describe the C++ API for serializing and deserializing objects.

*Actually JSON5, which supports comments and final trailing commas among other improvements for human readability.

Updated serialization library (diagnostics part 1)

Now that I have the qdd100 servo in beta phase, the IMU working at full rate, and the quad A1 moving around, I’m getting closer to actually working to improve the gaits that the machine can execute.  To date, the gaits I have used completely ignore the IMU and only use the feedback from the joints in order to maintain force in 3D.  With tuning, and on controlled surfaces, this can work well, but if you go outside the happy regime, the machine can undergo significant pitch and roll movements during the leg swing phase, which at best results in a janky walk, and at worst results in oscillation or outright instability.

There are also a number of as-yet-unidentified problems that seemingly cause the feet to not track the ground position properly, resulting in the feet slipping on the floor despite being nearly fully loaded.

Tackling all these new domains requires some improvements to my diagnostics infrastructure and tools.  I’ll cover the improvements I’ve made in a few posts, since the work that has gone into them has covered a fair amount of ground.  I’ll start with something that I mostly completed back in the summer of 2019; it has the least direct impact, but gives at least some background for the other upcoming changes.

Telemetry format

Super Mega Microbot has, since its inception in 2014, used a self-describing serialization and telemetry format that was loosely based on work I had previously done professionally at Bluefin Robotics and then Jaybridge Robotics.  This format was then the basis for later work at Jaybridge and Toyota Research Institute.  The basic idea breaks down like this:

  • The schema which describes the data and the data are separate entities
  • The schema is recorded alongside the data whenever it is written to persistent storage
  • The schema contains sufficient information to reconstruct a CSV or JSON like representation of the data with no additional meta-data
  • Tools can map a given on-disk schema to a possibly different in-memory one using a schema evolution algorithm
  • The data is serialized and stored in a manner which is very efficient to write at high rates from realtime processes

Compared to other serialization mechanisms, this has different trade-offs.

  • Formats like JSON and XML either completely include the schema in each data instance, or include a large amount of self-describing information in each data instance that is not strictly necessary to represent it
  • Formats like protobuf, capnproto, flatbuffers, and SBE have a different tradeoff.  They are geared towards performance, but largely also assume a single canonical source of schema data that is shared through an independent side channel and has a single linear revision history.  This makes sense for server RPC, where the client and server each have a (possibly different) distributed version of the schema and want to communicate without having to exchange it.  They also include more metadata in the data stream than is strictly required, and many of them are more expensive to serialize or deserialize.
  • The closest to this work is Apache AVRO.  It uses the same principle of separate schema and data, and expects the schema to be stored alongside the data.  It also requires no code generation, which many of the above tools do require.

The unique pieces in this work over AVRO are that:

  • The data format is such that many common in-memory structures can simply be bit copied as serialized data with no further effort.  Those that do require some manipulation still require no additional in-memory structures associated with serialization.  This combines the properties of protobuf, in that the serialization objects can be used as mutable state, with those of capnproto, which allows zero cost serialization.
  • No recursion or pointers are supported, which renders the necessary code very simple.  The entirety of the C++ serialization and deserialization library is only a few hundred lines of code and took less than a week overall to write, unit test, and debug over the 6 years I’ve been using it.  It also functions perfectly fine in microcontroller-based embedded environments like the moteus controller.
  • The on-disk format is designed for rapid random seek access in time, assuming that small-ish records are written regularly.

The downsides are that it isn’t widely supported, isn’t optimized to handle single structures which have very large serialized representations, and the only language bindings aside from C++ are read only ones for python and TypeScript.

In future articles, I’ll describe a bit of the detail of the recently revised design, then go into the tools that use it.


Multiple axes in implot

I used Dear Imgui for the simple Mech Warfare control application I built earlier and was relatively impressed with the conciseness with which one could develop effective (although not necessarily the prettiest), interactive, and responsive user interfaces in C++.  For some time I had been planning on developing a new diagnostic application for the mjbots quad that would allow plotting like the original tplot.py, but would also integrate recorded video, 3D rendering, and diagnostics.  I had assumed I would use HTML/JS because it is the cool new thing, but I never got up the energy to make it happen, because every technical step along the way had big hurdles.  I figured I would give Dear Imgui a try, but the big thing it was missing was plotting support.

In the original tplot.py, I used matplotlib for the plotting integration.  It is a high quality python library that can make interactive plots in nearly every imaginable form, as well as production quality static plots.  It integrates with a number of GUI toolkits; in tplot I used it along with PySide.  The downside is that, given that it supports nearly anything under the sun, the code itself is relatively complex and hard to tweak.  In order to make tplot.py support multiple axes, I had to do some careful source inspection to figure out which undocumented things could be poked.

Dear ImGui itself has a bare-bones plotting system, but it doesn’t have anywhere near the feature set I would need.  The next system I seriously considered was implot.  It is very new, as in its repository is only a few weeks old, but it already supported most of what I needed for a diagnostic tool.  The biggest thing it didn’t have was support for multiple Y axes.

So I took a stab at adding them!

One weekend later, I was largely successful:

[Screenshot: implot with multiple Y axes (20200510-multi-y-axis-2)]

Only a day after that, Evan had fixed up a few remaining problems and got it merged into master: https://github.com/epezent/implot/commit/5eb4b713849