Improving the moteus update rate, part 1

The moteus brushless controller I’ve developed for the force controlled quadruped uses an RS485 based command-response communication protocol.  To complete a full control cycle, the controlling computer needs to send new commands to each servo and read the current state back from each of them.  While I designed the system to be capable of high rate all-system updates, my initial implementation took a lot of shortcuts.  As a result, in all my testing so far the outgoing update rate has been 100Hz, but state read back from the servos has been closer to 10Hz.  Here I’ll cover my work to make that rate both symmetric and higher.

In this first post, I’ll cover the existing design and how that drives the update rate limitations.

Individual contributors

There are many pieces that chain together to determine the overall cycle time.  Here is my best estimate of each.

RS485 bitrate

The RS485 protocol that I’m using right now runs at 3,000,000 baud half duplex.  At 10 bits on the wire per byte, that means it can push about 300k bytes per second in one direction or the other.  While the STM32 in the moteus has UARTs capable of going faster than that, control computers that can manage much faster than 3Mbit are rare, so without switching to another transport like Ethernet, this is about as good as it will get.

This means that, at a minimum, every transmission has a latency proportional to the amount of data sent, which is roughly bytes * 10 / 3000000 seconds.
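As a minimal sketch, the wire-time formula above can be written as a small helper. The 10 bits per byte comes from the formula in the text; reading that as 8N1 framing (one start bit, eight data bits, one stop bit) is my assumption, and the function name is made up for illustration.

```python
BAUD = 3_000_000     # bits per second on the RS485 bus (from the text)
BITS_PER_BYTE = 10   # assumed 8N1: 1 start + 8 data + 1 stop bit


def wire_time_us(num_bytes: int) -> float:
    """Time needed to clock num_bytes onto the bus, in microseconds."""
    return num_bytes * BITS_PER_BYTE * 1_000_000 / BAUD


if __name__ == "__main__":
    for n in (1, 33, 100):
        print(f"{n:4d} bytes -> {wire_time_us(n):7.1f} us")
```

A 33 byte packet, for example, occupies the bus for 110us before any turnaround latency is counted.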

Servo turnaround

The RS485 protocol moteus uses allows for unidirectional or bidirectional commands.  In past experiments, all the control commands were sent in a group as unidirectional commands, then the state was queried in a series of separate command-reply sequences.  The firmware of the moteus servo currently takes around 140us from when a command finishes transmission to when the corresponding reply starts.  The ideal turnaround for a bare servo is then (txbytes * 10 / 3000000) + 140us + (rxbytes * 10 / 3000000).

IMU junction board

The current quadruped has a network topology that looks like:

[Figure: IMU junction board block diagram]

The junction board is an STM32F4 processor that performs active bridging across the RS485 networks and also contains an IMU.  This topology was chosen so that the junction board could query both halves of the quadruped simultaneously, then send a single result back to the host computer.  However, that has not been implemented yet, so all the junction board currently does is increase the latency of a single command.  As implemented now, it adds about 90us of latency, plus the time required to transmit the command and reply packets a second time.  That makes the latency for a single command and reply now: 2 * (txbytes * 10 / 3000000) + 140us + 90us + 2 * (rxbytes * 10 / 3000000).
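To see what the bridging costs, here is a rough sketch comparing the bare-servo formula with the bridged one. The 140us servo figure and 90us junction figure are the ones quoted in the text. The 10 byte query size is purely hypothetical, since the post does not give one, and the 33 byte reply assumes the 7 byte framing plus 26 byte payload described later in this post.

```python
BAUD = 3_000_000
BITS_PER_BYTE = 10


def wire_us(num_bytes):
    """Bus time for num_bytes, in microseconds."""
    return num_bytes * BITS_PER_BYTE * 1_000_000 / BAUD


def direct_us(tx_bytes, rx_bytes, servo_us):
    """Host talks straight to the servo: one hop each way."""
    return wire_us(tx_bytes) + servo_us + wire_us(rx_bytes)


def bridged_us(tx_bytes, rx_bytes, servo_us, junction_us):
    """Junction board retransmits both the query and the reply."""
    return (2 * wire_us(tx_bytes) + servo_us + junction_us
            + 2 * wire_us(rx_bytes))


QUERY_BYTES = 10  # hypothetical query size, not from the post
REPLY_BYTES = 33  # 7 framing + 26 payload bytes

direct = direct_us(QUERY_BYTES, REPLY_BYTES, servo_us=140)
bridged = bridged_us(QUERY_BYTES, REPLY_BYTES, servo_us=140, junction_us=90)

if __name__ == "__main__":
    print(f"direct:  {direct:.0f} us")
    print(f"bridged: {bridged:.0f} us")
    print(f"added:   {bridged - direct:.0f} us")
```

Under these assumptions the junction board costs its 90us of processing plus one extra copy of each packet on the wire.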

Raspberry Pi command transmission

As mentioned, the current system first sends new commands to the servos, then queries their state.  When sending the new commands, the existing implementation makes a separate system call to initiate each servo’s output packet.  Sometimes the Linux kernel groups those together into a single outgoing frame on the wire, but more often than not the commands end up separated by 120us of whitespace.  That adds 12 * 120us = 1440us of additional latency to an overall update frame.

Raspberry Pi reply to query turnaround

During the phase when all 12 of the servos are being queried, the Raspberry Pi needs to receive each response, then formulate and send the next query.  This currently takes around 200us from when the reply finishes transmission until the next query hits the wire.  This is some combination of hardware latency, kernel driver latency, and application latency.  It sums to 200us * 12 = 2400us.

Packet framing

The RS485 protocol used for moteus has some header and framing bytes that are overhead on every single command or response.  These currently are:

  • Leadin Framing: 2 bytes
  • Source ID: 1 byte
  • Destination ID: 1 byte
  • Payload Size: 1 byte for small things
  • Checksum: 2 bytes

That works out to a 7 byte overhead, which in the current formulation applies 12 times for the command phase and 48 times for the query phase: 12 for the Raspberry Pi sending, 12 for the junction board sending, and 24 for the combined receive side.  That makes a total of (12 + 48) * 7 = 420 bytes, or 420 * 10 / 3000000 = 1400us.
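The framing arithmetic can be checked with a few lines. The field sizes and frame counts are taken from the list and paragraph above; the dictionary keys are just labels I picked, not identifiers from the moteus firmware.

```python
# Per-frame overhead fields in bytes, as listed in the post.
FRAME_OVERHEAD = {
    "leadin": 2,
    "source_id": 1,
    "destination_id": 1,
    "payload_size": 1,  # 1 byte for small payloads
    "checksum": 2,
}

COMMAND_FRAMES = 12  # one per servo in the command phase
QUERY_FRAMES = 48    # 12 host tx + 12 junction tx + 24 on the receive side

overhead_per_frame = sum(FRAME_OVERHEAD.values())
total_overhead_bytes = (COMMAND_FRAMES + QUERY_FRAMES) * overhead_per_frame
overhead_us = total_overhead_bytes * 10 * 1_000_000 / 3_000_000

if __name__ == "__main__":
    print(overhead_per_frame, total_overhead_bytes, overhead_us)
```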

Data encoding

In the current control mode of the servo, a number of different parameters are typically updated every control cycle:

  • Target angle
  • Target velocity
  • Maximum torque
  • Feedforward torque
  • Proportional control constant
  • Not to exceed angle (only used during open loop startup)

The servo protocol allows each of these values to be encoded on the wire as either a 4 byte floating point value, or as a fixed point signed integer of either 4, 2, or 1 bytes.  The current implementation sends all 6 of these values every time as 4 byte floats.  Additionally, two bytes are required to denote which parameters are being sent.  That works out to: ((6 parameters * 4 bytes + 2) * 12 servos * 2 for junction board * 10) / 3000000 = 2080us.

The receive side returns the following:

  • Current angle
  • Current velocity
  • Current torque
  • Voltage
  • Temperature
  • Fault code

In the current implementation, all of those are sent as either a 4 byte float or a 4 byte integer.  That makes ((6 parameters * 4 bytes + 2) * 12 servos * 2 for junction board * 10) / 3000000 = 2080us.
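To get a feel for how much the encoding choice matters, here is a sketch that computes the payload size and bus time for each encoding width the protocol allows. It only uses Python's struct module to look up type sizes; it is not the moteus wire format. It assumes the two selector bytes stay the same regardless of encoding, which the post does not confirm, and it says nothing about whether the narrower fixed point types have enough resolution.

```python
import struct

NUM_VALUES = 6      # parameters per command or reply (from the post)
SELECTOR_BYTES = 2  # bytes naming which parameters are present (assumed fixed)
SERVOS = 12
HOPS = 2            # each packet crosses the bus twice via the junction board

# struct format codes used only to look up sizes.
ENCODINGS = {"float32": "<f", "int32": "<i", "int16": "<h", "int8": "<b"}


def payload_bytes(fmt):
    """Payload size for one servo with every value in the given encoding."""
    return NUM_VALUES * struct.calcsize(fmt) + SELECTOR_BYTES


def phase_time_us(fmt):
    """Bus time for one phase (command or reply) across all servos."""
    total_bytes = payload_bytes(fmt) * SERVOS * HOPS
    return total_bytes * 10 * 1_000_000 / 3_000_000


if __name__ == "__main__":
    for name, fmt in ENCODINGS.items():
        print(f"{name:8s} {payload_bytes(fmt):3d} bytes "
              f"{phase_time_us(fmt):7.1f} us")
```

The float32 row reproduces the 2080us figure above; the narrower rows are only an upper bound on what a more compact encoding could save.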

Overall result

I put together a spreadsheet that let me tweak each of the individual parameters and see how that affected the overall update rate of the system.

I also made a dedicated test program and used the oscilloscope to monitor a cycle, which roughly verified these results:

[Figure: overall latency]

Thus, with a full command and query cycle, an update rate of about 80Hz can be achieved with the current system.
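The spreadsheet itself isn't reproduced here, but the figures quoted in the sections above can be summed as a rough cross-check. This is my reconstruction rather than the actual spreadsheet: it assumes the contributions simply add, that each per-servo latency applies once per servo, and it ignores the query request payload because no size is given for it. The 140us servo figure is the one from the servo turnaround section.

```python
SERVOS = 12

# Each entry is a per-cycle contribution in microseconds,
# taken from the numbers quoted in the sections above.
budget_us = {
    "servo firmware turnaround": SERVOS * 140,
    "junction board bridging": SERVOS * 90,
    "command write gaps": SERVOS * 120,
    "host query turnaround": SERVOS * 200,
    "packet framing": 1400,
    "command payload": 2080,
    "reply payload": 2080,
}

total_us = sum(budget_us.values())
rate_hz = 1_000_000 / total_us

if __name__ == "__main__":
    for name, us in budget_us.items():
        print(f"{name:28s} {us:6d} us")
    print(f"{'total':28s} {total_us:6d} us  (~{rate_hz:.0f} Hz)")
```

The total comes out a little over 12ms per cycle, which is in line with the roughly 80Hz figure.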

Next up, working to make this much better.

 

4 thoughts on “Improving the moteus update rate, part 1”

  1. I am working on a very similar project. I feel your pain with the control latency. I settled on using the CAN bus. It seems there is significant delay on the raspberry pi to your junction board. I am curious if you could cut some delay by maybe doing SPI to the junction board?

    1. Thanks for the feedback!

      Yep, CAN is a logical alternative, mostly because of its integrated support in the STM32F4 which would remove some overhead and latency. I purposefully chose RS485 here initially primarily for the higher bitrate and the lack of constraint of 8 byte frames. The moteus controller has the capability of sending back a large amount of structured diagnostic telemetry, which of course could be tunneled through 8 byte CAN frames at 1Mbit/s, it would just be a lot slower and less effective.

      SPI to the junction board might be possible, although the cable run is long enough even in this configuration that signal integrity would be an issue at the necessary 5Mbit/s or so that it would need. In the Mech Warfare configuration, with the rpi mounted in the turret, that would only get worse.

      This post just documented the state of things after not having cared about control rate for a year. In the past week or so I’ve managed to improve things by 4x or 5x, the details of which will be posted soon.

      Do you have a write up of what you’re working on anywhere?

      1. That makes sense. I wasn’t sure of your configuration to the junction board. I plan to have a similar in between board but planned to plug it directly into the headers on my Jetson board. It’s nice because I can read back the state in the same transaction time as sending the updated control command. But yeah if you are running SPI over any appreciable distance it gets dicey.

        CAN is definitely slow unfortunately. I have been looking at some of the new STM32 chips which have CAN-FD support for some higher bitrates on the next version of my motor controller.

        What I would REALLY like to do is figure out a way to use something like EtherCAT. But that gets quite a bit more involved and RJ45 connectors are by no means the smallest when trying to integrate them.

        Unfortunately, due to time constraints, my documentation of the project has not been great. The best resource is my GitHub page here: https://github.com/implementedrobotics/Nomad/

        Hopefully work will slow down and I’ll have more time to document. As of now I am trying to squeeze any bit of free time into design.

        Awesome project btw!
