Savage Solder: Spring 2013 Competition Recap

We ran Savage Solder in two competitions this spring: Robomagellan at Robogames 2013, and the Sparkfun Autonomous Vehicle Competition 2013. In neither contest did we fare particularly well, especially given the level of preparation we put in. I’ll describe our results here, along with a little analysis from each event.

Robogames 2013 Robomagellan

20130620-robogames-shirt.jpg
Robogames 2013

In brief, the Robomagellan event requires your robot to touch a target orange traffic cone. It’s made harder because the cone can be placed in an arbitrary outdoor environment, with challenging terrain, changing features, people, GPS obstructions, and the like. Your robot can earn time deductions by touching other “bonus” cones. You get three runs, and your score is the best of the three.

Savage Solder only made two runs this year. A summary of the failure modes:

  • First Run: After touching the 25% bonus cone, our u-blox GPS had a slew event, reporting a position 20m away from where the car actually was. Our Robomagellan code has no obstacle avoidance, so the bad position steered the car into a trash can; the car decided it had touched the final cone, and that ended our run.
  • Second Run: After the competition had started, the event center’s groundskeepers began placing picnic tables on the competition area, and set one down right in the middle of our surveyed path about 30s before we had to start. Changing the plan was not an option: our Robomagellan code at the time required surveying a series of waypoints, which we could not do in that short a time. The car got unlucky: it clipped the side of the table hard enough to think it had hit a cone, then reversed. When it reversed, it got hung up on the picnic table and burned out our ESC. We had no spare ESCs, so that ended our Robomagellan event.

Sparkfun AVC 2013

20130620-avc-entrance.jpg
Sparkfun AVC 2013

The Sparkfun Autonomous Vehicle Competition, held for the last couple of years, requires the robot to drive around a loop course. Bonus points are awarded for passing through a medium-sized metal hoop and jumping over a ramp. This year, the overall score was the sum of three runs.

After our disappointing Robomagellan performance, we doubled down for Sparkfun AVC. We replaced our ESC and got spares… of nearly everything. We built a ramp and a hoop, and obtained barrels. In the two weekends before the event, the car was autonomously completing our practice course 100% of the time, including hitting both the hoop and the ramp about 90%–95% of the time, over maybe a hundred runs.

How did this translate into our competition performance? We completed one run, failed after the second corner on the second run, and failed after the first corner on the third run. End result: not so hot.

  • First Run: This run was flawless. We set our speed at 11mph, on the theory that we had completed most of our testing at that speed, and we didn’t know how fast or how successful at hitting the props the other teams would be.
  • Second Run: Here, after seeing Netburner’s fast run, we increased our maximum speed to 15mph to shave some time off. The speed didn’t end up being a problem (aside from Netburner harmlessly rear-ending us going into the second turn), but the clouds were. When homing on the hoop, our hoop detector got a glimpse of some very hoop-shaped clouds off in the distance and went for them. This ended badly after the car clotheslined itself, then ran into a couple of hay bales and the curb. This was the first time in all our testing that we had a false positive on clouds.
  • Third Run: I went back to the image data from our first and second runs, and updated our hoop detection constants to keep the same detection rate but weed out all the clouds. We went with a speed of 15mph again. However, right before we started, I noticed our display complaining that the GPS was not reporting data regularly. In fact, it was barely reporting data at all. As with Robomagellan and the picnic tables, this problem appeared right before the starting gun, so there was nothing we could do but start the car and watch. And watch we did, as it collided with the fence after the first turn. The root cause: the u-blox doesn’t seem to be able to emit NMEA messages longer than 364 characters when run at 115200 baud, and thus its PUBX,3 NMEA message gets truncated once the GPS has more than about 19 satellites in its tracking filter. Our GPS interface software requires every message it uses to be valid (with checksums) before emitting a fix upstream (see the sketch after this list), so it wasn’t reporting anything most of the time. Looking back at the data from all our Boston testing, hundreds of hours’ worth, this had only ever happened for a very brief time, maybe 10s total, since we don’t have the same clear view of the sky as in Boulder.
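
To make the failure concrete, here is a minimal sketch of the kind of checksum validation involved; the function is illustrative, not our actual interface code:

```python
def nmea_checksum_ok(sentence: str) -> bool:
    """Return True if an NMEA sentence like '$PUBX,03,...*1D' is intact.

    The checksum is the XOR of every character between '$' and '*',
    written as two hex digits after the '*'.
    """
    if not sentence.startswith('$') or '*' not in sentence:
        return False
    body, _, tail = sentence[1:].rpartition('*')
    checksum = 0
    for ch in body:
        checksum ^= ord(ch)
    try:
        return checksum == int(tail[:2], 16)
    except ValueError:
        return False
```

A PUBX,3 sentence cut off at 364 characters loses its “*XX” trailer, so a check like this fails and no fix ever makes it upstream.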

Fixes

  • u-blox GPS slew: We have seemingly resolved this by using the RMS of the receiver’s reported range residuals as a floor on the position measurement noise fed into our Kalman filter (a sketch of the idea follows this list). Since that change, u-blox slews, while they still happen, no longer significantly impact the trajectory of the car.
  • Last minute picnic tables: For AVC 2013 and beyond, we’ve switched from surveying points with a handheld GPS to drawing interpolated curves on satellite maps, which need only a one-time registration (see the second sketch below). This lets us change paths quickly when necessary.
  • Hoop shaped cloud: This one we fixed at the event by tuning constants. We didn’t get our hoop detector working until about three weeks before the event, so our corpus of hoop images was relatively small. If we run Sparkfun AVC again, I’ll also build a larger corpus of hoop images, including some from fluffy-cloud days.
  • u-blox limited NMEA length: Here we are switching to the u-blox binary protocol (UBX), which both has shorter messages and is presumably better tested by u-blox (see the third sketch below).
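
As a concrete illustration of the slew fix, here’s a minimal sketch of the noise-floor idea, assuming per-satellite pseudorange residuals are already parsed out of the receiver’s output (the names and units are illustrative):

```python
import math

def position_noise_floor(reported_sigma_m, range_residuals_m):
    """Standard deviation (meters) of the GPS position measurement to
    hand the Kalman filter: the receiver's reported accuracy, floored
    at the RMS of its per-satellite range residuals."""
    if not range_residuals_m:
        return reported_sigma_m
    rms = math.sqrt(sum(r * r for r in range_residuals_m) /
                    len(range_residuals_m))
    return max(reported_sigma_m, rms)
```

During a slew the residuals blow up, so the filter trusts the GPS far less and the position estimate barely moves.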
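The map-drawing fix boils down to two pieces: a one-time registration from image pixels to world coordinates, and dense interpolation along the drawn curve. A hypothetical sketch, with a least-squares affine fit and straight-line resampling standing in for whatever projection and splines a real tool would use:

```python
import numpy as np

def fit_affine(pixels, latlons):
    """One-time registration: least-squares affine map from image pixel
    coordinates to (lat, lon), given three or more reference points."""
    px = np.asarray(pixels, dtype=float)
    A = np.hstack([px, np.ones((len(px), 1))])
    coeffs, _, _, _ = np.linalg.lstsq(
        A, np.asarray(latlons, dtype=float), rcond=None)
    return lambda p: np.append(np.asarray(p, dtype=float), 1.0) @ coeffs

def resample_path(waypoints, samples_per_segment=20):
    """Densely resample a drawn path.  Straight segments keep the
    sketch short; a real tool could use splines for smooth curves."""
    pts = np.asarray(waypoints, dtype=float)
    out = [pts[0]]
    for a, b in zip(pts[:-1], pts[1:]):
        for t in np.linspace(0.0, 1.0, samples_per_segment + 1)[1:]:
            out.append((1 - t) * a + t * b)
    return np.array(out)
```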
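Finally, for the NMEA-length fix: UBX framing is compact and unambiguous, with fixed sync bytes, a binary length field, and an 8-bit Fletcher checksum, so truncation is always detectable. A minimal encoder, as a sketch:

```python
def ubx_frame(msg_class: int, msg_id: int, payload: bytes = b'') -> bytes:
    """Build a UBX binary frame: 0xB5 0x62 sync bytes, message class and
    id, little-endian 16-bit payload length, payload, then an 8-bit
    Fletcher checksum computed over class, id, length, and payload."""
    body = (bytes([msg_class, msg_id]) +
            len(payload).to_bytes(2, 'little') + payload)
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return b'\xb5\x62' + body + bytes([ck_a, ck_b])

# Example: poll NAV-POSLLH (class 0x01, id 0x02) with an empty payload.
poll_posllh = ubx_frame(0x01, 0x02)
```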

Conclusions

Across both spring competitions, we had one successful run out of five. Of the four failed runs, three were from root causes we had never seen before, or had seen entirely too infrequently to diagnose. The fourth (the u-blox slewing that caused our trash can hit) had been happening only every four or five Robomagellan runs, so it was a known, but limited, risk.

Obviously, we are fixing all the individual failure modes that we observed. I am not sure what the larger lesson is, other than that there are many, many failure modes and the only way to find them all is through exhaustive testing. Maybe using more expensive components would have helped: the cheap u-blox and our ESC with no overcurrent protection cost us three runs total (four if you count our missed third Robomagellan run).

I suppose the ultimate conclusion is that building robots remains hard.