The road to openpilot 1.0


This is the second post in our three-part series about the three main teams at comma.ai: research, operations and openpilot. In this post we’ll talk about the openpilot team and the problems we solve on a daily basis. We will also talk about the challenges that need to be solved before we can reach the milestone of openpilot 1.0.

openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for over 85 supported car makes and models. openpilot is completely open source, and MIT licensed. At the beginning of this year all development was moved to GitHub.

openpilot driving through a rainstorm. Video by Logan LeGrand

The openpilot team is responsible for any code running on the comma two. This includes openpilot itself, the operating system (NEOS), our messaging library cereal and the code powering the panda. The panda forms the bridge between the car’s CAN bus and the comma two, and functions as a safety guardian. The panda code is developed following automotive industry standards (ISO 26262 and MISRA C), and enforces our safety rules on every message sent to the car.

The main goal of the openpilot team is improving the user experience and increasing the time between disengagements. This includes making sure openpilot can handle a wide variety of scenarios and reducing the number of unplanned disengagements due to bugs.

Graph of the maximum miles between disengagements per day for the last two years showing a consistent improvement

In this blog post I’ll briefly talk about two interesting problems we’ve solved over the past year. The first one is about improving our testing infrastructure by creating automated integration tests. The second is about estimation of some key variables in the vehicle model, improving the quality and accuracy of the driving experience.

Automatic integration tests

In the early days of openpilot a lot of testing still required driving cars around. This is very time-consuming, and can never be a complete test. Ideally, after we refactor some of the codebase, we want to make sure the behavior didn’t change. As the community has become more involved with the development of openpilot, we also need more automated tests that run on pull requests on GitHub. We were therefore looking for a way to automatically verify the functionality of all microservices.

Important functionality can be explicitly tested using unit tests, but reaching full test coverage requires a lot of manual work. Instead, we wanted to build an integration test that automatically determines the specification of each service based on known good behavior.

To achieve this, we first had to make sure the behavior of all our microservices is deterministic, which mostly means we could not rely on the system clock. Then we created a specification for each of our microservices that describes which messages it expects as input, and which of these messages generate a response. For example, plannerd (the service responsible for computing the desired steering angle based on the model input) expects a model packet, and will send a plan packet in response.
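As a rough illustration, such a specification could be written down as a small table per service. The sketch below is not the actual openpilot test code: the `ServiceSpec` class, the `expected_responses` helper and the `carState` entry are hypothetical, and only the model-to-plan pairing for plannerd comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ServiceSpec:
    """Hypothetical description of one service for replay testing."""
    name: str
    # Maps each input message type to the output types it should trigger.
    # An empty list means the message is consumed without a response.
    inputs: dict


# Illustrative entry. The "carState" input is an assumption made for this
# sketch; only the model -> plan pairing is described in the post.
PLANNERD = ServiceSpec(
    name="plannerd",
    inputs={
        "model": ["plan"],  # a model packet triggers a plan packet
        "carState": [],     # assumed: consumed as state, no response
    },
)


def expected_responses(spec, msg_type):
    """Return the output types a harness should wait for after sending msg_type."""
    if msg_type not in spec.inputs:
        raise KeyError(f"{spec.name} does not subscribe to {msg_type}")
    return spec.inputs[msg_type]
```

With a table like this, a harness knows after every input whether it has to wait for an output before sending the next message, which is one way to keep a replay independent of the system clock.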

We then created a test harness to run each service and supply it with all the inputs it needs from the logs of an actual drive. We store all the outputs the service generates. If the service is properly deterministic, we can run it on the same inputs again and it should generate the same output. If the output changes after a code change, we can show exactly which fields in which messages changed. This allows us to make sure nothing changes in a refactor, or, when we implement a new feature, to double check that only the expected outputs changed and there was no impact on existing functionality.
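The replay-and-compare idea can be sketched in a few lines. This is a minimal sketch under simplifying assumptions, not the real harness: messages are modeled as plain nested dicts rather than cereal messages, a service is modeled as a function from one input message to one output message, and the names `replay`, `diff_fields` and `compare_runs` are made up for the example.

```python
def replay(service_step, logged_inputs):
    """Feed logged input messages to a service one at a time and collect outputs."""
    outputs = []
    for msg in logged_inputs:
        out = service_step(msg)
        if out is not None:
            outputs.append(out)
    return outputs


def diff_fields(ref, new, path=""):
    """Recursively list (field_path, ref_value, new_value) for every differing field."""
    if isinstance(ref, dict) and isinstance(new, dict):
        diffs = []
        for key in sorted(set(ref) | set(new)):
            sub = f"{path}.{key}" if path else key
            if key not in ref or key not in new:
                # Field was added or removed between the two runs.
                diffs.append((sub, ref.get(key), new.get(key)))
            else:
                diffs.extend(diff_fields(ref[key], new[key], sub))
        return diffs
    return [] if ref == new else [(path, ref, new)]


def compare_runs(ref_outputs, new_outputs):
    """Compare two output streams message by message and report changed fields."""
    report = []
    if len(ref_outputs) != len(new_outputs):
        report.append(("<message count>", len(ref_outputs), len(new_outputs)))
    for i, (ref, new) in enumerate(zip(ref_outputs, new_outputs)):
        for field_path, a, b in diff_fields(ref, new):
            report.append((f"msg[{i}].{field_path}", a, b))
    return report


# Toy stand-ins for a service before and after a code change.
def toy_service(msg):
    return {"plan": {"speed": 2 * msg["model"]["speed"], "valid": True}}


def toy_service_changed(msg):
    return {"plan": {"speed": 2 * msg["model"]["speed"] + 1, "valid": True}}


log = [{"model": {"speed": 1.0}}, {"model": {"speed": 2.0}}]
reference = replay(toy_service, log)
unchanged = compare_runs(reference, replay(toy_service, log))
changed = compare_runs(reference, replay(toy_service_changed, log))
```

Here `unchanged` is empty because the service is deterministic, while `changed` names only the `plan.speed` field of each message, which is the kind of report that makes an unintended change easy to spot.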

Failed test output after some of the fields in the liveLocationKalman changed unexpectedly

This test runs in CI on most of our services, using logs from all our supported brands. It runs on every commit, so we can immediately verify whether a pull request has unintended consequences before merging it.

Vehicle model parameter estimation

Next we’ll talk about a project that made turns more accurate, and so improved the driving experience, by estimating some key variables about the car.

openpilot receives the location of the lane lines and desired path from the model. Using model predictive control we generate a smooth trajectory from the car’s current position onto the desired path. To actually drive this path we have to compute how much to turn the steering wheel. To convert the desired trajectory into steering angles we use a mathematical model of the car. Using this model we can predict how much the car will turn based on steering angle and speed of the vehicle.
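To make the role of the vehicle model concrete, here is a minimal sketch using a kinematic bicycle model. This is a deliberate simplification chosen for the example: the model used in openpilot may account for effects such as tire slip that are ignored here, and the function names and the default wheelbase and steering ratio are illustrative assumptions rather than values for any real car.

```python
import math


def yaw_rate(steering_wheel_angle_deg, speed, wheelbase=2.7, steer_ratio=15.0):
    """Predicted yaw rate (rad/s) from a kinematic bicycle model.

    steering_wheel_angle_deg: angle at the steering wheel, in degrees
    speed: vehicle speed in m/s
    wheelbase (m) and steer_ratio are assumed example values.
    """
    road_wheel_angle = math.radians(steering_wheel_angle_deg) / steer_ratio
    curvature = math.tan(road_wheel_angle) / wheelbase  # 1/m
    return speed * curvature


def steering_angle_for_curvature(curvature, wheelbase=2.7, steer_ratio=15.0):
    """Inverse model: steering wheel angle (deg) needed to follow a path curvature (1/m)."""
    return math.degrees(math.atan(curvature * wheelbase)) * steer_ratio


# A turn with a 100 m radius has a curvature of 0.01 1/m.
angle = steering_angle_for_curvature(0.01)
predicted = yaw_rate(angle, speed=20.0)
```

The two functions are inverses of each other: one converts a desired path into a steering command, the other predicts how the car will turn for a given steering angle and speed.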


We see a lot of errors in alignment or steering angle sensors that are not mounted correctly. This will cause the car to turn when openpilot thinks the steering wheel is straight. Roads are also not completely flat: in sharp turns especially, roads are often banked. This causes the car to drift toward the lower part of the road when the steering wheel is straight.

Both of these errors can be modeled as an offset between the measured steering angle and the actual steering angle. We call this the angle offset. We decompose it into two separate variables: the average angle offset (long term effects such as alignment) and the fast angle offset (short term effects such as banked roads). The plot below shows the (estimated) angle offset over a one minute segment of a car driving through a very banked turn.

Plot of the angle offset through a one minute segment. At t=600 the car enters a very banked turn, resulting in a fast angle offset of almost 4 degrees.

To make sure the car follows the desired trajectory we want to continuously improve the estimates of these two angle offsets. While driving we can use the vehicle model to predict how much the car should be turning given the current steering angle and speed. We can then use our accurate localizer to measure how much the car is actually turning and use this to improve the estimate for the parameters.

To estimate the parameters we use an Extended Kalman Filter (EKF), since the equations for the vehicle model are non-linear. The research team has written a very nice EKF library, rednose. Using SymPy we can define the equations governing the vehicle dynamics, and rednose will auto generate high performance C++ code to actually run the estimation.
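The estimation loop can be illustrated with a hand-written scalar EKF. This is only a sketch of the technique and does not use rednose or SymPy: it estimates a single combined angle offset instead of separate average and fast offsets, it uses a kinematic bicycle model as a stand-in for the real vehicle model, it assumes the sign convention actual angle = measured angle - offset, and the class name, noise values and vehicle parameters are all made up for the example.

```python
import math


class AngleOffsetEKF:
    """Scalar EKF estimating a steering angle offset in degrees (illustrative only)."""

    def __init__(self, wheelbase=2.7, steer_ratio=15.0, p0=1.0, q=1e-6, r=1e-4):
        self.wheelbase = wheelbase      # m, assumed example value
        self.steer_ratio = steer_ratio  # assumed example value
        self.x = 0.0  # estimated angle offset (deg)
        self.p = p0   # variance of the estimate (deg^2)
        self.q = q    # process noise: lets the offset drift slowly
        self.r = r    # yaw rate measurement noise variance ((rad/s)^2)

    def update(self, measured_angle_deg, speed, measured_yaw_rate):
        """Fuse one yaw rate measurement (rad/s) and return the new offset estimate."""
        # Predict: the offset is modeled as a random walk.
        self.p += self.q

        # Measurement model h(x): yaw rate predicted by the vehicle model
        # after removing the current offset estimate.
        delta = math.radians(measured_angle_deg - self.x) / self.steer_ratio
        h = speed * math.tan(delta) / self.wheelbase

        # Jacobian dh/dx, needed because h is non-linear in the offset.
        jac = -speed * math.radians(1.0) / (
            self.wheelbase * self.steer_ratio * math.cos(delta) ** 2
        )

        # Standard Kalman update for a scalar state.
        s = jac * self.p * jac + self.r
        k = self.p * jac / s
        self.x += k * (measured_yaw_rate - h)
        self.p *= 1.0 - k * jac
        return self.x


# Simulated drive at 20 m/s with a true offset of 2 degrees.
true_offset = 2.0
ekf = AngleOffsetEKF()
for i in range(500):
    measured = 10.0 * math.sin(i / 20.0)  # steering wheel sweeps left and right
    actual = measured - true_offset
    yaw = 20.0 * math.tan(math.radians(actual) / 15.0) / 2.7
    ekf.update(measured, 20.0, yaw)
estimate = ekf.x
```

In this simulation the estimate moves from 0 toward the true offset as measurements arrive. Deriving the Jacobian by hand is manageable for one state, but it becomes error prone as the state grows, which is where generating the code from symbolic equations pays off.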

After implementing this, we compared the distance to the center of the lane in turns and saw a significant improvement in cornering accuracy from the previous gradient descent based algorithm (blue) to the new EKF based one (orange). The gradient descent based one is biased towards either the inside or the outside in each turn, while the EKF is unbiased.

Comparison between EKF and gradient descent based vehicle model parameter estimation in turns. Note that the gradient descent based method is biased.

What’s next?

The openpilot team’s most important priorities are stability, testing and code cleanup. We want to reach at least 1000 hours between unplanned disengagements before we can call it openpilot 1.0. This requires improvements in OS stability and finding even the rarest bugs in the openpilot code.

To achieve this, we need to improve test coverage and on-device monitoring to catch unexpected events and alert the user before they turn into a problem. To make the codebase more maintainable and allow for better community contributions, we want to improve the testing, readability and documentation of the openpilot codebase.

Join the team

Are you interested in working on these kinds of problems? Join the openpilot team. We’re currently looking for software developers with a strong background in Python and C++.

Willem Melching,
Head of openpilot @ comma.ai
