Traffic updates: Saying a lot while revealing a little

Published January 28, 2019

By John Krumm, Senior Principal Researcher, and Eric Horvitz, Chief Scientific Officer

The idea of crowdsourcing traffic data has been around for a while: If we can get vehicles on the roads to upload their current speeds, then we can get instant, up-to-date data on how fast traffic is moving for well-traveled segments. This is useful for finding the fastest route to a destination, avoiding slowdowns.

There are problems with this idea, though. The main one is that drivers need to upload their location along with their speed, which can raise concerns about privacy. Frequent speed reports also use up data transmission capacity that could be used for other purposes.

Our project, described in our paper, Traffic Updates: Saying a Lot While Revealing a Little, to be presented at the 33rd AAAI Conference on Artificial Intelligence in Honolulu, Hawaii later this month, is aimed at significantly reducing the number of speed reports while still maintaining an accurate estimate of how fast traffic is moving on all roads. We also explore principles around the joint use of central and distributed predictive models and the opportunity to make inferences in the absence of communication.

We leverage three ideas:

If traffic is moving like it normally does, there is no need to tell anyone, because everyone can assume that everything is normal.
If traffic is abnormal, only a few vehicles on a road segment need to report it. Not everyone has to say the same thing.
A speed report from one road can be used to automatically infer speeds on other roads, because speeds on different roads are correlated. This reduces the amount of information needed and thus the need to monitor every segment directly.

We took advantage of these ideas using a Markov Random Field (MRF). This is a mathematical model that lets us make inferences about traffic speeds everywhere based on only a few speed reports. We train the model with historical speed data so it can exploit the probabilistic dependencies among traffic on different roads. For instance, the model might learn that traffic speeds on a section of highway tend to be slow when the preceding and subsequent sections are also slow. The model can also predict what traffic is like under normal conditions. In the figure, the white dots show the sections of road where we have traffic measurements, and the black lines show the strength of dependencies between the sections. A darker line indicates a stronger correlation.

Map of Los Angeles indicating location of traffic measurements and strength of probabilistic influences among the sections.
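
To make the inference idea concrete, here is a minimal sketch of a pairwise MRF over road segments. It is an illustration under strong simplifying assumptions, not the model from the paper: each segment is reduced to two states (normal or slow), the edge strengths and prior values are made up, and inference is done by brute-force enumeration, which only works for a handful of segments. The function name `marginals` and all of its parameters are hypothetical.

```python
from itertools import product

def marginals(n_segments, edges, prior_slow, observed):
    """Exact P(slow) per segment in a tiny binary pairwise MRF, by enumeration.

    edges:      {(i, j): strength}; a configuration's weight is multiplied by
                `strength` whenever segments i and j are in the same state.
    prior_slow: per-segment prior probability of the slow state (assumed to
                come from historical data).
    observed:   {segment: state} for reported segments, 1 = slow, 0 = normal.
    """
    totals = [0.0] * n_segments
    z = 0.0  # normalizing constant over configurations consistent with reports
    for states in product((0, 1), repeat=n_segments):
        # Skip configurations that contradict a received report.
        if any(states[i] != s for i, s in observed.items()):
            continue
        w = 1.0
        for i, s in enumerate(states):
            w *= prior_slow[i] if s == 1 else 1.0 - prior_slow[i]
        for (i, j), strength in edges.items():
            if states[i] == states[j]:
                w *= strength
        z += w
        for i, s in enumerate(states):
            totals[i] += w * s
    return [t / z for t in totals]

# Three consecutive highway sections; neighbors tend to share a state.
edges = {(0, 1): 4.0, (1, 2): 4.0}
prior = [0.1, 0.1, 0.1]

baseline = marginals(3, edges, prior, observed={})
after_report = marginals(3, edges, prior, observed={0: 1})  # section 0 reports "slow"
```

With a single "slow" report from section 0, the estimated probability of slow traffic rises on section 1 and, to a lesser degree, on section 2, even though neither section sent a report.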

We assume that each vehicle is running a version of this traffic model on its internal computer. The models can be executed with computing power provided by today’s smartphones. With each vehicle independently computing expected traffic speeds across the whole system, we can implement the three ideas above:

The predictive model informs the vehicle what normal traffic is like at the vehicle’s location by time of day and day of week. If the vehicle’s speed matches the normal speed, the vehicle does not need to make a report. The rest of the system knows that not receiving a report when cars are present means that “cars are flowing normally.”
Our experiments show that having reports from about 20 vehicles at a given location is adequate. We developed a probabilistic way for the vehicles to self-select a subset of themselves for reporting without having to communicate with each other. It’s sort of like rolling dice, and a vehicle only reports its speed if it rolls a “1”.
If traffic is abnormal, the vehicle can use the traffic model to estimate how important its own report would be for inferring traffic speeds everywhere. Only important reports are sent in.
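
The first two steps can be sketched as a decision each vehicle makes on its own. This is a simplified illustration, not the procedure from the paper: the function name `should_report`, the fixed speed `tolerance`, and the assumption that each vehicle has an estimate of how many vehicles share its segment are all hypothetical. The target of 20 reports comes from the experiments mentioned above. The third step, weighing how much a report would help inference elsewhere, is left out.

```python
import random

def should_report(my_speed, normal_speed, vehicles_on_segment,
                  target_reports=20, tolerance=5.0, rng=random):
    """Decide locally, with no vehicle-to-vehicle messages, whether to report."""
    # Step 1: a speed close to the model's normal speed is conveyed by silence.
    if abs(my_speed - normal_speed) <= tolerance:
        return False
    # Step 2: "roll the dice" so that roughly target_reports vehicles report
    # in expectation; vehicles_on_segment is an assumed local estimate.
    p = min(1.0, target_reports / max(vehicles_on_segment, 1))
    return rng.random() < p

# 1,000 vehicles on a slow segment each make the decision independently.
rng = random.Random(0)
n_reports = sum(should_report(30.0, 60.0, 1000, rng=rng) for _ in range(1000))
```

Because every vehicle uses the same probability, the expected number of reports stays near the target however many vehicles are present, and no vehicle needs to know what the others decided.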

We tested our approach on highway traffic data around Los Angeles, CA. We found that 90% of the time, traffic is normal and does not need speed reports. For the other 10% of the time, our experiments showed that only about 8% of vehicles need to report their speed. In total, we can reduce the number of speed reports to less than 1% of all vehicles, which significantly boosts privacy and reduces data transmissions.
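
The overall figure follows from multiplying the two percentages. The short calculation below assumes, for simplicity, that about the same number of vehicles are on the road during normal and abnormal periods; the function name is hypothetical.

```python
def overall_report_fraction(abnormal_fraction, reporting_fraction_when_abnormal):
    """Fraction of all vehicles that end up sending a speed report."""
    return abnormal_fraction * reporting_fraction_when_abnormal

# Traffic is abnormal 10% of the time, and about 8% of vehicles report then.
fraction = overall_report_fraction(0.10, 0.08)
print(f"{fraction:.1%}")  # prints 0.8%
```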

In summary, we have explored and exercised the core idea of distributing many copies of rich inferential models that each represent and reason holistically about a larger system. These local models make inferences within a shroud of privacy, and share locally sensed information only when the local, holistic models show that it will be valuable to others. While we focused on the example of traffic, the principles apply wherever sharing information may be valuable for enhancing a community service but must be balanced against the privacy of individuals.