Hyper-realistic virtual environments have been hailed as the best driving schools for autonomous vehicles (AVs), since they are proven test beds for safely trying out perilous driving scenarios. Because accumulating detailed near-crash data in the real world is neither convenient nor safe, Tesla, Waymo, and other self-driving companies rely heavily on pricey, proprietary photorealistic simulators to generate it. To open up this capability, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) developed “VISTA 2.0,” a data-driven simulation engine in which vehicles can learn to drive in real-world conditions and recover from close calls. Additionally, all of the code is being released to the public.
“Today, only companies have software like the type of simulation environments and capabilities of VISTA 2.0, and this software is proprietary. With this release, the research community will have access to a powerful new tool for accelerating the research and development of adaptive robust control for autonomous driving,” says MIT Professor and CSAIL Director Daniela Rus, senior author on a paper about the research.
VISTA 2.0 builds on the team’s earlier model, VISTA, and differs fundamentally from other AV simulators in that it is data-driven: it is constructed and photorealistically rendered from real-world data, allowing direct transfer to reality. The first prototype supported only single-car lane-following with a single camera sensor; achieving high-fidelity data-driven simulation required rethinking the fundamentals of how various sensors and behavioural interactions can be synthesised together.
Enter VISTA 2.0: a data-driven system that can simulate complex sensor types and massively interactive scenarios and intersections at scale. Using far less data than earlier models, the team trained autonomous vehicles that can be considerably more robust than those trained on enormous amounts of real-world data alone. They successfully scaled the complexity of interactive driving tasks to activities like overtaking, following, and negotiating, including multi-agent scenarios, in highly photorealistic environments.
Because the majority of real-world driving data is, thankfully, just routine, everyday driving, training AI models for autonomous vehicles requires a hard-to-secure supply of edge cases and strange, dangerous events. Logically, you can’t simply crash into other vehicles to teach a neural network how to avoid crashing into other vehicles.
There has recently been a shift away from simulation environments designed by hand toward ones built from real-world data. Human-designed environments can mimic virtual cameras and lidars with ease, whereas data-driven ones offer incredible photorealism. This paradigm shift raises a crucial question: can the richness and complexity of all the sensors that autonomous vehicles need, including sparser sensors like lidar and event-based cameras, be accurately synthesised?
Lidar sensor data is much harder to work with in a data-driven world: you are effectively trying to generate brand-new 3D point clouds, with millions of points, from sparse views of the scene. To do this, the team projected the data the car collected into a 3D space derived from the lidar data, then let a new virtual vehicle drive around locally relative to where the original vehicle was, yielding a new 3D lidar point cloud. Finally, they used neural networks to render all of that sensory data back into the field of view of the new virtual vehicle.
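The core geometric step described above, re-expressing a recorded point cloud in the frame of a displaced virtual vehicle, can be sketched roughly as a rigid transform. This is a minimal NumPy illustration of that idea, not the team's actual pipeline; the function name and the yaw-plus-translation parameterisation are assumptions for clarity.

```python
import numpy as np

def transform_point_cloud(points, yaw, translation):
    """Re-express a lidar point cloud in the frame of a virtual
    vehicle displaced from the original sensor pose.

    points:      (N, 3) array of x, y, z points in the original frame
    yaw:         heading offset of the virtual vehicle (radians)
    translation: (3,) position of the virtual vehicle in the original frame
    """
    c, s = np.cos(yaw), np.sin(yaw)
    # Rotation about the vertical (z) axis for the heading change.
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    # A point p in the original frame maps to R^T @ (p - t) in the
    # new vehicle frame; with row vectors that is (p - t) @ R.
    return (points - translation) @ R

# Two recorded lidar returns, in metres, in the original vehicle frame.
cloud = np.array([[10.0, 0.0, 1.0],
                  [ 5.0, 2.0, 0.5]])
# Virtual vehicle 1 m ahead of the original, with the same heading.
shifted = transform_point_cloud(cloud, yaw=0.0,
                                translation=np.array([1.0, 0.0, 0.0]))
```

In the real system this transform only produces the geometry; the hard part the article describes, densifying sparse views and rendering them back into the new field of view, is what the neural networks handle.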
When the team took their full-scale car into the “wild,” which is another way of saying Devens, Massachusetts, they saw immediate transferability of results, with both failures and successes. They were also able to demonstrate that bold, magical word of self-driving car models: “robust.” AVs trained entirely in VISTA 2.0 proved resilient enough to handle difficult failures in the real world.
One safety net humans use that can’t yet be simulated is human emotion: acknowledgement gestures like a friendly wave, a nod, or a blinker switch, exactly the kinds of subtleties the team wants to incorporate going forward.