British startup Wayve unveils PRISM-1, a new AI model for realistic reconstruction of dynamic scenes from video data. The company aims to take autonomous driving simulation to a new level.

London-based startup Wayve has developed an AI model called "PRISM-1" that can reconstruct street scenes in three dimensions plus time from vehicle camera footage. The approach uses technologies similar to the neural scene representations behind NeRFs and Gaussian splatting.
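
Wayve has not published implementation details, but the general idea behind such a time-aware ("4D") scene representation can be sketched with a handful of moving Gaussian primitives. The linear motion model and all names below are illustrative assumptions, not Wayve's actual method.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DynamicGaussian:
    """One primitive of a '4D' scene: a 3D Gaussian whose centre moves over time."""
    mean: np.ndarray      # 3D centre at t = 0
    velocity: np.ndarray  # simple linear motion model (an assumption, not Wayve's method)
    scale: np.ndarray     # per-axis extent
    color: np.ndarray     # RGB in [0, 1]
    opacity: float

    def center_at(self, t: float) -> np.ndarray:
        """Return the primitive's 3D centre at time t."""
        return self.mean + t * self.velocity


def density_at(gaussians: list[DynamicGaussian], point: np.ndarray, t: float) -> float:
    """Sum the axis-aligned Gaussian densities at a query point and time."""
    total = 0.0
    for g in gaussians:
        d = (point - g.center_at(t)) / g.scale
        total += g.opacity * np.exp(-0.5 * float(d @ d))
    return total


# Toy usage: a parked car (static) and a pedestrian walking past it (dynamic).
car = DynamicGaussian(np.array([0.0, 0.0, 5.0]), np.zeros(3),
                      np.array([2.0, 1.0, 1.5]), np.array([0.8, 0.1, 0.1]), 0.9)
pedestrian = DynamicGaussian(np.array([-3.0, 0.0, 5.0]), np.array([1.0, 0.0, 0.0]),
                             np.array([0.3, 0.3, 0.9]), np.array([0.2, 0.4, 0.9]), 0.8)
print(density_at([car, pedestrian], np.array([0.0, 0.0, 5.0]), t=0.0))
print(density_at([car, pedestrian], np.array([0.0, 0.0, 5.0]), t=3.0))  # the pedestrian has walked closer
```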

The technology aims to enable more detailed and realistic simulations of traffic scenarios, allowing Wayve to train and test its AI models for autonomous driving faster.

Video: Wayve


At the core of PRISM-1 is a flexible method that can capture even complex urban scenes with many moving elements such as pedestrians, cyclists, and other vehicles. This includes blinking traffic lights, brake lights, turn signals on cars, and windshield wipers.

Previous simulations for self-driving cars have been limited in this respect, because manually mapping all dynamic interactions and lighting conditions in a 3D model is very time-consuming.

PRISM-1 learns static and dynamic elements

PRISM-1 learns to separate static from dynamic elements in the videos in a self-supervised manner, without manual annotations or predefined models. According to Wayve, this saves a lot of effort.

The system then implicitly tracks the movements in the scene and matches them with the 3D geometry. For a precise understanding, it incorporates cues such as depth, surface normals, optical flow, and semantic segmentation, i.e., the assignment of pixels to object classes.

This is based on techniques of visual reasoning, i.e., logical inference from images. Explicit 3D annotations or additional sensors such as LIDAR are not necessary.
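
How exactly PRISM-1 combines these cues is not disclosed. A common self-supervised heuristic, shown here purely as an illustration, compares the observed optical flow with the flow that camera ego-motion alone would explain; pixels with a large residual are treated as dynamic.

```python
import numpy as np

def dynamic_mask_from_flow(depth, flow, K, R, t, threshold=2.0):
    """Label pixels as dynamic when observed optical flow disagrees with the flow
    explained by camera ego-motion alone (a generic self-supervised heuristic;
    Wayve has not published its exact formulation).

    depth : (H, W) per-pixel depth in metres
    flow  : (H, W, 2) observed optical flow in pixels
    K     : (3, 3) camera intrinsics
    R, t  : rotation (3, 3) and translation (3,) of the camera between frames
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (H, W, 3)

    # Back-project pixels to 3D using depth, move them with the camera motion,
    # and re-project: this is the flow a purely static scene would produce.
    rays = pix @ np.linalg.inv(K).T                      # (H, W, 3) viewing rays
    points = rays * depth[..., None]                     # 3D points in frame 1
    moved = points @ R.T + t                             # the same points seen from frame 2
    proj = moved @ K.T
    proj = proj[..., :2] / np.clip(proj[..., 2:3], 1e-6, None)
    rigid_flow = proj - pix[..., :2]

    # Pixels whose observed motion is not explained by ego-motion count as "dynamic".
    residual = np.linalg.norm(flow - rigid_flow, axis=-1)
    return residual > threshold
```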


PRISM-1 can also play through alternative scenes

The example videos show reconstructed street scenes from London and Mountain View, California. PRISM-1 can freely pan the camera to show the scene from different angles. Manipulating time is also possible, for example freezing a vehicle in place while pedestrians and cars continue to move around it.

Video: Wayve

This is important for testing the behavior of a driving model in dangerous situations that lie off the originally recorded route. According to Wayve, the rendering is stable even in difficult lighting conditions such as reflections in a tunnel.

In addition to the pure camera image, the reconstructions provide depth maps and velocity vectors of moving objects. Individual elements such as pedestrians can also be removed selectively to play through alternative scenarios.
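
Because dynamic agents are reconstructed as separate elements, such edits amount to filtering the scene before rendering it again. The following sketch assumes a hypothetical scene structure with semantic labels; it is not Wayve's API.

```python
from dataclasses import dataclass

@dataclass
class SceneElement:
    """A reconstructed scene element with a semantic label (hypothetical structure)."""
    label: str          # e.g. "pedestrian", "vehicle", "building"
    is_dynamic: bool
    geometry: object    # placeholder for the element's 3D representation

def edit_scene(elements, remove_labels=()):
    """Return a copy of the scene with the given semantic classes removed,
    so the same recording can be replayed as a counterfactual scenario."""
    return [e for e in elements if e.label not in remove_labels]

# Example: replay the recorded scene without any pedestrians.
scene = [SceneElement("building", False, None),
         SceneElement("vehicle", True, None),
         SceneElement("pedestrian", True, None)]
counterfactual = edit_scene(scene, remove_labels={"pedestrian"})
print([e.label for e in counterfactual])  # ['building', 'vehicle']
```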


Video: Wayve

Autonomous cars to train with PRISM-1 in the "Ghost Gym"

Wayve plans to integrate PRISM-1 into its driving simulator "Ghost Gym," which the company introduced in December 2023. The more realistic environments are expected to accelerate the training and evaluation of the driving models.

Wayve expects the improved simulator to enable faster development cycles for its AI. At the same time, the company wants to adapt the models to underrepresented scenarios, such as driving in rare weather conditions or new regions. Efficient testing of driving behavior on other vehicle types or with other cameras should also become easier.

In the course of its work on PRISM-1, Wayve has created and published the reference dataset "WayveScenes101." It contains sample scenes of roads in the UK and USA with complex, dynamic elements.

In addition to PRISM-1, the company has also previously introduced GAIA-1, a generative AI model that generates synthetic videos of a variety of traffic situations from text, image, video, and action data. With Lingo-1 and Lingo-2, it is also developing multimodal language models that combine machine vision with text-based logic to explain driving decisions.

Summary
  • London-based startup Wayve has developed an AI model called PRISM-1 that can reconstruct three-dimensional street scenes with dynamic elements such as traffic lights, vehicles, and pedestrians from video data to enable more realistic simulations for autonomous vehicle training.
  • PRISM-1 automatically separates static from dynamic elements in the videos and uses visual reasoning techniques to implicitly track movements in the scene and match them to the 3D geometry without the need for explicit annotations or additional sensors.
  • Wayve plans to integrate PRISM-1 into its Ghost Gym driving simulator to accelerate development cycles for its AI driving models, adapt them to under-represented scenarios, and facilitate testing on other vehicle types or with other cameras. The company has also released the WayveScenes101 reference dataset.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.