Google shows AI-rendered streetscapes. They could help autonomous cars find their way and enable Google Maps Street View in real-time 3D.
Neural Radiance Fields (NeRFs) are one of numerous AI technologies that could one day replace classic rendering approaches. NeRFs are neural networks that learn representations of 3D objects and scenes from multiple photos and can then render them in 3D from new viewpoints.
Google researchers demonstrated several advances in the last two years, such as extremely detailed views of landmarks or real-time rendering of NeRFs that otherwise require several seconds of computation per image (see video below).
Nvidia is also researching NeRFs and recently unveiled Instant-NGP, a method that greatly accelerates the otherwise time- and compute-intensive training of these networks. Artificial intelligence is thus increasingly becoming an alternative to traditional rendering methods.
Block-NeRF renders extensive 3D scene for the first time
Until now, however, NeRFs have only been used to render individual objects or rooms. With Block-NeRF, Google is now demonstrating an approach that can render the largest 3D scene ever rendered with AI: the Alamo Square neighborhood in San Francisco, comprising eight streets.
This is made possible by a network of multiple NeRFs, each responsible for a separate block in the city. The split decouples render time from scene size, allows scaling to environments of any size, and allows per-block updates for changes such as construction sites.
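To illustrate why such a split decouples render time from scene size, here is a minimal sketch in which only the blocks near the camera are queried. The circular coverage areas, the 100-meter spacing, and the names `Block` and `select_blocks` are assumptions made for illustration and are not taken from Google's implementation.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for one trained NeRF covering a circular area."""
    name: str
    x: float       # block center, in meters
    y: float
    radius: float  # coverage radius, in meters


def select_blocks(blocks, cam_x, cam_y):
    """Return the blocks whose coverage area contains the camera position."""
    return [
        b for b in blocks
        if math.hypot(b.x - cam_x, b.y - cam_y) <= b.radius
    ]


# Assumed layout: one block every 100 m along a street, each covering 75 m.
blocks = [Block(f"block_{i}", x=100.0 * i, y=0.0, radius=75.0) for i in range(8)]

# Only the blocks around the camera are rendered, however long the street is.
active = select_blocks(blocks, cam_x=140.0, cam_y=0.0)
print([b.name for b in active])
```

In this toy layout, adding more blocks to the street does not change how many are active for a given camera position, which is the property the split is meant to provide.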
Camera cars provide training data
The Block-NeRFs were trained with nearly 2.8 million images captured over three months by camera-equipped cars. Because the recordings cover different lighting and weather conditions, the NeRF networks can also display the street scenes under those different conditions.
Pedestrians, cars and other changing objects are automatically filtered out by the system during AI training. However, shadows of vehicles are still visible in some images, and changing vegetation results in washed-out trees and bushes along the roadway.
Each individual NeRF was trained for between nine and 24 hours on 32 of Google's TPUv3 chips. Rendering a 1200 by 900 pixel image with a single NeRF takes 5.9 seconds.
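For a sense of scale, these figures can be turned into a quick back-of-the-envelope calculation. The sketch assumes that all 32 chips are busy for the entire training run and treats render time as spread evenly across pixels; neither assumption is stated by the team.

```python
# Figures quoted in the article.
width, height = 1200, 900
render_seconds = 5.9
chips = 32
train_hours_min, train_hours_max = 9, 24

# Derived values (simplifying assumptions noted in the text above).
pixels = width * height
pixels_per_second = pixels / render_seconds
chip_hours_min = chips * train_hours_min
chip_hours_max = chips * train_hours_max

print(f"{pixels:,} pixels per image")
print(f"{pixels_per_second:,.0f} pixels per second")
print(f"{chip_hours_min} to {chip_hours_max} TPU chip-hours per NeRF")
```

That works out to about 1.08 million pixels per image, roughly 183,000 pixels per second, and between 288 and 768 chip-hours of training per NeRF.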
Multiple Block-NeRFs can render in parallel, which is necessary in scenes where the blocks overlap. Distant parts of the scene currently also appear washed out. However, improvements are already planned, the team said.
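Where blocks overlap, their renderings have to be merged into a single image. One plausible way to do this is inverse-distance weighting, in which the block closest to the camera contributes most. The sketch below shows that idea for a single pixel; the article does not say which blending method is used, and the function names, the `power` value, and the sample colors are assumptions for illustration.

```python
def blend_weights(distances, power=4.0, eps=1e-6):
    """Inverse-distance weights, normalized so that they sum to 1."""
    # eps guards against division by zero when the camera sits on a block center.
    raw = [max(d, eps) ** -power for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def blend_pixel(colors, weights):
    """Weighted average of RGB tuples, one tuple per overlapping block."""
    return tuple(
        sum(w * color[channel] for w, color in zip(weights, colors))
        for channel in range(3)
    )


# Assumed example: the camera is 40 m and 60 m from two block centers, and
# each block has rendered a slightly different color for the same pixel.
distances = [40.0, 60.0]
colors = [(0.8, 0.2, 0.2), (0.6, 0.4, 0.2)]

weights = blend_weights(distances)
pixel = blend_pixel(colors, weights)
print(weights, pixel)
```

A higher `power` makes the handover between neighboring blocks sharper, while a lower one blends them over a wider area.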
NeRFs could enable Google Maps Street View in 3D
Google cites the training of autonomous vehicles and other robots, as well as support for aerial mapping, as potential applications for Block-NeRFs. The project was developed in cooperation with Waymo, the Alphabet company that specializes in autonomous driving.
The detailed 3D environments can be used, for example, for planning and testing routes. In the future, additional NeRFs could also dynamically represent individual vehicles in the rendered scene to simulate traffic, the team said.
Google also plans to incorporate training and rendering time improvements into Block-NeRF, making it much faster and more energy-efficient to render entire streets. That could open up new use cases for Block-NeRFs, such as a 3D variant of the Street View service in Google Maps.
There are more render examples on the Waymo project page for Block-NeRF.