DiffusionLight is a method that estimates lighting in an image using a generated chrome ball.
The researchers have developed a simple but effective technique for estimating illumination from a single input image: they use a diffusion model trained on billions of ordinary images to inpaint a chrome ball into the picture and use its reflections as a light probe.
This method has a variety of applications, including realistic insertion of virtual objects into images, more convincing augmented and virtual reality, realistic visualizations in architecture and interior design, more believable lighting in computer games, and more accurate lighting planning in photography and film.
DiffusionLight uses Stable Diffusion XL
Current lighting estimation techniques rely on neural networks trained on HDR panorama datasets. However, these approaches often struggle with real-world images because the available datasets are limited in both size and variety.
In contrast, DiffusionLight uses the Stable Diffusion XL (SDXL) diffusion model, pre-trained on billions of images, to inpaint a chrome ball into the input image. The method assumes that such models have indirectly learned about high dynamic range and the full span of scene brightness from the underexposed and overexposed images in their training data.
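The core step can be reproduced with off-the-shelf tools. The sketch below uses the Hugging Face diffusers library to inpaint a reflective ball into the center of an image; the checkpoint name, prompt, mask placement, and parameters are illustrative assumptions, not the authors' exact configuration.

```python
import torch
from diffusers import StableDiffusionXLInpaintPipeline
from PIL import Image, ImageDraw

# SDXL inpainting pipeline (checkpoint choice is an assumption).
pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("scene.jpg").convert("RGB").resize((1024, 1024))

# Circular mask marking where the chrome ball should be inpainted.
mask = Image.new("L", image.size, 0)
cx, cy, r = 512, 512, 128  # ball position and size: illustrative values
ImageDraw.Draw(mask).ellipse((cx - r, cy - r, cx + r, cy + r), fill=255)

result = pipe(
    prompt="a perfect mirrored reflective chrome ball sphere",
    image=image,
    mask_image=mask,
    strength=1.0,  # fully regenerate the masked region
).images[0]
result.save("scene_with_ball.png")
```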
However, without further intervention, SDXL generates incorrect or inconsistent balls and cannot readily output images in HDR format. The researchers therefore use a technique called "iterative inpainting" to find an initial diffusion noise map that reliably produces chrome balls of consistent quality.
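The paper's noise-search procedure is more involved, but the mechanism it relies on, fixing the initial noise so that every inpainting call starts from the same point, is easy to illustrate. A minimal sketch, reusing `pipe` and `mask` from the previous example and fixing the noise via a seeded generator (the authors select a noise map directly rather than an arbitrary seed):

```python
import torch
from PIL import Image

# `pipe` and `mask` are set up as in the previous sketch.
SEED = 0  # illustrative; DiffusionLight searches for a good noise map
          # via iterative inpainting instead of picking a random seed

for path in ["scene_a.jpg", "scene_b.jpg"]:
    image = Image.open(path).convert("RGB").resize((1024, 1024))
    ball = pipe(
        prompt="a perfect mirrored reflective chrome ball sphere",
        image=image,
        mask_image=mask,
        strength=1.0,
        # Seeding fixes the initial diffusion noise, which is what keeps
        # the generated balls consistent from image to image.
        generator=torch.Generator("cuda").manual_seed(SEED),
    ).images[0]
    ball.save(path.replace(".jpg", "_ball.png"))
```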
To create HDR chrome balls, the researchers also fine-tune SDXL with LoRA so that it can generate the ball at different exposure values, then combine the resulting LDR chrome balls into a single HDR image.
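Merging the bracketed balls follows standard exposure-fusion logic. Below is a minimal NumPy sketch, assuming float LDR images in [0, 1] and a simple hat-shaped confidence weight; the gamma value and example EV list are assumptions, not necessarily the paper's settings.

```python
import numpy as np

def merge_ldr_to_hdr(ldr_balls, evs, gamma=2.2):
    """Merge exposure-bracketed LDR chrome balls into one HDR ball."""
    acc = np.zeros_like(ldr_balls[0], dtype=np.float64)
    weight_sum = np.zeros_like(acc)
    for ldr, ev in zip(ldr_balls, evs):
        linear = ldr.astype(np.float64) ** gamma  # undo display gamma
        radiance = linear / (2.0 ** ev)           # normalize by exposure
        # Trust mid-tones; downweight clipped shadows and highlights.
        w = 1.0 - np.abs(ldr.astype(np.float64) - 0.5) * 2.0
        acc += w * radiance
        weight_sum += w
    return acc / np.maximum(weight_sum, 1e-8)

# Example usage with three balls generated at decreasing exposure:
# hdr = merge_ldr_to_hdr([ball_0, ball_1, ball_2], [0.0, -2.5, -5.0])
```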
After training, DiffusionLight produces convincing lighting estimates across diverse settings and generalizes well to in-the-wild scenes.
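To actually use the result as a light probe, the mirror-ball image is typically unwrapped into an equirectangular environment map via the reflection geometry of a sphere. The following is a sketch of that standard unwrap, not code from the project; the camera axis and orientation conventions are assumptions.

```python
import numpy as np

def ball_to_equirect(ball, out_h=256, out_w=512):
    """Unwrap an orthographic mirror-ball image to an equirectangular map."""
    # A reflected view direction for every output pixel (y-up spherical).
    theta = (np.arange(out_h) + 0.5) / out_h * np.pi            # polar
    phi = (np.arange(out_w) + 0.5) / out_w * 2 * np.pi - np.pi  # azimuth
    phi, theta = np.meshgrid(phi, theta)
    refl = np.stack([np.sin(theta) * np.sin(phi),
                     np.cos(theta),
                     -np.sin(theta) * np.cos(phi)], axis=-1)
    # The sphere normal is the half-vector between the reflected direction
    # and the direction back toward the camera, assumed here to be +z.
    n = refl + np.array([0.0, 0.0, 1.0])
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8
    # Orthographic camera: the normal's x/y components index the ball image.
    h, w = ball.shape[:2]
    u = ((n[..., 0] + 1.0) / 2.0 * (w - 1)).astype(int)
    v = ((1.0 - (n[..., 1] + 1.0) / 2.0) * (h - 1)).astype(int)
    return ball[v, u]
```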
More information and examples can be found on the DiffusionLight project page.