DiffusionAvatars generates realistic 3D avatars

Maximilian Schreiner

Image: Kirschstein et al.

Researchers at the Technical University of Munich have developed DiffusionAvatars, a method for creating high-quality 3D avatars with realistic facial expressions.

The system was trained on RGB videos and 3D meshes of human heads. After training, it can animate avatars either by transferring expressions from the input videos or by generating facial expressions via simple controls.

DiffusionAvatars combines the image synthesis capabilities of 2D diffusion models with the 3D consistency of a neural head model. For the latter, DiffusionAvatars uses a so-called "Neural Parametric Head Model" (NPHM) to predict the geometry of the human head. According to the team, this model provides better geometry data than conventional 3D head models.
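To make the idea of geometry-conditioned image diffusion more concrete, here is a minimal, purely illustrative PyTorch sketch. It is not the authors' architecture: the class name, layer sizes, and random tensors are assumptions for illustration. The point is simply that a rendered head-geometry map (such as the output of a parametric head model) can be fed into the denoiser alongside the noisy image, so the generated frame follows the predicted geometry.

```python
import torch
import torch.nn as nn

class GeometryConditionedDenoiser(nn.Module):
    """Toy denoiser: concatenates a noisy RGB frame with a rendered
    geometry map (e.g. depth/normals from a head model) and predicts noise."""
    def __init__(self, img_channels=3, geom_channels=3, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(img_channels + geom_channels, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, img_channels, 3, padding=1),
        )

    def forward(self, noisy_img, geom_map):
        # Geometry conditioning: the geometry rendering is stacked onto the
        # noisy frame as extra input channels before denoising.
        return self.net(torch.cat([noisy_img, geom_map], dim=1))

# One illustrative denoising step on random stand-in data
model = GeometryConditionedDenoiser()
noisy_frame = torch.randn(1, 3, 64, 64)   # noisy avatar frame
geom_render = torch.randn(1, 3, 64, 64)   # stand-in for a rendered head-geometry map
predicted_noise = model(noisy_frame, geom_render)
denoised = noisy_frame - predicted_noise  # simplified update, not a real sampler
print(denoised.shape)  # torch.Size([1, 3, 64, 64])
```

In the actual system, the geometry comes from NPHM and the denoiser is a full diffusion model with a proper noise schedule; the sketch only shows where geometry conditioning enters the pipeline.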

DiffusionAvatars has applications in XR and more

According to the team, DiffusionAvatars generates temporally consistent and visually appealing videos for new poses and facial expressions of a person, outperforming existing approaches.

The technology could be used in several areas in the future, including VR/AR applications, immersive videoconferencing, games, film animation, language learning, and virtual assistants. Companies such as Meta and Apple are also researching AI-generated realistic avatars.

However, the technology has its limits: DiffusionAvatars currently bakes lighting into the generated images and offers no control over illumination, which is a problem for placing avatars in realistic environments. In addition, the current architecture is computationally intensive and therefore not yet suitable for real-time applications.

More information can be found on the DiffusionAvatars project page.
