
Researchers at the Technical University of Munich have developed DiffusionAvatars, a method for creating high-quality 3D avatars with realistic facial expressions.

The system is trained on RGB videos and 3D meshes of human heads. After training, it can animate avatars either by transferring animations from the input videos or by generating facial expressions through a simple control.
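The two ways of driving an avatar described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a facial expression is represented as a short list of numbers (an "expression code"); the names `reenact`, `blend`, and `slider_animation` are hypothetical and not taken from the DiffusionAvatars code. The sketch only covers how the driving signal could be produced, not how images are rendered.

```python
from typing import List, Sequence

# Assumption: an expression is a small vector of floats. This is a stand-in,
# not the representation used by DiffusionAvatars itself.
ExpressionCode = List[float]


def reenact(tracked_codes: Sequence[ExpressionCode]) -> List[ExpressionCode]:
    """Mode 1: reuse the per-frame expression codes tracked from a video."""
    return [list(code) for code in tracked_codes]


def blend(neutral: ExpressionCode, target: ExpressionCode,
          strength: float) -> ExpressionCode:
    """Mode 2: one slider value in [0, 1] moves the face from neutral to target."""
    if len(neutral) != len(target):
        raise ValueError("expression codes must have the same length")
    s = min(max(strength, 0.0), 1.0)  # clamp the slider value to [0, 1]
    return [(1.0 - s) * n + s * t for n, t in zip(neutral, target)]


def slider_animation(neutral: ExpressionCode, target: ExpressionCode,
                     num_frames: int) -> List[ExpressionCode]:
    """Sweep the slider linearly from 0 to 1 over num_frames frames."""
    if num_frames < 2:
        raise ValueError("need at least two frames")
    return [blend(neutral, target, i / (num_frames - 1))
            for i in range(num_frames)]
```

In this toy setup, `reenact` would correspond to copying an existing performance, while `slider_animation([0.0, 0.0], [1.0, -2.0], 5)` produces five codes that move step by step from the neutral face to the target expression.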

DiffusionAvatars combines the image synthesis capabilities of 2D diffusion models with the consistency of 3D neural networks. For the latter, DiffusionAvatars uses so-called "Neural Parametric Head Models" (NPHM) to predict the geometry of the human head. According to the team, these models provide better geometry data than conventional 3D neural models.

DiffusionAvatars has applications in XR and more

According to the team, DiffusionAvatars generates temporally consistent and visually appealing videos for new poses and facial expressions of a person, outperforming existing approaches.

The technology could be used in several areas in the future, including VR/AR applications, immersive videoconferencing, games, film animation, language learning, and virtual assistants. Companies such as Meta and Apple are also researching AI-generated realistic avatars.

However, the technology has its limits: DiffusionAvatars currently bakes lighting into the generated images and offers no control over exposure characteristics. This is a problem for avatars placed in realistic environments, where the lighting on the face should match the scene. In addition, the current architecture is still computationally intensive and therefore not yet suitable for real-time applications.

More information can be found on the DiffusionAvatars project page.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Researchers at the Technical University of Munich developed DiffusionAvatars, a method for creating high-quality 3D avatars with realistic facial expressions by combining 2D diffusion models and 3D neural networks.
  • The system can animate avatars by taking animations from input videos or by generating targeted facial expressions via a simple control. It has potential applications in VR/AR, videoconferencing, gaming, film animation, language learning, and virtual assistants.
  • Despite promising results, DiffusionAvatars still has limitations, such as the lack of control over exposure properties and the computationally intensive architecture, which currently makes it unsuitable for real-time applications.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.