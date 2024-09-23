AI research
Maximilian Schreiner

Meta's new AI creates custom images from a single photo without extra training

Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.

Meta's new AI model "Imagine Yourself" can generate a variety of personalized images from a single reference image, without additional training.


Meta has introduced a new AI model called "Imagine Yourself" that generates personalized images from a single reference image without requiring additional training. From that one image, the model can create multiple new images of a person in various poses, styles, and environments.

Unlike previous approaches that needed to be retrained for each individual, "Imagine Yourself" operates without person-specific training. The model simultaneously processes the reference image and text instruction, allowing it to adapt flexibly to new people and instructions.

Meta relies on synthetic training data

To achieve this, Meta employs several novel techniques. Firstly, "Imagine Yourself" is trained on synthetic pairs: for real reference images, the team generates corresponding synthetic variants. This enables the model to learn how to portray individuals in different poses and styles without adhering too closely to the reference image.
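The idea of a synthetic training pair can be pictured with a small data structure. The following is only an illustrative sketch, not Meta's pipeline: the `TrainingPair` class, the `build_pairs` helper, the file names, and the prompts are all hypothetical, and the step that would actually render the synthetic image is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingPair:
    reference: str  # path to the real reference photo (hypothetical)
    target: str     # path to a synthetic variant of the same person (hypothetical)
    prompt: str     # text instruction the synthetic target is meant to satisfy


def build_pairs(reference: str, prompts: List[str]) -> List[TrainingPair]:
    """Pair one real reference photo with one synthetic target per prompt."""
    stem = reference.rsplit(".", 1)[0]
    pairs = []
    for index, prompt in enumerate(prompts):
        # Placeholder naming scheme; a real pipeline would generate this image.
        target = f"{stem}_synthetic_{index}.png"
        pairs.append(TrainingPair(reference, target, prompt))
    return pairs


pairs = build_pairs(
    "person_001.jpg",
    ["smiling on a beach", "oil painting, looking left"],
)
for pair in pairs:
    print(pair.reference, "->", pair.target, "|", pair.prompt)
```

Because the target differs from the reference in pose or style, a model trained on such pairs is not rewarded for simply reproducing the input photo.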


Secondly, the model features a new architecture with three parallel text processing modules and a trainable image processing module. These modules process image and text concurrently, facilitating better coordination between the two. Meta also applies multi-stage fine-tuning, training the model alternately with real and synthetic data to optimize identity preservation and instruction compliance.

The image shows the architecture of
Image: Meta
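As a rough illustration of the two ideas described above, the sketch below feeds a prompt through three stand-in text encoders and a reference image through one stand-in image encoder, then combines the results into a single conditioning vector. It also builds an alternating fine-tuning schedule. Everything here is an assumption made for illustration: the toy encoders, the feature sizes, the use of plain concatenation, and the choice to start with real data are not taken from Meta's description.

```python
from typing import Callable, List

Vector = List[float]


def make_toy_encoder(dim: int) -> Callable[[str], Vector]:
    """Return a stand-in encoder; a real one would be a neural network."""
    def encode(data: str) -> Vector:
        # Deterministic placeholder features, not a real embedding.
        return [float(len(data) + i) for i in range(dim)]
    return encode


# Three parallel text modules and one image module (sizes are arbitrary).
text_encoders = [make_toy_encoder(2), make_toy_encoder(3), make_toy_encoder(4)]
image_encoder = make_toy_encoder(5)


def build_conditioning(prompt: str, reference_image: str) -> Vector:
    """Encode text and image side by side and merge the features."""
    text_features = [f for encode in text_encoders for f in encode(prompt)]
    image_features = image_encoder(reference_image)
    # Concatenation is an assumption; the article does not say how features are fused.
    return text_features + image_features


def stage_schedule(num_stages: int) -> List[str]:
    """Alternate between real and synthetic data, starting with real (assumed)."""
    return ["real" if stage % 2 == 0 else "synthetic" for stage in range(num_stages)]


conditioning = build_conditioning("a person jumping", "person_001.jpg")
print(len(conditioning))   # 2 + 3 + 4 + 5 = 14 features
print(stage_schedule(4))   # ['real', 'synthetic', 'real', 'synthetic']
```

The point of the sketch is only the data flow: text and image are encoded independently and then presented to the generator together, while the training data source switches from stage to stage.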

According to Meta, "Imagine Yourself" outperforms existing approaches like InstantID or IP-Adapter in executing complex instructions that require significant changes to the reference image. For instance, the model can alter a person's facial expression or head posture and place them in entirely new environments.

"Imagine Yourself" still has weaknesses and is not yet available

However, the study also reveals that competing models occasionally surpass "Imagine Yourself" in terms of identity preservation. Meta attributes this to the fact that these models often simply copy parts of the reference image, which can produce unnatural-looking images.

"Imagine Yourself" can also be extended to generate images featuring multiple people. To accomplish this, the image information from several reference images is processed in parallel, enabling the creation of group photos with known individuals in new poses and environments.

Meta intends to continue researching "Imagine Yourself," with future priorities including the extension to video generation and the improvement of very complex poses such as jumps. The model and code are not yet publicly available.

Summary
  • Meta has developed a new AI model called "Imagine Yourself" that can generate multiple personalized images of a person from a single reference image without the need for additional training.
  • The model uses synthetic training data and a new architecture with parallel text and image processing modules. This allows it to respond flexibly to new people and instructions, and to make complex changes to the reference image.
  • According to Meta, "Imagine Yourself" can implement complex instructions better than existing approaches, but still has weaknesses in identity preservation. The company plans to continue its research and extend the model to video generation.
Sources
Arxiv Meta