Meta's new AI creates custom images from a single photo without extra training

Meta's new AI model "Imagine Yourself" can generate a variety of personalized images from a single reference image, without additional training.

Meta has introduced a new AI model called "Imagine Yourself" that generates personalized images of a person without any per-person training. From one reference image, the model can create multiple new images of that person in various poses, styles, and environments.

Unlike previous approaches that needed to be retrained for each individual, "Imagine Yourself" operates without person-specific training. The model simultaneously processes the reference image and text instruction, allowing it to adapt flexibly to new people and instructions.

Meta relies on synthetic training data

To achieve this, Meta uses several new techniques. First, "Imagine Yourself" is trained on synthetic pairs: for real reference images, synthetic variants of the same person are generated. This lets the model learn to portray individuals in different poses and styles without sticking too closely to the reference image.

Second, the model has a new architecture with three parallel text processing modules and a trainable image processing module. These modules process image and text concurrently, which improves coordination between the two. Meta also applies multi-stage fine-tuning, training the model alternately on real and synthetic data to balance identity preservation against instruction compliance.
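As a rough illustration of the alternating fine-tuning idea described above, the following Python sketch builds a stage-by-stage schedule that switches between real and synthetic data. It is only a toy under stated assumptions: the function name `alternating_schedule`, the number of stages, and the choice to start with real data are all hypothetical and do not come from Meta, whose model and code are not public.

```python
from itertools import cycle, islice


def alternating_schedule(num_stages, sources=("real", "synthetic")):
    """Return the data source for each fine-tuning stage, cycling through `sources`.

    Hypothetical helper: the stage count and the starting source are assumptions,
    not details published by Meta.
    """
    if num_stages < 0:
        raise ValueError("num_stages must be non-negative")
    # cycle() repeats the sources endlessly; islice() cuts it to the stage count.
    return list(islice(cycle(sources), num_stages))


# Example: four stages, alternating between the two kinds of data.
schedule = alternating_schedule(4)
for stage, source in enumerate(schedule, start=1):
    print(f"stage {stage}: fine-tune on {source} data")
```

In a real training setup, each stage would run an actual fine-tuning pass on the corresponding dataset; here the schedule only records which kind of data each stage would see.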

The image shows the architecture of the ... | Image: Meta

According to Meta, "Imagine Yourself" outperforms existing approaches such as InstantID and IP-Adapter on complex instructions that require significant changes to the reference image. For instance, the model can alter a person's facial expression or head pose and place them in entirely new environments.

"Imagine Yourself" still has weaknesses and is not yet available

However, the study also shows that competing models occasionally surpass "Imagine Yourself" at identity preservation. Meta attributes this to those models often simply copying parts of the reference image, which can lead to unnatural-looking results.

"Imagine Yourself" can also be extended to generate images featuring multiple people. To accomplish this, the image information from several reference images is processed in parallel, enabling the creation of group photos with known individuals in new poses and environments.
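The multi-person setup described above can be pictured with a small Python sketch in which each reference image is encoded independently and in parallel, and the results are then bundled with the text prompt. Everything here is a stand-in: `encode_reference` and `build_group_conditioning` are hypothetical names, and the "encoder" just returns a labelled placeholder rather than real image features, since Meta has not released the model.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_reference(image_name):
    """Hypothetical stand-in for an image encoder; returns a placeholder record."""
    return {"source": image_name, "identity": f"identity-features({image_name})"}


def build_group_conditioning(reference_images, prompt):
    """Encode every reference image in parallel and bundle the results with the prompt.

    Assumption: one identity record per person, kept in input order.
    """
    with ThreadPoolExecutor() as pool:
        # pool.map runs the encoder concurrently and preserves input order.
        identities = list(pool.map(encode_reference, reference_images))
    return {"prompt": prompt, "identities": identities}


conditioning = build_group_conditioning(
    ["person_a.jpg", "person_b.jpg"],
    "two friends hiking in the mountains",
)
print(len(conditioning["identities"]), "identities prepared for one group image")
```

The point of the sketch is only the data flow: several references go in side by side, and a single combined conditioning object comes out for one generated image.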

Meta intends to continue researching "Imagine Yourself." Future priorities include extending it to video generation and improving its handling of very complex poses such as jumps. The model and code are not yet publicly available.
