Maximilian Schreiner

Snap's new SnapGen AI can create high-res images in seconds on your phone

Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.
A team of researchers, including some from Snap Inc., the company behind Snapchat, has developed an AI image generator that can run directly on phones.

Their new system, called SnapGen, can create high-resolution images in just seconds on high-end phones, the team says.

The key advance is how small the team has made the model. While popular image generators like SDXL use about 2.6 billion parameters, SnapGen needs just 379 million, making it about seven times smaller. That's even more compact than Huawei's PixArt-α, another lightweight AI model optimized for phone use.
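The size comparison can be checked with quick arithmetic. The short Python sketch below uses only the two parameter counts quoted above; the memory estimate assumes weights are stored as 16-bit floats (2 bytes per parameter), which is an illustrative assumption and not a figure from the researchers.

```python
# Parameter counts as reported in the article.
SDXL_PARAMS = 2.6e9
SNAPGEN_PARAMS = 379e6


def size_ratio(large: float, small: float) -> float:
    """How many times larger the first model is than the second."""
    return large / small


def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Rough weight storage in GB.

    Assumes 2 bytes per parameter (16-bit floats); this is an
    illustrative assumption, not a number from the paper.
    """
    return params * bytes_per_param / 1e9


ratio = size_ratio(SDXL_PARAMS, SNAPGEN_PARAMS)
print(f"SDXL is about {ratio:.1f}x larger than SnapGen")
print(f"SnapGen weights at 16 bit: ~{weight_memory_gb(SNAPGEN_PARAMS):.2f} GB")
print(f"SDXL weights at 16 bit: ~{weight_memory_gb(SDXL_PARAMS):.2f} GB")
```

Under that assumption, the gap in weight storage alone helps explain why a model of this size is a more realistic fit for a phone.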

Same quality in a smaller package

According to Snap's team, making the model smaller hasn't hurt its performance. In fact, their tests show that it might actually perform better than its larger competitors.

"We achieve an extremely efficient T2I model that comprehensively outperforms many existing multi-billion parameter models such as SDXL, Lumina-Next, and Playgroundv2," the team writes.

On the GenEval benchmark, which measures how well generated images match their text prompts, SnapGen scored 0.66, outperforming SDXL's score of 0.55.

Comparison gallery of various AI models with seven image prompts; SnapGen results (first column) show high quality despite compact model size.
SnapGen shows good results compared to much larger models like SDXL or Playground v2. | Image: Chen et al.

The system really shines when it comes to speed. Previous AI image generators were either too slow or too large to work well on phones, but SnapGen can generate a high-resolution 1024×1024 pixel image in about 1.4 seconds on an iPhone 16 Pro Max.

A demo app for iOS shows the system's performance in action. | Video: Snap Inc.

The team says it achieved these improvements by "systematically examining the design choices of the network architecture to reduce model parameters and latency while ensuring high-quality generation." They also streamlined the decoder (the part that turns AI output into finished images), making it 36 times smaller than similar systems.


To make their smaller model work as well as the larger ones, the researchers let it learn from larger AI systems such as SD3 and SD3.5, a technique known as knowledge distillation. To speed up image generation, they also used SD3.5-Large-Turbo, the few-step version of SD3.5, as a teacher. In addition, they developed a special training process that can recognize when certain tasks are harder for the smaller model to learn and adjusts the teaching process accordingly.
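The general idea of a small model learning from a larger one can be sketched in a few lines. The Python example below is a minimal illustration and not the researchers' actual method: the function names, the plain mean-squared-error losses, and the difficulty-based weighting are all hypothetical choices made only to show how a teacher's output can be added to the training objective, and how that term could be scaled up where the student struggles to match the teacher.

```python
def mse(a: list[float], b: list[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student_out: list[float],
    teacher_out: list[float],
    target: list[float],
    base_weight: float = 1.0,
) -> float:
    """Combine the ordinary training loss with a teacher-matching loss.

    The weighting scheme is a hypothetical stand-in for "adjusting the
    teaching process": the further the student is from the teacher
    relative to its ordinary loss, the more the teacher term counts.
    """
    task = mse(student_out, target)        # ordinary training loss
    distill = mse(student_out, teacher_out)  # gap between student and teacher
    # Share of the total error that comes from the student-teacher gap (0..1).
    difficulty = distill / (distill + task + 1e-8)
    weight = base_weight * (1.0 + difficulty)
    return task + weight * distill


# Toy vectors standing in for model outputs.
perfect = distillation_loss([1.0, 2.0], [1.0, 2.0], [1.0, 2.0])
struggling = distillation_loss([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])
print(perfect, struggling)
```

In a real training setup the vectors would be model predictions and the loss would be minimized with gradient descent; the toy numbers here only show that a student far from its teacher receives a larger penalty.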

  • Snap Inc. researchers have created SnapGen, an AI image generator capable of producing high-resolution images within seconds directly on high-end mobile phones.
  • Despite having only 379 million parameters, making it approximately seven times smaller than well-known systems like SDXL, SnapGen surpasses them in image quality and generation speed, taking just 1.4 seconds to create a 1024×1024 pixel image on an iPhone 16 Pro Max.
  • The significant efficiency gains were achieved by thoroughly redesigning the neural network architecture, removing unnecessary computations, and extensively optimizing the decoder responsible for converting the AI output into final images.
Sources
Arxiv
