
Meta's AITemplate can execute models from the PyTorch AI framework up to twelve times faster. Image AI systems such as Stable Diffusion are among those that benefit significantly.

Meta's AITemplate (AIT) is a unified inference system with separate acceleration backends for AMD and Nvidia GPUs. It delivers high-performance inference on both vendors' hardware without the complete reimplementation of the AI model that switching vendors would otherwise require.
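To illustrate this single-definition, multi-backend approach, here is a rough sketch of AITemplate's Python compile flow, modeled on the examples in the open-source repository. The module paths and signatures shown (compile_model, detect_target, the Tensor constructor, the _attrs output-marking pattern) are assumptions drawn from those examples and should be checked against the current documentation.

# Rough sketch of AITemplate's compile flow, modeled on the repository's
# examples; module paths and signatures are assumptions, not verified API.
from aitemplate.compiler import compile_model
from aitemplate.frontend import nn, Tensor
from aitemplate.testing import detect_target

class SimpleNet(nn.Module):
    """A single linear layer, defined once for both GPU vendors."""
    def __init__(self, hidden: int):
        super().__init__()
        self.dense = nn.Linear(hidden, hidden)

    def forward(self, x):
        return self.dense(x)

# Symbolic input tensor; batch size and width are example values.
x = Tensor(shape=[8, 512], name="input0", is_input=True)
y = SimpleNet(512)(x)
y._attrs["name"] = "output0"
y._attrs["is_output"] = True

# detect_target() selects the CUDA backend on Nvidia hardware and the
# ROCm backend on AMD hardware, so one model definition serves both.
target = detect_target()
module = compile_model(y, target, "./workdir", "simple_net")

The result of compilation is a self-contained C++ artifact that can then be invoked from Python with concrete input tensors.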

Meta is making AITemplate available as open source and promises near hardware-native Tensor Core (Nvidia) and Matrix Core (AMD) performance for a variety of common AI models such as CNNs, Transformers, and Diffusion models.

AITemplate is up to twelve times faster, according to Meta

As a Python framework, AITemplate converts AI models into high-performance C++ GPU template code, speeding up inference. According to Meta, AITemplate can accelerate AI inference by up to 12x on Nvidia GPUs and up to 4x on AMD GPUs compared to eager mode in PyTorch. In eager mode, which PyTorch uses by default, each operation is executed immediately when it is called, without any ahead-of-time graph compilation or cross-operation optimization.
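For context, a minimal PyTorch snippet shows what eager execution means in practice: each operation launches its GPU kernel the moment the Python line runs, with nothing compiled or fused ahead of time.

import torch

# PyTorch's default eager mode: each call runs immediately as the
# Python line executes - there is no graph capture or fusion step.
x = torch.randn(1, 3, 224, 224)
w = torch.randn(64, 3, 7, 7)
y = torch.nn.functional.conv2d(x, w, stride=2, padding=3)  # executes here
print(y.shape)  # torch.Size([1, 64, 112, 112])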

Meta's AITemplate accelerates ResNet50 inference by a factor of 12 on an Nvidia A100 GPU at a low batch size. | Image: Meta

The framework offers numerous performance innovations, according to Meta, including advanced kernel fusion, an optimization that merges several GPU kernels into a single one to cut memory traffic and launch overhead, and advanced optimizations for transformer blocks.
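As an illustration of what kernel fusion buys, the sketch below contrasts an unfused elementwise chain with a compiled version. Note that torch.compile is used here only as a stand-in for the fusion idea; AITemplate performs its fusion in the C++/CUDA template code it generates, not through PyTorch.

import torch

# Unfused: three separate elementwise kernels, each reading and writing
# the full tensor to GPU memory in between.
def gelu_bias_add(x, bias, residual):
    t = x + bias                     # kernel 1: add bias
    t = torch.nn.functional.gelu(t)  # kernel 2: activation
    return t + residual              # kernel 3: residual add

# A fusing compiler emits one kernel that streams the data once and
# applies all three steps in registers, avoiding the intermediate
# round trips to memory.
fused = torch.compile(gelu_bias_add)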

Meta's open-source framework can accelerate Stable Diffusion

Meta also ships commonly used models out of the box with AITemplate, including Vision Transformer, BERT, Stable Diffusion, ResNet, and MaskRCNN. The generative image AI system Stable Diffusion (SD) runs about 2.4 times faster with AIT on an Nvidia GPU; in one test on an RTX 3080, it even avoided an out-of-memory error at high SD settings.

In practice, Meta's AIT can thus speed up image generation and processing with Stable Diffusion, or make higher resolutions feasible. Integration of AIT into popular tools such as the Stable Diffusion WebUI is probably only a matter of time.

According to Meta, the release of AITemplate is also just the beginning of a long series of planned releases on the road to building a powerful AI inference engine. Further optimizations are planned, as well as expansion to other hardware systems such as Apple M-series GPUs and CPUs from other companies.

Meta's AITemplate is available on GitHub.

Summary
  • Meta releases the open-source Python framework AITemplate.
  • AITemplate accelerates AI inference on Nvidia and AMD GPUs many times over.
  • To this end, AITemplate transforms AI models into high-performance C++ GPU template code.
  • Stable Diffusion also benefits from Meta's framework and runs 2.4 times faster.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.