AI research

Meta's AITemplate delivers speedups for models like Stable Diffusion

Maximilian Schreiner

Meta's AITemplate can execute code from the PyTorch AI framework up to twelve times faster. Image AI systems such as Stable Diffusion, among others, benefit significantly.

Meta's AITemplate (AIT) is a unified inference system with separate acceleration backends for AMD and Nvidia GPUs. It can perform high-performance inference on both GPU vendors' hardware - without the need for an entirely new implementation of the AI model that would otherwise be required when switching vendors.
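For illustration, here is a minimal sketch of how a model might be defined and compiled with AITemplate's Python frontend, loosely following the examples in Meta's repository. The module paths and names used here (nn, Tensor, detect_target, compile_model, the _attrs output markers) are taken from those examples and may differ between versions; treat this as a sketch, not a definitive reference.

```python
# Minimal sketch, loosely based on the examples in Meta's AITemplate repository.
# Module paths and function names (detect_target, compile_model) are assumptions
# drawn from those examples and may differ between AITemplate versions.
from aitemplate.compiler import compile_model
from aitemplate.frontend import nn, Tensor
from aitemplate.testing import detect_target


class SimpleNet(nn.Module):
    """A tiny fully connected block defined with AITemplate's frontend ops."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(64, 32)

    def forward(self, x):
        return self.fc(x)


# Symbolic input tensor; AITemplate targets fp16 on Tensor/Matrix cores.
x = Tensor(shape=[1, 64], dtype="float16", name="input0", is_input=True)

model = SimpleNet()
y = model(x)
y._attrs["name"] = "output0"
y._attrs["is_output"] = True

# detect_target() is expected to pick the CUDA backend on Nvidia GPUs and the
# ROCm backend on AMD GPUs, so the same model definition compiles for either vendor.
target = detect_target()
module = compile_model(y, target, "./tmp", "simple_net")
```

The result of compilation is generated C++ GPU template code for the detected backend, wrapped in a module that can then be fed real tensors for inference.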

Meta is making AITemplate available as open source and promises near hardware-native Tensor Core (Nvidia) and Matrix Core (AMD) performance for a variety of common AI models such as CNNs, Transformers, and Diffusion models.

AITemplate is up to twelve times faster, according to Meta

As a Python framework, AITemplate converts AI models into high-performance C++ GPU template code, speeding up inference. According to Meta, AITemplate can speed up AI inference by up to 12x on Nvidia GPUs and up to 4x on AMD GPUs compared with PyTorch's eager mode. In eager mode, which is PyTorch's default, each operation is executed immediately as it is called, without ahead-of-time graph compilation or optimization.
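To make the eager-mode comparison concrete, here is a small plain-PyTorch snippet (not AITemplate code): each line launches its own operation and returns a concrete result immediately, with no whole-model optimization such as kernel fusion applied beforehand.

```python
# Rough illustration of PyTorch eager execution (not AITemplate code):
# every line runs immediately as its own operation, so nothing is fused
# or optimized across the whole model ahead of time.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1, 64, device=device)
w = torch.randn(32, 64, device=device)

h = x @ w.t()        # executed right away as a matmul
h = h + 1.0          # separate elementwise operation
y = torch.relu(h)    # another separate elementwise operation

print(y.shape)  # torch.Size([1, 32]) -- results are available immediately
```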

Meta's AITemplate accelerates ResNet50 inference by a factor of 12 on an Nvidia A100 GPU at a low batch size. | Image: Meta

The framework offers numerous performance innovations, according to Meta, including advanced kernel fusion, an optimization method that combines multiple kernels into a single kernel to run more efficiently, and advanced optimizations for transformer blocks.
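Kernel fusion can be illustrated with a deliberately simplified analogy in plain Python, which is not AITemplate's actual implementation: the unfused version makes two passes over the data and stores an intermediate result, while the fused version does the bias add and ReLU in a single pass, which is the kind of memory-traffic saving a fused GPU kernel aims for.

```python
# Deliberately simplified analogy for kernel fusion (not AITemplate's code):
# the unfused version makes two passes over the data and materializes an
# intermediate list; the fused version does bias-add + ReLU in one pass.
data = [0.3, -1.2, 2.5, -0.4]
bias = 0.5

def unfused(values, b):
    tmp = [v + b for v in values]        # "kernel" 1: bias add, writes tmp
    return [max(t, 0.0) for t in tmp]    # "kernel" 2: ReLU, reads tmp back

def fused(values, b):
    # one pass, no intermediate buffer -- the saving a fused GPU kernel targets
    return [max(v + b, 0.0) for v in values]

assert unfused(data, bias) == fused(data, bias)
```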

Meta's open-source framework can accelerate Stable Diffusion

Meta also ships commonly used models out of the box with AITemplate, including Vision Transformer, BERT, Stable Diffusion, ResNet, and MaskRCNN. The generative image AI system Stable Diffusion (SD) runs about 2.4 times faster with AIT on an Nvidia GPU; in a test on an RTX 3080, this also allowed working around an out-of-memory error even at high SD settings.

In practice, Meta's AIT can thus accelerate image generation and processing with Stable Diffusion or enable higher resolutions, for example. The implementation of AIT in common solutions like Stable Diffusion WebUI is probably only a matter of time.

According to Meta, the release of AITemplate is also just the beginning of a long series of planned releases on the road to building a powerful AI inference engine. Further optimizations are planned, as well as expansion to other hardware systems such as Apple M-series GPUs and CPUs from other companies.

Meta's AITemplate is available on GitHub.
