
Meta's AITemplate can execute code from the PyTorch AI framework up to twelve times faster. Image AI systems such as Stable Diffusion are among those that benefit significantly.

Meta's AITemplate (AIT) is a unified inference system with separate acceleration backends for AMD and Nvidia GPUs. It can perform high-performance inference on hardware from both GPU vendors without the entirely new implementation of the AI model that switching vendors would otherwise require.

Meta is making AITemplate available as open source and promises near hardware-native Tensor Core (Nvidia) and Matrix Core (AMD) performance for a variety of common AI models such as CNNs, Transformers, and Diffusion models.

AITemplate is up to twelve times faster, according to Meta

AITemplate is a Python framework that converts AI models into high-performance C++ GPU template code, which speeds up inference. According to Meta, AITemplate can speed up AI inference by up to 12x on Nvidia GPUs and up to 4x on AMD GPUs compared to eager mode in PyTorch. In eager mode, operations are executed immediately as they are called, rather than being compiled into an optimized graph first. PyTorch uses eager execution by default.
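To make the difference between eager execution and ahead-of-time code generation more concrete, here is a minimal sketch in plain Python. It is not AITemplate's API or its actual output: the `OPS` table, the `CPP_TEMPLATE` string, and the functions `run_eager` and `generate_cpp` are hypothetical names invented for this illustration, and the generated C++ is a simple CPU loop rather than real GPU template code.

```python
from string import Template

# Hypothetical C++ template for illustration only; AITemplate's real
# templates target GPU libraries and look very different.
CPP_TEMPLATE = Template("""\
// generated for model: $name
void ${name}_forward(const float* x, float* y, int n) {
  for (int i = 0; i < n; ++i) {
    float v = x[i];
$body
    y[i] = v;
  }
}
""")

# Each toy op maps to (Python function for eager mode, C++ statement for codegen).
OPS = {
    "scale2": (lambda v: v * 2.0, "v = v * 2.0f;"),
    "add1": (lambda v: v + 1.0, "v = v + 1.0f;"),
    "relu": (lambda v: max(v, 0.0), "v = v > 0.0f ? v : 0.0f;"),
}


def run_eager(model, xs):
    """Eager style: each op runs immediately, one after another, in Python."""
    for op in model:
        xs = [OPS[op][0](v) for v in xs]
    return xs


def generate_cpp(name, model):
    """Ahead-of-time style: emit C++ source for the whole model before running it."""
    body = "\n".join("    " + OPS[op][1] for op in model)
    return CPP_TEMPLATE.substitute(name=name, body=body)


model = ["scale2", "add1", "relu"]
eager_result = run_eager(model, [-2.0, 0.5])
cpp_source = generate_cpp("toy", model)
print(eager_result)
print(cpp_source)
```

The eager path interprets the model step by step every time it runs, while the generated source could be compiled once and reused. That is the general idea behind converting a model to C++ code, under the simplifying assumptions stated above.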

Meta's AITemplate accelerates ResNet50 inference by a factor of 12 on an Nvidia A100 GPU at a low batch size. | Image: Meta

According to Meta, the framework offers numerous performance innovations. These include advanced kernel fusion, an optimization method that combines multiple kernels into a single kernel so that they run more efficiently, and advanced optimizations for transformer blocks.
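The idea behind kernel fusion can be illustrated with a small, purely conceptual sketch in plain Python. It does not show how AITemplate fuses GPU kernels. The functions `scale`, `add_bias`, `relu`, `unfused`, and `fused` are hypothetical stand-ins, with each list comprehension playing the role of one kernel that reads its input and writes a full output buffer.

```python
def scale(xs, a):
    """Stand-in for one kernel: reads every element, writes a new buffer."""
    return [a * x for x in xs]


def add_bias(xs, b):
    """Second stand-in kernel: another full pass over the data."""
    return [x + b for x in xs]


def relu(xs):
    """Third stand-in kernel: yet another full pass."""
    return [max(x, 0.0) for x in xs]


def unfused(xs, a, b):
    # Three separate passes and two intermediate buffers.
    return relu(add_bias(scale(xs, a), b))


def fused(xs, a, b):
    # One pass over the data, no intermediate buffers.
    return [max(a * x + b, 0.0) for x in xs]


data = [-1.5, 0.0, 2.0]
unfused_out = unfused(data, 2.0, 1.0)
fused_out = fused(data, 2.0, 1.0)
print(unfused_out == fused_out, fused_out)
```

Both versions compute the same result, but the fused one touches the data only once. On a GPU, saving those extra trips through memory and the extra kernel launches is typically where the gain from fusion comes from.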

Meta's open-source framework can accelerate Stable Diffusion

Meta also delivers commonly used models out of the box with AITemplate, including Vision Transformer, BERT, Stable Diffusion, ResNet, and MaskRCNN. The generative image AI system Stable Diffusion (SD) runs about 2.4 times faster with AIT on an Nvidia GPU. In a test on an RTX 3080, AIT also made it possible to work around an out-of-memory error, even with high SD settings.

In practice, Meta's AIT can thus accelerate image generation and processing with Stable Diffusion or enable higher resolutions, for example. The integration of AIT into common solutions like Stable Diffusion WebUI is probably only a matter of time.

According to Meta, the release of AITemplate is also just the beginning of a long series of planned releases on the road to building a powerful AI inference engine. Further optimizations are planned, as well as expansion to other hardware systems such as Apple M-series GPUs and CPUs from other companies.

Meta's AITemplate is available on GitHub.

Summary
  • Meta releases the open-source Python framework AITemplate.
  • According to Meta, AITemplate accelerates AI inference by up to 12x on Nvidia GPUs and up to 4x on AMD GPUs.
  • To this end, AITemplate transforms AI models into high-performance C++ GPU template code.
  • Stable Diffusion also benefits from Meta's framework and runs about 2.4 times faster.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.