Meta unveils four generations of custom AI chips to cut inference costs for billions of users
Meta has unveiled four new generations of custom AI chips—MTIA 300, 400, 450, and 500—designed to make AI cheaper to run across its platforms.
The chips are being developed in partnership with Broadcom and are built to make AI applications more cost-effective for the billions of users on Meta's platforms. Meta says it is following a roughly six-month development cycle per chip generation. From MTIA 300 to MTIA 500, HBM memory bandwidth increases 4.5x and peak compute jumps 25x.
MTIA 300 is optimized for ranking and recommendation models (R&R) and is already in production, according to Meta. MTIA 400 is the first generation that Meta says can compete with leading commercial products on raw performance. A rack of 72 chips forms a single scale-up domain. MTIA 400 has completed lab testing and is currently being rolled out to data centers.
MTIA 450 and 500 target generative AI inference
MTIA 450 and 500 are specifically optimized for generative AI inference. MTIA 450 doubles the HBM bandwidth compared to MTIA 400, outperforming existing commercial products, according to Meta. The chips support low-precision data formats like MX4 and MX8, which cut the computing power needed for inference without significantly hurting model quality. MTIA 500 adds another 50 percent HBM bandwidth and up to 80 percent more HBM capacity. Both chips are scheduled for mass production in 2027.
| Metric | MTIA 300 | MTIA 400 | MTIA 450 | MTIA 500 |
|---|---|---|---|---|
| Workload Focus | R&R Training | General | GenAI Inference | GenAI Inference |
| Module TDP | 800 W | 1200 W | 1400 W | 1700 W |
| HBM Bandwidth | 6.1 TB/s | 9.2 TB/s | 18.4 TB/s | 27.6 TB/s |
| HBM Capacity | 216 GB | 288 GB | 288 GB | 384-512 GB |
| MX4 Performance | - | 12 PFLOPs | 21 PFLOPs | 30 PFLOPs |
| FP8/MX8 Performance | 1.2 PFLOPs | 6 PFLOPs | 7 PFLOPs | 10 PFLOPs |
| BF16 Performance | 0.6 PFLOPs | 3 PFLOPs | 3.5 PFLOPs | 5 PFLOPs |
| Scale-up domain size | 16 | 72 | 72 | 72 |
| Scale-up network (unidirectional bandwidth*) | 1 TB/s | 1.2 TB/s | 1.2 TB/s | 1.2 TB/s |
| Scale-out network (unidirectional bandwidth*) | 200 GB/s** | 100 GB/s | 100 GB/s | 100 GB/s |
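The efficiency gain from MX4 and MX8 comes from block-wise quantization: a group of values shares a single power-of-two scale, so each element can be stored in just a few bits. The following is a toy sketch of that idea in NumPy, with an assumed block size of 32; it illustrates the concept only and does not implement the actual OCP Microscaling spec or Meta's hardware formats.

```python
import numpy as np

def mx_quantize(x, block_size=32, bits=4):
    """Toy block-wise quantization in the spirit of MX formats:
    each block of elements shares one power-of-two scale, and the
    elements themselves are stored as small signed integers."""
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for signed 4-bit
    max_abs = np.abs(blocks).max(axis=1, keepdims=True)
    max_abs = np.where(max_abs == 0, 1.0, max_abs)
    # Shared power-of-two scale per block
    scales = 2.0 ** np.ceil(np.log2(max_abs / qmax))
    q = np.clip(np.round(blocks / scales), -qmax, qmax).astype(np.int8)
    return q, scales

def mx_dequantize(q, scales, n):
    """Reconstruct approximate float values from quantized blocks."""
    return (q.astype(np.float32) * scales).reshape(-1)[:n]

rng = np.random.default_rng(0)
x = rng.standard_normal(100).astype(np.float32)
q, scales = mx_quantize(x)
x_hat = mx_dequantize(q, scales, len(x))
```

Storing 4-bit integers plus one scale per 32-element block cuts memory traffic by roughly 4x versus FP16, which is why these formats matter for bandwidth-bound inference.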
On the software side, Meta built the chips around industry standards like PyTorch, vLLM, and Triton. Developers can port existing models to MTIA without special adaptations and run them on GPUs and MTIA at the same time. More technical details are available on Meta's blog.
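Because the stack is built on PyTorch, model code can stay device-agnostic and fall back between accelerators. The sketch below shows that pattern, assuming a PyTorch build that exposes an `mtia` backend (`torch.mtia` follows PyTorch's accelerator-backend conventions); the fallback order is illustrative.

```python
import torch
import torch.nn as nn

def pick_device():
    """Prefer MTIA if this PyTorch build exposes it, then CUDA,
    then CPU. The mtia checks are an assumption about the backend
    name; the rest is standard PyTorch device selection."""
    if hasattr(torch, "mtia") and torch.mtia.is_available():
        return torch.device("mtia")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
# The same model definition runs unchanged on any of the backends.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).to(device)
x = torch.randn(8, 16, device=device)
with torch.no_grad():
    y = model(x)
```

This is the portability claim in miniature: nothing in the model definition references a specific accelerator, so the same code path can serve GPU and MTIA fleets side by side.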
Meta also continues to work with AMD and Nvidia for GPUs. In early February 2026, Meta announced a billion-dollar deal with AMD to provide up to six gigawatts of AMD Instinct GPU computing power for Meta's AI workloads.