Meta has unveiled details of the next generation of its in-house AI chip, MTIA. According to Meta, the new chip is up to three times more powerful than its predecessor and is already being used in Meta's ad ranking and recommendation systems.
Meta has announced new details on the next generation of the Meta Training and Inference Accelerator (MTIA), the chip family Meta is developing specifically to run inference on its AI workloads.
According to the company, the new version of MTIA doubles the compute and memory bandwidth of the previous version, while maintaining a tight connection to Meta's workloads.
The architecture focuses on striking the right balance between compute, memory bandwidth, and memory capacity for Meta's ad ranking and recommendation models.
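Why that balance matters can be seen with a simple roofline calculation: a workload is compute-bound when its arithmetic intensity (FLOPs per byte moved) exceeds the chip's compute-to-bandwidth ratio, and bandwidth-bound otherwise. Recommendation models, with their large embedding tables, tend toward the bandwidth-bound side. The sketch below uses hypothetical hardware numbers purely for illustration; they are not MTIA's published specifications.

```python
# Hypothetical roofline check: is a kernel compute- or bandwidth-bound?
# Both hardware numbers are illustrative placeholders, not MTIA specs.
PEAK_TFLOPS = 350.0    # hypothetical peak compute, in TFLOPS
PEAK_BW_GBS = 1000.0   # hypothetical memory bandwidth, in GB/s

def bound_by(flops: float, bytes_moved: float) -> str:
    """Classify a kernel using the roofline model."""
    intensity = flops / bytes_moved                     # FLOPs per byte
    ridge = (PEAK_TFLOPS * 1e12) / (PEAK_BW_GBS * 1e9)  # break-even intensity
    return "compute-bound" if intensity > ridge else "bandwidth-bound"

# An embedding-table lookup does almost no math per byte fetched, so it is
# bandwidth-bound; a large matrix multiply is compute-bound.
print(bound_by(flops=2e6, bytes_moved=4e6))   # low intensity -> bandwidth-bound
print(bound_by(flops=2e12, bytes_moved=4e8))  # high intensity -> compute-bound
```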
In addition, several programs are underway to expand the scope of MTIA, including support for GenAI workloads.
Meta also optimized the software stack and developed Triton-MTIA, a low-level compiler backend that Meta says generates "high-performance code" for the MTIA hardware.
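Triton is an open-source, Python-embedded language for writing GPU-style kernels; a backend like Triton-MTIA takes such kernels and targets MTIA silicon instead. The minimal kernel below is generic Triton code for an element-wise addition, shown only to illustrate what such a compiler consumes; it uses no MTIA-specific API and, as written, runs on Triton's default GPU backend.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the final, partially filled block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```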
Meta says initial results show the new chip delivering three times the performance of the first-generation chip across four key models. Because Meta controls the entire stack, it can achieve greater efficiency than with commercial GPUs.
The new chip could reduce Meta's dependence on Nvidia graphics cards in some areas, but it cannot replace them: Nvidia's GPUs are used not only to run AI models but also to train them.
Meta CEO Mark Zuckerberg recently announced that the company will have 340,000 Nvidia H100 GPUs in use by the end of the year, with total compute equivalent to about 600,000 H100s once its other GPUs are counted. This makes Meta one of Nvidia's largest customers.
Meta isn't the only company investing in its own AI chips. Google recently unveiled a new version of its TPU, the TPU v5p, which offers more than twice the FLOPS and three times the high-bandwidth memory of the previous generation. Google positions the chip as a general-purpose AI processor that supports training, fine-tuning, and inference.