Microsoft has introduced two new AI models developed entirely in-house: the MAI-Voice-1 speech model and the MAI-1-preview text model. The announcement signals a move toward greater independence from OpenAI, whose technology has been central to Microsoft's Copilot products.

MAI-Voice-1 is designed for efficiency and expressiveness. Microsoft reports that it can generate a minute of audio in under a second on a single GPU, which the company says makes it one of the fastest speech systems available. The model is already used in Copilot Daily and podcasts, and is available for testing in Copilot Labs.

MAI-1-preview is Microsoft's first real foundation model

The second model, MAI-1-preview, is Microsoft's first foundation model. It was trained on roughly 15,000 Nvidia H100 GPUs, significantly fewer than the more than 100,000 GPUs used for models like xAI's Grok. Microsoft aimed to create a model that could deliver strong performance with fewer resources. Prior to this, the company had only developed smaller language models in the Phi series.

Mustafa Suleyman, head of Microsoft AI (MAI), has pointed out that MAI-1-preview performs beyond what its size would suggest. He has argued that success with these models depends not just on raw compute power, but also on careful data selection and efficient use of resources. According to Suleyman, it is important to avoid wasting computing power on data that does not contribute to the model's learning.

MAI-1-preview is available for public testing on the LMArena platform and is being gradually integrated into Copilot features. The model currently ranks 13th on LMArena, though Microsoft has not published detailed benchmarks.

Developers can apply for early API access.

Building its own infrastructure and vision

Microsoft says that developing its own models is part of a long-term strategy, with a five-year roadmap and continued investment. The company is using a new compute cluster based on Nvidia's GB200 chips, and plans to build specialized models for different use cases, integrating its AI platform into Windows, Office, and Azure.

A major focus is on shaping model behavior after training. Microsoft is working to remove traits that could make the AI appear as if it has emotions or intentions. Suleyman has previously warned about the risks of seemingly sentient AI that imitates human behavior, suggesting that now is the time to address these concerns.

Changing dynamics with OpenAI?

With up to $13 billion invested and multiple exclusive agreements, Microsoft remains OpenAI's largest backer. However, the companies are now engaged in tough negotiations over OpenAI's planned restructuring. Microsoft’s push to develop its own models could be seen as a signal to OpenAI, but company leadership maintains that the goal is to strengthen the partnership and ensure productive collaboration for the future.

Summary
  • Microsoft has introduced two new AI models developed internally: MAI-Voice-1, a fast and expressive speech system, and MAI-1-preview, its first large-scale foundation model for text, both aimed at reducing reliance on OpenAI technology.
  • MAI-1-preview was trained with significantly fewer GPUs than competing models like xAI's Grok, and Microsoft claims it performs above expectations for its size by focusing on efficient data use and careful training; it is now available for testing and ranks 13th on the LMArena leaderboard.
  • The company is building dedicated infrastructure and plans to integrate its AI into Windows, Office, and Azure, while also emphasizing the importance of shaping model behavior to avoid the appearance of emotions or intentions, amid ongoing complex negotiations with OpenAI over future collaboration.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.