Microsoft has introduced two new AI models developed entirely in-house: the MAI-Voice-1 speech model and the MAI-1-preview text model. The announcement signals a move toward greater independence from OpenAI, whose technology has been central to Microsoft's Copilot products.
MAI-Voice-1 is designed for efficiency and expressiveness. Microsoft reports that it can generate a minute of audio in under a second using a single GPU, making it one of the fastest speech systems available. The model is already being used in Copilot Daily and podcasts, and is available for testing in Copilot Labs.
MAI-1-preview is Microsofts first real foundation model
The second model, MAI-1-preview, is Microsoft's first foundation model. It was trained on roughly 15,000 NVIDIA H100 GPUs, which is significantly fewer than the over 100,000 GPUs used for models like xAI's Grok. Microsoft aimed to create a model that could deliver strong performance with fewer resources. Prior to this, the company had only developed smaller language models in the Phi series.
Mustafa Suleyman, head of Microsoft AI (MAI), has pointed out that MAI-1-preview performs beyond what its size would suggest. He has argued that success with these models depends not just on raw compute power, but also on careful data selection and efficient use of resources. According to Suleyman, it is important to avoid wasting computing power on data that does not contribute to the model's learning.
MAI-1-preview is currently available for public testing on the LMArena platform and is being gradually integrated into Copilot features. The model currently ranks 13th on LMArena, though Microsoft has not published detailed benchmarks.
Developers interested in early access can apply for API access.
Building its own infrastructure and vision
Microsoft says that developing its own models is part of a long-term strategy, with a five-year roadmap and continued investment. The company is using a new compute cluster based on Nvidia's GB200 chips, and plans to build specialized models for different use cases, integrating its AI platform into Windows, Office, and Azure.
A major focus is on shaping model behavior after training. Microsoft is working to remove traits that could make the AI appear as if it has emotions or intentions. Suleyman has previously warned about the risks of seemingly sentient AI that imitates human behavior, suggesting that now is the time to address these concerns.
Changing dynamics with OpenAI?
With up to $13 billion invested and multiple exclusive agreements, Microsoft remains OpenAI's largest backer. However, the companies are now engaged in tough negotiations over OpenAI's planned restructuring. Microsoft’s push to develop its own models could be seen as a signal to OpenAI, but company leadership maintains that the goal is to strengthen the partnership and ensure productive collaboration for the future.