Microsoft wants to reduce the cost of generative AI and is researching more efficient AI models. The latest model, Phi-1.5, can now analyze images.
Previously, Microsoft researchers developed Phi-1, a small language model that can perform high-level coding tasks after being trained on relatively small amounts of high-quality data.
The model, which significantly outperformed larger models in benchmarks, was trained on "textbook quality" data. This underscores the importance of data quality in AI training.
Phi-1.5 can analyze images, yet it's still small
Now, Microsoft researchers have added image analysis to Phi. They call the multimodal model Phi-1.5; it has received additional training on synthetic data. The additional training and the new feature are said to have increased the size of the model only slightly.
Sebastien Bubeck, leader of the Machine Learning Foundations group at Microsoft Research, cites the image analysis in OpenAI's GPT-4 as the inspiration. The research question, he says, was whether this capability is reserved for giant AI models or whether it can be integrated into a tiny model like Phi-1.5. "And, to our amazement, yes, we can do it," Bubeck tells Semafor.
GPT-4 is said to have about 1.7 trillion parameters spread across multiple interconnected neural networks; Phi-1 has 1.3 billion parameters, according to the paper, which makes it a small fraction of the size of GPT-4. However, the model also has significantly fewer capabilities: it specializes in, for example, coding tasks in Python, rather than serving as a general language model like GPT-4.
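To put the two figures side by side, here is a back-of-the-envelope calculation. It is only a sketch: the GPT-4 number is the unconfirmed estimate quoted above, and the memory figure assumes 16-bit weights, which is an assumption made for illustration and not something stated in the paper or by Microsoft.

```python
# Figures quoted in the article; the GPT-4 count is a rumor, not an official spec.
GPT4_PARAMS = 1.7e12  # "about 1.7 trillion" parameters (unconfirmed)
PHI_PARAMS = 1.3e9    # 1.3 billion parameters, per the Phi paper


def size_ratio(large: float, small: float) -> float:
    """Return how many times larger the big model is, by parameter count."""
    return large / small


def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an assumption of the sketch.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    ratio = size_ratio(GPT4_PARAMS, PHI_PARAMS)
    print(f"GPT-4 / Phi parameter ratio: roughly {ratio:,.0f}x")
    print(f"Phi weights at 16 bits: roughly {weight_memory_gb(PHI_PARAMS):.1f} GB")
```

By these numbers, Phi would be on the order of a thousandth of GPT-4's rumored size, small enough that its weights alone would fit in a few gigabytes of memory under the 16-bit assumption.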
The high cost of high-quality generative AI
Microsoft's update to Phi-1.5 fits with reports that the cost of generative AI is high and companies are looking for more efficient models than OpenAI's GPT-4.
Microsoft research chief Peter Lee is said to have tasked many of the company's 1,500 researchers with developing smaller and less expensive conversational AI models. Phi is cited as an example of the efficiency gains that are needed.
According to Microsoft AI researcher Ahmed Awadallah, small and large AI models could work together in the future, with the small model acting as an agent and handing off a task to the larger model when it is not confident enough in its own answer. Microsoft is already following this principle with Bing Chat in "balanced" mode.
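The handoff idea can be illustrated with a minimal sketch. This is not Microsoft's implementation, and how Bing Chat decides between models is not described here; the `Cascade` class, the confidence scores, the 0.8 threshold, and the two toy models are all hypothetical stand-ins invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# In this sketch a "model" is any callable that returns
# (answer text, self-reported confidence between 0 and 1).
Model = Callable[[str], Tuple[str, float]]


@dataclass
class Cascade:
    small: Model
    large: Model
    threshold: float = 0.8  # hypothetical cutoff, chosen only for illustration

    def answer(self, prompt: str) -> Tuple[str, str]:
        """Return (answer, name of the model that produced it)."""
        text, confidence = self.small(prompt)
        if confidence >= self.threshold:
            return text, "small"
        # The small model is not confident enough: hand off to the large one.
        text, _ = self.large(prompt)
        return text, "large"


def toy_small(prompt: str) -> Tuple[str, float]:
    # Toy rule: pretend the small model is only confident on short prompts.
    confidence = 0.9 if len(prompt.split()) <= 5 else 0.3
    return f"small-model answer to: {prompt}", confidence


def toy_large(prompt: str) -> Tuple[str, float]:
    return f"large-model answer to: {prompt}", 0.95


cascade = Cascade(small=toy_small, large=toy_large)

if __name__ == "__main__":
    for prompt in ("What is 2 + 2?",
                   "Explain in detail how multimodal training changes a small model"):
        text, used = cascade.answer(prompt)
        print(f"[{used}] {text}")
```

The cost argument follows from the structure: the expensive model is only called for the share of requests the cheap model cannot handle, so the savings depend on how often the small model clears the threshold and on how well its confidence reflects actual answer quality.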
These lower-cost systems are to be integrated into Microsoft's software, which can generate millions of requests per day and would otherwise incur particularly high costs.
Microsoft Phi is available as open source.