Phi 1.5: Microsoft improves mini language model "Phi" with image analysis

Microsoft wants to reduce the cost of generative AI and is researching more efficient AI models. The latest model, Phi 1.5, can now analyze images.

Previously, Microsoft researchers developed Phi-1, a small language model that learned to handle demanding coding tasks from a relatively small amount of high-quality training data.

The model, which significantly outperformed larger models in benchmarks, was trained on "textbook quality" data. This underscores the importance of data quality in AI training.

Phi 1.5 can analyze images, yet it's still small

Now, Microsoft researchers have added the ability to analyze images to Phi. They call the multimodal model Phi 1.5, which has been additionally trained with synthetic data. The additional training and the new feature are said to have increased the size of the model only slightly.

Sebastien Bubeck, who leads the Machine Learning Foundations group at Microsoft Research, cites the image analysis in OpenAI's GPT-4 as the inspiration. The research question, he says, was whether this capability is reserved for giant AI models or whether it could be integrated into a tiny model like Phi 1.5. "And, to our amazement, yes, we can do it," Bubeck tells Semafor.

GPT-4 is said to have about 1.7 trillion parameters spread across multiple interconnected neural networks; Phi-1 has 1.3 billion parameters, according to the paper, so it is a fraction of the size of GPT-4. However, the model also has significantly fewer capabilities: it specializes in Python coding tasks, for example, rather than serving as a general-purpose language model like GPT-4.
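To put "a fraction of the size" into numbers, here is a quick back-of-the-envelope calculation. It takes both figures from the paragraph above at face value; the GPT-4 figure is a reported estimate that OpenAI has not confirmed, so the result is only a rough order of magnitude. The variable names are illustrative.

```python
# Parameter counts as quoted in the article.
# The GPT-4 number is an unconfirmed estimate, not an official figure.
gpt4_params = 1_700_000_000_000  # about 1.7 trillion (reported)
phi1_params = 1_300_000_000      # 1.3 billion (per the Phi-1 paper)

# How many times larger GPT-4 would be under these assumptions.
ratio = gpt4_params / phi1_params

# Phi-1's size as a percentage of GPT-4's.
share_percent = 100 * phi1_params / gpt4_params

print(f"GPT-4 would be roughly {ratio:,.0f} times larger than Phi-1")
print(f"Phi-1 is about {share_percent:.3f} percent of GPT-4's size")
```

Under these assumptions, GPT-4 would be roughly 1,300 times larger than Phi-1.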

The high cost of high-quality generative AI

Microsoft's update to Phi 1.5 fits with reports that generative AI is expensive to run and that companies are looking for more efficient alternatives to OpenAI's GPT-4.

Microsoft research chief Peter Lee is said to have tasked many of the company's 1,500 researchers with developing smaller and less expensive chat AI models. Phi is cited as an example of the efficiency gains that are needed.

According to Microsoft AI researcher Ahmed Awadallah, small and large AI models could work together in the future, with the small model acting as an agent and handing off tasks to the larger model when it is not confident enough. Microsoft is already following this principle with Bing Chat in "balanced" mode.
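The hand-off idea can be sketched in a few lines of code. The following is a minimal illustration and not Microsoft's implementation: the function names, the confidence score and the 0.7 threshold are all hypothetical, and both "models" are toy stand-ins so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    source: str  # "small" or "large": which model produced the answer


def route(
    prompt: str,
    small_model: Callable[[str], Tuple[str, float]],
    large_model: Callable[[str], str],
    threshold: float = 0.7,  # hypothetical cut-off, chosen for illustration
) -> Answer:
    """Ask the small model first; escalate if its confidence is too low."""
    text, confidence = small_model(prompt)
    if confidence >= threshold:
        return Answer(text, "small")
    # The small model is not confident enough: hand the task to the large one.
    return Answer(large_model(prompt), "large")


# Toy stand-ins for real models (assumptions, purely for demonstration).
def toy_small_model(prompt: str) -> Tuple[str, float]:
    # Pretend that short prompts are easy and long prompts are hard.
    confidence = 0.9 if len(prompt.split()) <= 5 else 0.3
    return f"small model answer to: {prompt}", confidence


def toy_large_model(prompt: str) -> str:
    return f"large model answer to: {prompt}"


easy = route("What is 2 + 2?", toy_small_model, toy_large_model)
hard = route(
    "Summarize the main arguments of this long legal contract for me",
    toy_small_model,
    toy_large_model,
)
print(easy.source, hard.source)
```

In a setup like this, the expensive model only runs for requests the cheap model cannot handle, which is where the cost savings would come from. How a real system estimates confidence is a separate question that this sketch does not address.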

These lower-cost systems will be integrated into Microsoft's software, which can generate millions of requests per day and would otherwise incur particularly high costs.

Microsoft Phi is available as open source.
