Mistral AI has added web search and image generation to its Le Chat AI assistant and introduced Pixtral Large, a new multimodal model that the company says performs well on visual benchmarks.
Le Chat users can now access current web content through integrated web search and create images using Black Forest Labs' Flux Pro model. In addition, the assistant processes documents and images using Mistral's new Pixtral Large model.
The company also added a canvas interface that allows users to edit generated content directly in the chat window. Users can write documents, create presentations, and edit code without generating new responses.
With the integration of Pixtral Large, Le Chat can now analyze complex PDF documents, including graphics, tables, diagrams, and formulas. These new features are initially being rolled out as a free beta on the startup's "Le Chat" platform.
Pixtral Large shows competitive performance in visual tasks
The new Pixtral Large model, built on Mistral Large 2, posts strong results on visual benchmarks. According to the company, it scored 69.4 percent on MathVista, a test of mathematical reasoning over visual data, outperforming both GPT-4o and Gemini 1.5 Pro.
Mistral says Pixtral Large also outperforms Claude 3.5 Sonnet, Gemini 1.5 Pro, and GPT-4o in analyzing diagrams and documents (ChartQA and DocVQA) and in real-world use cases (MM-MT-Bench).
The model combines a 123-billion-parameter multimodal decoder with a one-billion-parameter vision encoder. With its 128K-token context window, it can process up to 30 high-resolution images at once.
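To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It adds up the two parameter counts and estimates how many context tokens each image could occupy if 30 images shared the window evenly. Two points are assumptions for illustration and do not come from Mistral: that "128K" means 128 × 1,024 tokens, and that the images split the window equally. The function names are likewise hypothetical.

```python
# Figures quoted in the article.
DECODER_PARAMS = 123_000_000_000   # multimodal decoder
ENCODER_PARAMS = 1_000_000_000     # vision encoder
MAX_IMAGES = 30                    # images processed at once

# Assumption: "128K" is read as 128 * 1024 tokens.
CONTEXT_TOKENS = 128 * 1024


def total_params() -> int:
    """Combined parameter count of decoder and vision encoder."""
    return DECODER_PARAMS + ENCODER_PARAMS


def tokens_per_image(num_images: int = MAX_IMAGES, text_tokens: int = 0) -> int:
    """Even share of the context window per image (an assumed, simplified split).

    text_tokens is the part of the window reserved for the prompt text.
    """
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    if not 0 <= text_tokens <= CONTEXT_TOKENS:
        raise ValueError("text_tokens must fit in the context window")
    return (CONTEXT_TOKENS - text_tokens) // num_images


if __name__ == "__main__":
    print(f"Total parameters: {total_params():,}")
    print(f"Tokens per image, 30 images, no text: {tokens_per_image()}")
    print(f"Tokens per image, 30 images, 2,000 text tokens: {tokens_per_image(30, 2000)}")
```

Under these assumptions the model totals 124 billion parameters, and each of 30 images would get a budget of roughly 4,300 tokens. The real per-image cost depends on image resolution and on how the vision encoder tokenizes images, which the article does not specify.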
In addition to Le Chat, Mistral AI offers Pixtral Large under two licenses on Hugging Face: a research license for academic use and a commercial license for business applications.
The company is also updating its Mistral Large language model with improved long-context understanding and more precise function calling. The updated model is available through Mistral's API and will soon come to Google Cloud and Microsoft Azure.