Mistral's new flagship Medium 3.5 folds chat, reasoning, and code into one model
Key Points
- Mistral has released Medium 3.5, a 128-billion-parameter AI model that handles chat, reasoning, and coding tasks using a dense architecture, along with a toggleable reasoning feature for more complex queries.
- The company's developer tool Vibe now includes asynchronous cloud agents that can independently handle routine tasks like bug fixes, running in isolated sandboxes with integrations for services such as GitHub and Slack.
- Mistral's AI assistant Le Chat introduces a "work mode" for multi-step workflows, connecting directly to emails and calendars through built-in connectors while requiring explicit user approval before carrying out any sensitive actions.
Mistral's new flagship, Mistral Medium 3.5, merges what used to be separate models for chat, reasoning, and code into a single product. The French company is also adding asynchronous cloud agents to its coding tool Vibe and giving Le Chat a new agent mode.
Per the model card, Mistral Medium 3.5 is a dense model with 128 billion parameters and a 256,000-token context window. "Dense" means all 128 billion parameters get loaded and activated for every token generated. That makes inference expensive, but it's also simpler to run and tends to hold up better in production.
Mistral knows there are cheaper approaches. Mistral Large 3 uses a Mixture of Experts (MoE) setup with 675 billion total parameters but only activates 41 billion per token. Mistral Small 4 has 119 billion parameters and activates just 6 billion. Competitors like Deepseek and Qwen have been moving their top models toward MoE for a while, since it delivers cheaper inference at similar quality.
Against that backdrop, building the new flagship as a pure dense model is a conservative call: less optimized for inference cost, but easier to ship as one unified model for chat, reasoning, code, and agents.
Mistral says the model can be self-hosted on four GPUs. In practice, that's likely still out of reach for most users outside well-equipped data centers.
Reasoning becomes a toggle, new vision encoder built from scratch
The model follows the industry shift away from separate reasoning models, adding reasoning as a parameter on each query instead. A reasoning_effort setting switches between quick replies and a heavier mode for complex agent tasks. Mistral also retrained the vision encoder from scratch to handle variable image sizes and aspect ratios.

In Mistral's own benchmarks, Medium 3.5 scored 77.6 percent on SWE-Bench Verified and 91.4 percent on T3-Telecom. Mistral says the model replaces Medium 3.1 and the Magistral reasoning model in Le Chat, plus Devstral 2 in the Vibe CLI.
Modified MIT replaces Apache 2.0
The weights are available for download on Hugging Face, but not under the Apache 2.0 license Mistral has used before. The company switched to a "Modified MIT License" that allows commercial and non-commercial use but carves out exceptions for high-revenue companies. That's a break from models like Mistral Large 3 and Small 4, which ship under Apache 2.0.
Through the API, Medium 3.5 costs $1.50 per million input tokens and $7.50 per million output tokens.
Coding agents move out of the notebook
The second announcement may matter more to developers than the model itself. Mistral's coding tool Vibe is getting remote agents that run in the cloud, several at once, without a developer watching over them. Local sessions can move to the cloud along with their history, task state, and approvals.
Each agent runs in an isolated sandbox and can open a pull request when it's done. Vibe connects to GitHub, Linear, Jira, Sentry, Slack, and Teams. Mistral points to routine work like module refactors, test generation, dependency upgrades, and bug fixes as the main use cases.
The cloud version is built on workflows from Mistral Studio, which the company originally developed internally and for enterprise customers. The idea isn't new. OpenAI, Anthropic, and Cursor already offer similar setups.
Work Mode in Le Chat turns connectors on by default
Mistral is also adding a Work Mode to Le Chat, which runs on Medium 3.5. It's built for multi-step tasks across multiple tools, like processing emails, messages, or calendar entries, or running structured searches.
In Work Mode, connectors to mailboxes, calendars, documents, and other systems are on by default. That makes complex workflows easier to set up but puts more responsibility for data flows on the user. Le Chat asks for explicit confirmation before sensitive actions like sending a message or writing to external systems. Work Mode is available on the Pro, Team, and Enterprise plans.
AI News Without the Hype – Curated by Humans
Subscribe to THE DECODER for ad-free reading, a weekly AI newsletter, our exclusive "AI Radar" frontier report six times a year, full archive access, and access to our comment section.
Subscribe now