AI research institute Allen AI has released SERA, a family of open-source coding agents designed for easy adaptation to private code bases. The top model, SERA-32B, solves up to 54.2 percent of problems in the SWE-Bench-Test Verified coding benchmark (64K context), outperforming comparable open-source models.
SERA outperforms comparable open-source coding agents on the SWE-Bench-Test Verified benchmark with 32K context. | Image: Allen AI
According to AI2, training takes just 40 GPU days and costs between $400 (to match previous open-source results) and $12,000 (for performance on par with leading industry models). This makes training on proprietary code data realistic even for small teams. SERA uses a simplified training method called "Soft-verified Generation" that doesn't require perfectly correct code examples. Technical details can be found in the blog post.
The models work with Claude Code and can be launched with just two lines of code, according to Allen AI. All models, code, and instructions are available on Hugging Face under the Apache 2.0 license.
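The article doesn't spell out which two lines those are. As a rough sketch only, assuming the checkpoints are published as standard Hugging Face repositories (the repo id "allenai/SERA-32B" below is a guess, not confirmed by Allen AI), loading one with the transformers library could look like this:

```python
# Hypothetical sketch, not the official launch commands from Allen AI's instructions.
# Assumes the SERA-32B checkpoint is a standard Hugging Face repo (repo id is a guess)
# that can be loaded with the transformers library.
from transformers import pipeline

coder = pipeline("text-generation", model="allenai/SERA-32B", device_map="auto")
print(coder("Write a Python function that reverses a linked list.")[0]["generated_text"])
```

The actual launch commands and the Claude Code integration are covered in the instructions Allen AI published alongside the models on Hugging Face.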
Mistral AI has unveiled Mistral Vibe 2.0, an upgrade to its terminal-based coding agent powered by the Devstral 2 model. The tool enables developers to control code using natural language, orchestrate multiple files simultaneously, and leverage full codebase context.
New in version 2.0 are custom subagents for specific tasks like testing or code reviews, the ability to ask clarifying questions when instructions are ambiguous rather than making decisions automatically, and slash commands for preconfigured workflows.
Mistral Vibe is available through Le Chat Pro ($14.99/month) and Team plans ($24.99/seat). Devstral 2 moves to paid API access – free usage remains available for testing on the Experiment plan. For enterprises, Mistral additionally offers fine-tuning, reinforcement learning, and code modernization services.
Former Tesla AI chief Andrej Karpathy now codes "mostly in English" just three months after calling AI agents useless
Just last October, Andrej Karpathy dismissed AI agents: “They just don’t work.” Now he says 80 percent of his coding is agent-based and calls it the “biggest change to my basic coding workflow in ~2 decades.” A typically measured voice is joining the agent coding hype, but with some warnings attached.
OpenAI is charging around $60 per 1,000 impressions for its initial ChatGPT ads, far above typical online advertising rates, which sit in the low single digits per 1,000 impressions, and closer to what advertisers pay for premium TV spots like NFL games, according to The Information. The ads show up below ChatGPT responses in the free and lower-cost "Go" tiers.
Microsoft has unveiled its new AI inference chip, Maia 200. Built specifically for inference workloads, the chip delivers 30 percent better performance per dollar than current-generation chips in Microsoft's data centers, the company claims. It's manufactured using TSMC's 3-nanometer process, packs over 140 billion transistors, and features 216 GB of high-speed memory.
According to Microsoft, the Maia 200 is now the most powerful in-house chip among major cloud providers. The company claims it delivers three times the FP4 performance of Amazon's Trainium 3 while also outperforming Google's TPU v7 in FP8 calculations—though independent benchmarks have yet to verify these figures.
Microsoft's comparison shows the Maia 200 outperforming Amazon's Trainium 3 and Google's TPU v7 across key specifications. | Image: Microsoft
Microsoft says the chip already powers OpenAI's GPT 5.2 models and Microsoft 365 Copilot. Developers interested in trying it out can sign up for a preview of the Maia SDK. The Maia 200 is currently available in Microsoft's Iowa data center, with Arizona coming next. More technical details about the chip are available here.
Emergency meetings and failed billion-dollar talks reveal the chaos behind Apple's pivot to Google Gemini
Internal crisis meetings, a leader who cried “bullshit” and convinced no one, and billion-dollar negotiations that fell apart: Bloomberg reveals the backstory behind Apple’s decision to partner with Google.