Essential AI's new open-source model, Rnj-1, outperforms significantly larger competitors on the "SWE-bench Verified" test. This benchmark is considered particularly challenging because it evaluates an AI's ability to independently solve real-world programming problems. Despite being a compact model with just eight billion parameters, Rnj-1 scores 20.8 points.

By comparison, similarly sized models like Qwen 3 (8B, without reasoning) reach only 4.5 points in Essential AI's testing. The model was introduced by Ashish Vaswani, co-founder of Essential AI and co-author of the landmark "Attention Is All You Need" paper that introduced the Transformer architecture. Rnj-1 is also Transformer-based, built specifically on the Gemma 3 architecture. According to the company, development focused primarily on better pre-training rather than post-training methods like reinforcement learning. These improvements also lower pre-training compute costs, thanks in part to the use of the Muon optimizer.

Demis Hassabis, CEO of Google DeepMind, expects the next year to bring major progress in multimodal models, interactive video worlds, and more reliable AI agents. Speaking at the Axios AI+ Summit, Hassabis noted that Gemini's multimodal capabilities are already powering new applications. He used a scene from "Fight Club" to illustrate the point: instead of just describing the action, the AI interpreted a character removing a ring as a philosophical symbol of renouncing everyday life. Google's latest image model uses similar capabilities to precisely understand visual content, allowing it to generate complex outputs like infographics, something that wasn't previously possible.

Hassabis says AI agents will be "close" to handling complex tasks autonomously within a year, aligning with the timeline he predicted in May 2024. The goal is a universal assistant that works across devices to manage daily life. DeepMind is also developing "world models" like Genie 3, which generate interactive, explorable video spaces.