Nvidia's DreamDojo is an open source world model for robot training
Nvidia wants to move robot training out of the real world and into an AI world model. DreamDojo generates simulated futures from video data, no 3D engine required.
AI agents are supposed to revolutionize how we work. But Anthropic’s own data tells a different story: so far, that revolution is almost entirely limited to software engineering. And even there, users aren’t letting agents work nearly as autonomously as the technology would allow.
Google's Gemini 3.1 Pro Preview leads the Artificial Analysis Intelligence Index, which rolls ten benchmarks into one overall score, finishing four points ahead of Anthropic's Claude Opus 4.6 at less than half the cost. The model ranks first in six of ten categories, including agent-based coding, knowledge, scientific reasoning, and physics. Its hallucination rate dropped 38 percentage points compared to Gemini 3 Pro, which struggled in that area.

Running the full index test with Gemini costs $892, compared to $2,304 for GPT-5.2 and $2,486 for Claude Opus 4.6. Gemini used just 57 million tokens, well under GPT-5.2's 130 million. Open-source models like GLM-5 come in even cheaper at $547. When it comes to real-world agent tasks, though, Gemini 3.1 Pro still falls behind Claude Sonnet 4.6, Opus 4.6, and GPT-5.2.
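A quick back-of-the-envelope calculation shows where the gap actually comes from: the blended per-token rates implied by these totals are fairly close, so most of Gemini's cost advantage stems from using far fewer tokens, not from cheaper ones. A minimal sketch using only the totals quoted above (the per-million figures are derived here, not reported by Artificial Analysis):

```python
# Rough blended cost per million tokens implied by the figures above.
# Illustrative arithmetic only; the index bills each benchmark at the
# provider's own per-token prices, so this is not an official metric.
runs = {
    "Gemini 3.1 Pro": (892, 57_000_000),
    "GPT-5.2": (2_304, 130_000_000),
}

for model, (cost_usd, tokens) in runs.items():
    per_million = cost_usd / (tokens / 1_000_000)
    print(f"{model}: ${cost_usd:,} total, ~${per_million:.2f} per million tokens")
# Gemini 3.1 Pro: $892 total, ~$15.65 per million tokens
# GPT-5.2: $2,304 total, ~$17.72 per million tokens
```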
As always, benchmarks only go so far. In our own internal fact-checking test, 3.1 Pro performed significantly worse than Opus 4.6 or GPT-5.2, verifying only about a quarter of statements in initial runs, fewer even than Gemini 3 Pro, which was already weak here. So find your own benchmarks.
Sam Altman says AGI is “pretty close” and superintelligence “not that far off.” Speaking at the Express Adda event in India, the OpenAI CEO suggested the company’s internal models are already accelerating its own research and that “the world is not prepared” for what’s coming.
Anthropic is rolling out new desktop features for Claude Code that take development automation a step further. The AI can now spin up development servers, display running web apps right in the interface, spot errors, and fix them on its own.
There's also a new code review feature that checks changes and drops comments directly in the diff view. For GitHub projects, Claude keeps an eye on pull requests in the background, automatically fixes CI errors, and can even merge PRs on its own once tests pass. That means developers can move on to new tasks while Claude Code works through open PRs behind the scenes. Sessions pick up seamlessly across CLI, desktop, web, and mobile. All updates are available now.
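Anthropic hasn't said how the background PR watcher is built, but the core loop it automates, polling a pull request's checks and merging once everything is green, can be sketched against GitHub's public REST API. A minimal sketch, assuming a token in GITHUB_TOKEN; the owner, repo, and PR number are placeholders, and the failure branch is where Claude Code's automatic CI fixes would slot in:

```python
import os
import time
import requests

# Minimal sketch of the "watch CI, merge when green" pattern that Claude
# Code now automates for GitHub projects. This is NOT Anthropic's
# implementation, just the underlying GitHub REST calls; OWNER, REPO,
# and PR_NUMBER are placeholders.
OWNER, REPO, PR_NUMBER = "example-org", "example-repo", 42
API = f"https://api.github.com/repos/{OWNER}/{REPO}"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

def ci_conclusions(sha: str) -> list[str]:
    """Conclusions of all check runs (e.g. GitHub Actions) on a commit."""
    r = requests.get(f"{API}/commits/{sha}/check-runs", headers=HEADERS)
    r.raise_for_status()
    # "conclusion" is None while a check is still running.
    return [run["conclusion"] for run in r.json()["check_runs"]]

pr = requests.get(f"{API}/pulls/{PR_NUMBER}", headers=HEADERS).json()
head_sha = pr["head"]["sha"]

# Poll until every check run on the PR's head commit has finished.
while None in (results := ci_conclusions(head_sha)):
    time.sleep(30)

if results and all(c == "success" for c in results):
    r = requests.put(f"{API}/pulls/{PR_NUMBER}/merge",
                     headers=HEADERS, json={"merge_method": "squash"})
    r.raise_for_status()
    print("All checks green, PR merged.")
else:
    # This is the branch where an agent would read the failing logs,
    # push a fix, and wait for CI to rerun.
    print("CI failed:", results)
```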