Claude Code sessions now accessible from any device

Claude Code users can now continue a locally running programming session from their smartphone, tablet, or browser. The session keeps running on the user's own machine; no data moves to the cloud. Local files, servers, and project configurations all remain accessible. Users connect through claude.ai/code or the Claude app for iOS and Android and can switch seamlessly between terminal, browser, and phone. If the network drops, the session automatically reconnects, though it ends after roughly ten minutes offline.
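The reconnect-then-give-up behavior described above can be sketched generically. This is not Anthropic's implementation; the backoff schedule, the `connect` callback, and the injectable clock are all assumptions made for illustration:

```python
import time

OFFLINE_LIMIT_S = 10 * 60  # per the article: the session ends after ~10 minutes offline

def reconnect_with_timeout(connect, now=time.monotonic, sleep=time.sleep,
                           base_delay=1.0, max_delay=30.0):
    """Retry `connect` with exponential backoff, giving up once the
    connection has been down longer than OFFLINE_LIMIT_S."""
    went_offline = now()
    delay = base_delay
    while now() - went_offline < OFFLINE_LIMIT_S:
        if connect():
            return True                    # session resumed
        sleep(delay)
        delay = min(delay * 2, max_delay)  # back off, capped at max_delay
    return False                           # offline too long: session ends

# Simulated run with a fake clock: the connection comes back on the third try.
attempts = iter([False, False, True])
clock = [0.0]
resumed = reconnect_with_timeout(lambda: next(attempts),
                                 now=lambda: clock[0],
                                 sleep=lambda d: clock.__setitem__(0, clock[0] + d))
```

Injecting `now` and `sleep` keeps the timeout logic testable without waiting ten real minutes.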

The feature is initially available as a research preview for Max subscribers, with Pro users next in line. Unlike Claude Code on the web, which has been running tasks in Anthropic's cloud environments since last year, remote control sessions run entirely on the user's own computer.

Anthropic is aggressively building out Claude Code, adding automated code reviews and GitHub integrations. The company is also raising $10 billion at a $350 billion valuation. Claude Code creator Boris Cherny says the new Claude Cowork tool was built almost entirely with Claude Code itself.

Claude can now jump between Excel and PowerPoint on its own

Anthropic now lets Claude switch independently between Excel and PowerPoint, for example, running an analysis and then building a presentation directly from the results. The company is also expanding Cowork for enterprise customers with private plugin marketplaces, letting admins curate and distribute plugin collections to specific teams. New templates cover HR, design, engineering, finance, asset management, and more.

In finance, new MCP interfaces for FactSet and MSCI provide real-time market data and index analysis; S&P Global (Capital IQ Pro) and LSEG have contributed their own plugins.

New third-party integrations include Google Workspace, DocuSign, Salesforce, Slack, and FactSet. Admins gain finer user-access controls plus OpenTelemetry support for cost and usage monitoring. The Excel-PowerPoint feature is available as a research preview on all paid plans. Cowork is Anthropic's desktop tool for agent-based office work; plugins were added in late January but have known security vulnerabilities.


Deepmind suggests AI should occasionally assign humans busywork so we do not forget how to do our jobs

AI systems should sometimes assign humans tasks the AI could easily handle itself, just so people don't forget how to do their jobs. That's one of the more striking recommendations from a new Google Deepmind paper on how AI agents should delegate work.

OpenAI ships API upgrades targeting voice reliability and agent speed for developers

OpenAI has shipped two API updates for developers. The new gpt-realtime-1.5 model for the Realtime API is designed to make voice interactions more reliable: in internal testing, OpenAI saw roughly a ten percent improvement in transcribing numbers and letters, a five percent gain on logical audio tasks, and seven percent better instruction following. The companion audio model has also been updated to version 1.5.

The Responses API now supports WebSockets as well. Instead of retransmitting the full context with every request, a persistent connection stays open and only new data is sent as it arrives. According to OpenAI, this speeds up complex AI agents with many tool calls by 20 to 40 percent.
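The bandwidth difference is easy to see with a back-of-the-envelope sketch. This is a generic simulation, not OpenAI's protocol, and the per-turn sizes are invented:

```python
def bytes_stateless(turn_sizes):
    """Stateless requests: every call retransmits the whole accumulated context."""
    total = context = 0
    for size in turn_sizes:
        context += size   # the context keeps growing turn over turn
        total += context  # and the full history crosses the wire each time
    return total

def bytes_persistent(turn_sizes):
    """Persistent connection: only each turn's new data is sent."""
    return sum(turn_sizes)

# An agent loop of ten tool-calling turns, ~2 KB of new context per turn (assumed).
turns = [2048] * 10
```

With these assumed sizes the stateless flow transfers 2048 × (1 + 2 + … + 10) = 112,640 bytes, while the persistent connection transfers 20,480: quadratic versus linear growth in the number of turns.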

Google, OpenAI, and Anthropic are all bracing for Deepseek's next big release

Chinese AI startup Deepseek has apparently trained its latest AI model on Nvidia's most powerful Blackwell chips, despite the US export ban. That's according to Reuters, citing a senior Trump administration official. The model is expected to drop next week. Rumors about chip smuggling had already been circulating since late last year.

The official says the Blackwell chips are believed to be in a data center in Inner Mongolia, and Deepseek is expected to scrub technical fingerprints of US chip usage before release. The official wouldn't say how Deepseek obtained the chips. Nvidia declined to comment, and neither Deepseek nor the US Department of Commerce responded to Reuters.

If the timing of these leaks is any indicator, Deepseek may be on the verge of another major splash. Google, OpenAI, and Anthropic have all complained about distillation attacks on their models by Chinese startups, and OpenAI recently moved to downplay a well-known coding benchmark. Together, these moves suggest Deepseek is about to deliver strong results at rock-bottom prices once again. Back in January 2025, China's leading AI startup sent shockwaves through US tech stocks riding the AI bubble.

Anthropic accuses Deepseek, Moonshot, and MiniMax of stealing Claude's AI data through 16 million queries

Anthropic says it has caught Chinese AI labs Deepseek, Moonshot, and MiniMax running large-scale distillation attacks on Claude. In distillation, a weaker model learns from the outputs of a stronger one. Over 24,000 fake accounts fired off more than 16 million queries targeting Claude's reasoning, programming, and tool-use capabilities, with the labs using proxy services to bypass Claude's access restrictions for China.

Lab | Requests | Targets
Deepseek | 150,000+ | Extracting reasoning steps, reward-model data for reinforcement learning, and censorship-compliant answers on politically sensitive topics
Moonshot AI | 3.4 million+ | Agent-based reasoning, tool usage, programming, data analysis, computer vision, and reconstructing Claude's thought processes
MiniMax | 13 million+ | Agent-based programming, tool usage and orchestration; pivoted to the new Claude model within 24 hours

Deepseek specifically targeted Claude's reasoning chain, extracting thought processes and censorship-compliant answers on sensitive topics. MiniMax ran the biggest campaign by far with over 13 million requests. When Anthropic shipped a new model, MiniMax pivoted within 24 hours and redirected nearly half its traffic to the updated system, Anthropic says.
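For background, distillation in its ordinary, benign form trains a student model to imitate a teacher's output distribution. A minimal sketch of the standard objective, with purely illustrative logits and temperature:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution, optionally softened."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions:
    the student is pushed to reproduce the teacher's soft answers, which
    is why harvesting a stronger model's outputs at scale enables this."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher exactly incurs zero loss.
perfect = distill_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatched = distill_loss([2.0, 0.5, -1.0], [0.0, 0.0, 0.0])
```

Minimizing this loss over millions of harvested query-response pairs is what lets a weaker model approximate a stronger one without access to its weights.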

OpenAI and Google report similar attempts from Chinese labs. Anthropic is calling on the industry and policymakers to mount a coordinated response.

OpenAI wants to retire the AI coding benchmark that everyone has been competing on

OpenAI says the SWE-bench Verified programming benchmark has lost its value as a meaningful measure of AI coding ability, pointing to two main problems: flawed tasks and training-data leakage. At least 59.4 percent of the benchmark's tasks are flawed, rejecting correct solutions because their tests enforce specific implementation details or check functions never described in the task.
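The first problem is easy to picture with a toy example (entirely illustrative, not an actual SWE-bench task): a check that enforces an implementation detail rejects a solution that does exactly what the task asked.

```python
def solution(xs):
    """A correct answer to the toy task 'return xs sorted in ascending order'."""
    return sorted(xs)

def behavioral_test(fn):
    """Checks only what the task actually specified."""
    return fn([3, 1, 2]) == [1, 2, 3] and fn([]) == []

def overspecified_test(fn):
    """Also insists the solution advertise a specific algorithm - an
    implementation detail the task never mentioned - so correct
    alternatives like `solution` above get marked as failures."""
    return behavioral_test(fn) and getattr(fn, "algorithm", None) == "quicksort"
```

A benchmark built from tests like `overspecified_test` undercounts genuinely working patches, which is the failure mode OpenAI describes.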

Many tasks and solutions have also leaked into leading models' training data. OpenAI reports that GPT-5.2, Claude Opus 4.5, and Gemini 3 Flash Preview could reproduce some original fixes from memory, meaning benchmark progress increasingly reflects what a model has seen, not how well it codes. OpenAI recommends SWE-bench Pro instead and is building its own non-public tests.
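Leakage of this kind is commonly screened for with verbatim-overlap checks against the reference fix. A crude sketch, with an arbitrary n-gram size not taken from the article:

```python
def ngram_overlap(candidate, reference, n=4):
    """Fraction of the reference's word n-grams that appear verbatim in the
    candidate. High overlap on a supposedly unseen fix is a memorization signal."""
    def grams(text):
        toks = text.split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    ref_grams = grams(reference)
    if not ref_grams:
        return 0.0
    return len(ref_grams & grams(candidate)) / len(ref_grams)

reference_fix = "if node is None : return default else : recurse into node"
verbatim = ngram_overlap(reference_fix, reference_fix)                     # 1.0
fresh = ngram_overlap("a completely different patch body", reference_fix)  # 0.0
```

A model that reproduces long verbatim spans of the gold patch is likely recalling training data rather than solving the task, which is why OpenAI argues the benchmark now measures exposure rather than coding skill.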

There's a possible strategic angle here: a "contaminated" benchmark can make rivals, especially open-source models, look better and skew rankings. SWE-bench Verified was long the gold standard for AI coding evaluation, with OpenAI, Anthropic, Google, and many Chinese open-weight models competing for small leads. AI benchmarks can provide useful signal, but their real-world value remains limited.
