Investors are betting that AI will replace labor costs, not software budgets.
"We took a view that AI is not 'enterprise' software in the traditional sense of going after IT budgets: it captures labour spend, at some point you’re taking over human workflows end to end," Sebastian Duesterhoeft, a partner at Lightspeed Venture Partners, told the Financial Times.
This logic underpins the current funding round valuing Anthropic at $350 billion: while classic SaaS solutions compete for limited IT budgets, "agentic AI" systems target the far larger pool of labor costs.
The shock of this shift has already hit the markets. A series of developments, including new models, specialized industry tools, and news that Goldman Sachs plans to automate banking roles, helped trigger a sell-off in traditional software stocks. According to the FT, investors increasingly recognize that autonomous AI agents could threaten existing business models.
Best multimodal models still can't crack 50 percent on basic visual entity recognition
A new benchmark called WorldVQA tests whether multimodal AI models actually recognize what they see or just make it up. Even the best performer, Gemini 3 Pro, tops out at 47.4 percent when asked for specific details like exact species or product names instead of generic labels. Worse, the models are convinced they’re right even when they’re wrong.
Claude Opus 4.6 is the new top-ranked AI model, at least until Artificial Analysis finishes benchmarking OpenAI's Codex 5.3, which will likely pull ahead in coding. Anthropic's latest model leads the Artificial Analysis Intelligence Index, a composite of ten tests covering coding, agent tasks, and scientific reasoning, with first-place finishes in agent-based work tasks, terminal coding, and physics research problems.
Image: Artificial Analysis
Running the complete test suite costs $2,486, more than the $2,304 required for GPT-5.2 at maximum reasoning performance. Opus 4.6 consumed roughly 58 million output tokens, twice as many as Opus 4.5 but significantly fewer than GPT-5.2's 130 million. The higher total price comes down to Anthropic's token pricing of $5 and $25 per million input and output tokens, respectively.
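As a rough sanity check on these figures, the arithmetic can be sketched out. The rates, the 58 million output tokens, and the $2,486 total come from the reporting above; the input-token volume is not stated, so it is treated here as an unknown backed out from the total, purely for illustration:

```python
# Cost check for the Opus 4.6 benchmark run described above.
# Rates ($5/$25 per MTok) and the 58M-output-token figure are from the
# article; the input-token volume is an assumption derived from the total.

INPUT_RATE = 5 / 1_000_000    # $ per input token ($5 / MTok)
OUTPUT_RATE = 25 / 1_000_000  # $ per output token ($25 / MTok)

output_tokens = 58_000_000
output_cost = output_tokens * OUTPUT_RATE  # 58M * $25/MTok = $1,450

total_cost = 2486
implied_input_cost = total_cost - output_cost            # ~$1,036
implied_input_tokens = implied_input_cost / INPUT_RATE   # ~207M tokens

print(f"output cost: ${output_cost:,.0f}")               # output cost: $1,450
print(f"implied input: {implied_input_tokens / 1e6:.0f}M tokens")
```

By this estimate, output tokens account for well over half the bill despite being a fraction of the total token volume, which is what the 5x gap between input and output rates would predict.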
Opus 4.6 is available through the Claude.ai apps and via Anthropic's API, Google Vertex, AWS Bedrock, and Microsoft Azure.
Anthropic just launched a new fast mode for Claude, and the pricing is steep: Fast Mode for Opus 4.6 costs up to six times the standard rate. In return, Anthropic says the model responds 2.5 times faster at the same quality level. The mode is built for live debugging, rapid code iteration, and other time-critical tasks. For longer autonomous runs, batch processing, CI/CD pipelines, and cost-sensitive workloads, Anthropic says you're better off sticking with standard mode.
| | Standard | Fast mode |
| --- | --- | --- |
| Input ≤ 200K tokens | $5 / MTok | $30 / MTok |
| Input > 200K tokens | $10 / MTok | $60 / MTok |
| Output ≤ 200K tokens | $25 / MTok | $150 / MTok |
| Output > 200K tokens | $37.50 / MTok | $225 / MTok |
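The rate card above can be applied programmatically. A minimal sketch; note that how Anthropic handles the >200K tier (whole request at the higher rate versus only the excess tokens) is an assumption here, and this version bills the entire request at a single rate:

```python
# Sketch of the standard vs. fast-mode rate card above.
# Assumption: a request past 200K tokens is billed entirely at the
# higher tier; Anthropic's actual billing rules may differ.

RATES = {  # $ per million tokens: (tier <= 200K, tier > 200K)
    ("standard", "input"): (5, 10),
    ("standard", "output"): (25, 37.50),
    ("fast", "input"): (30, 60),
    ("fast", "output"): (150, 225),
}

def cost(mode: str, kind: str, tokens: int) -> float:
    """Dollar cost of one request leg under the assumed tier rule."""
    low, high = RATES[(mode, kind)]
    rate = low if tokens <= 200_000 else high
    return tokens * rate / 1_000_000

# A 100K-in / 20K-out request shows the six-fold premium:
standard = cost("standard", "input", 100_000) + cost("standard", "output", 20_000)
fast = cost("fast", "input", 100_000) + cost("fast", "output", 20_000)
print(standard, fast)  # prints 1.0 6.0
```

At these rates the premium is a flat 6x on every tier, so the fast/standard ratio holds regardless of the input/output mix.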
Fast Mode can be toggled on in Claude Code with /fast and works across Cursor, GitHub Copilot, Figma, and Windsurf. There's a 50 percent introductory discount running until February 16. The mode isn't available through Amazon Bedrock, Google Vertex AI, or Microsoft Azure Foundry. Anthropic plans to expand API access down the line; interested developers can sign up for a waiting list.
Study finds AI reasoning models generate a "society of thought" with arguing voices inside their process
New research reveals that reasoning models like DeepSeek-R1 simulate entire teams of experts when solving problems: some extraverted, some neurotic, all conscientious. This internal debate doesn't just look like teamwork. It measurably boosts performance.
Integrating AI agents into enterprise operations takes more than a few ChatGPT accounts. OpenAI is hiring hundreds of engineers for its technical consulting team to customize models with customer data and build AI agents, The Information reports. The company currently has about 60 such engineers plus over 200 in technical support. Anthropic is also working directly with customers.
The problem: AI agents often don't work reliably out of the box. Retailer Fnac tested models from OpenAI and Google for customer support, but the agents kept mixing up serial numbers. The system reportedly only worked after getting help from AI21 Labs.
OpenAI's new agentic enterprise platform "Frontier" shows just how complex AI integration can get: the technology needs to connect to existing enterprise systems ("systems of record"), understand business context, and execute and optimize agents—all before users ever touch an interface. | Image: OpenAI
This need for hands-on customization could slow how fast AI providers scale their B2B agent business and raises questions about how quickly tools like Claude Cowork can deliver value in an enterprise context. Model improvements and better reliability on routine tasks could help, but fundamental LLM-based security risks remain.