
Matthias Bastian

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.
Anthropic drops the surcharge for million-token context windows, making Opus 4.6 and Sonnet 4.6 far cheaper

Anthropic is making Claude's extra-large context window a lot cheaper. The context window determines how much text an AI model can process in a single request. The Opus 4.6 and Sonnet 4.6 models now offer a context window of one million tokens at the standard price. Until now, Anthropic added a surcharge of up to 100 percent for requests exceeding 200,000 tokens.

Opus 4.6 still costs $5/$25 per million tokens (input/output), and Sonnet 4.6 runs $3/$15. But the per-token rate is now the same whether a prompt contains 9,000 or 900,000 tokens. On top of that, the media limit jumps from 100 to 600 images or PDF pages per request. The new pricing applies to Claude Code (Max, Team, and Enterprise) and is available through Amazon Bedrock (except for the media limit), Google Cloud Vertex AI, and Microsoft Foundry.
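To make the pricing change concrete, here is a minimal sketch of the arithmetic. The per-million-token prices come from the figures above; the model keys are illustrative labels rather than API identifiers, and the `surcharge` parameter is a simplifying assumption that applies one flat markup to the whole request once the input exceeds 200,000 tokens. The actual former surcharge is only described as "up to 100 percent," so the second result is an upper-bound illustration, not a reconstruction of old invoices.

```python
# USD per million tokens (input, output), as quoted above.
# The dictionary keys are illustrative labels, not official model IDs.
PRICES = {"opus-4.6": (5.0, 25.0), "sonnet-4.6": (3.0, 15.0)}


def request_cost(model, input_tokens, output_tokens,
                 surcharge=0.0, threshold=200_000):
    """Return the cost of one request in USD.

    surcharge is a hypothetical flat markup (1.0 = 100 percent) applied
    to the whole request when the input exceeds the threshold.
    """
    in_price, out_price = PRICES[model]
    cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    if input_tokens > threshold:
        cost *= 1.0 + surcharge
    return cost


# 900,000 input tokens and 1,000 output tokens on Opus 4.6
flat = request_cost("opus-4.6", 900_000, 1_000)
marked_up = request_cost("opus-4.6", 900_000, 1_000, surcharge=1.0)
print(f"flat rate: ${flat:.3f}")                       # flat rate: $4.525
print(f"with 100% surcharge: ${marked_up:.3f}")        # with 100% surcharge: $9.050
```

Under these assumptions, a near-full-context request costs roughly half of what it would with the maximum surcharge, while short prompts are unaffected.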

The GraphWalks BFS benchmark measures how well AI models handle logical reasoning across large amounts of text. Opus 4.6 reportedly shows almost no drop in performance even at full context length. | Image: Anthropic

According to Anthropic, both models achieve the highest accuracy among comparable models at full context length in benchmark tests. That said, the broader problem of declining precision as context windows fill up is still far from solved.

Elon Musk admits xAI "was not built right first time around," launches full restructuring

Elon Musk's AI company xAI is going through a major shake-up. Musk acknowledged on X that the company "was not built right first time around" and is now being rebuilt from the ground up. Six of the twelve co-founders have left xAI since January, most recently Guodong Zhang and Zihang Dai. Only Manuel Kroiss and Ross Nordeen have stayed on alongside Musk.

via X

At a recent conference, Musk admitted that Grok is falling behind competitors like Google, Anthropic, and OpenAI when it comes to coding, but said the company aims to close the gap by mid-2026. To get there, xAI has hired two senior executives from the AI coding startup Cursor: Andrew Milich and Jason Ginsberg, both reporting directly to Musk. According to the Financial Times, Musk has also brought in "problem solvers" from SpaceX and Tesla to help restructure xAI.

Google explains the differences between its three Nano Banana image generation models

A new guide from Google breaks down the three Nano Banana image models and when to use each one. The cheaper Nano Banana 2 reportedly delivers 95 percent of Pro’s capabilities and can search the web for reference images on its own before generating output.

Meta delays its next AI model Avocado after internal tests show it can't keep up with Google and OpenAI

Meta has reportedly delayed its next AI model, codenamed "Avocado." Originally set for mid-March 2026, it won't ship until May at the earliest, reports the New York Times, citing three people familiar with the matter.

In internal tests, Avocado fell short of leading models from Google, OpenAI, and Anthropic in logical reasoning, programming, and writing. It beat Meta's previous model and Google's Gemini 2.5 but couldn't match Gemini 3.0. Meta's leadership even discussed temporarily licensing Gemini, though no decision was made. A next-gen model codenamed "Watermelon" is already planned. Meta is also building an image and video generator codenamed "Mango."

Meta says updates are coming "very soon," with more models planned this year. The company found early success with its open Llama models but lost momentum with Llama 4. CEO Mark Zuckerberg has since poured billions into AI, including $14.3 billion in Scale AI. Scale AI's CEO Alexandr Wang now runs Meta's frontier AI division, "TBD Lab," tasked with building superintelligent AI systems. Reports also suggest Meta may be moving away from its open-source strategy.

Grok 4.20 trails Gemini and GPT-5.4 by a wide margin but sets a new record for not hallucinating

xAI's Grok 4.20 can't keep up with the top AI models in benchmarks but hallucinates less than any other model tested. According to Artificial Analysis, Grok 4.20 Beta scores 48 on the Intelligence Index with reasoning enabled, well behind Gemini 3.1 Pro Preview and GPT-5.4 at 57, but still a 6-point improvement over Grok 4.

Grok trails the latest models from major AI labs in overall benchmark performance. | Image: Artificial Analysis

xAI shipped three API variants: with reasoning, without reasoning, and a multi-agent mode. The model supports a 2-million-token context window and costs $2 or $6 per million tokens, which makes it cheaper than Grok 4 and competitively priced among Western models.

Where Grok 4.20 stands out, of all things, is factual reliability. On the AA Omniscience test, it hit a 78 percent non-hallucination rate, a record, according to Artificial Analysis. Alongside factual recall, the test measures how often a model fabricates an answer instead of admitting it doesn't know. When Grok 4.20 didn't have the answer, it made one up only about one in five times.
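The "one in five" figure follows directly from the reported rate. A minimal sketch of the arithmetic, assuming the hallucination rate is simply the complement of the 78 percent non-hallucination rate (the function name is illustrative, not part of any benchmark tooling):

```python
NON_HALLUCINATION_RATE = 0.78  # figure reported by Artificial Analysis


def hallucination_rate(non_hallucination):
    """Complement of the non-hallucination rate (assumed definition)."""
    return 1.0 - non_hallucination


rate = hallucination_rate(NON_HALLUCINATION_RATE)
print(f"hallucination rate: {rate:.0%}")   # hallucination rate: 22%
print(f"about 1 in {1 / rate:.1f} cases")  # about 1 in 4.5 cases
```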

US War Department CTO says Anthropic's AI models "pollute" the supply chain with built-in ethics

Emil Michael, the US Department of War's chief technology officer, made clear that classifying Anthropic as a supply chain risk is an ideologically motivated move. Claude models "pollute" the supply chain because they have a "different policy preference" baked into them, Michael told CNBC. He pointed to Anthropic's "constitution," a ruleset emphasizing ethics and safety, which he said could result in soldiers receiving "ineffective weapons, ineffective body armor, ineffective protection." The measure was "not meant to be punitive," he added.

Anthropic is the first US company to receive this classification, which is normally reserved for foreign adversaries. The AI company is suing over the designation and has drawn support from Microsoft, OpenAI, and Google employees, as well as former US military personnel. Anthropic has previously pushed back against its own AI models being used for US mass surveillance and autonomous weapons.

The administration has already signaled its intent to control AI along ideological lines by enacting regulations targeting so-called "woke AI," framed as a commitment to political neutrality. The approach echoes the Chinese government's own efforts to exert political control over AI models.

Source: CNBC

Copilot Health marks Microsoft's entry into the AI health race alongside OpenAI and Anthropic

Microsoft is launching Copilot Health, an AI health assistant that pulls data from wearables, medical records, and lab results to deliver personalized health advice. Long term, the company says it’s working toward “medical superintelligence.”

ChatGPT still leads the chatbot market but its dominance is slipping as Google's Gemini gains ground

ChatGPT still dominates the chatbot market, but its lead is shrinking. New data from Similarweb shows OpenAI's chatbot accounted for just 61.7 percent of global AI web traffic in February 2026, down from 75.7 percent twelve months earlier. The biggest winner is Google Gemini, which more than quadrupled its share from 5.7 percent to 24.4 percent over the same period. Grok (3.4 percent) and Claude (3.3 percent) have overtaken DeepSeek (3.2 percent) for the first time, claiming third and fourth place. Claude crossed the three percent mark for the first time in February, though it's much stronger in the B2B market, according to a separate study.

ChatGPT still leads overall, but Google Gemini has closed the gap significantly. These figures only cover web traffic. | Image: Similarweb

In absolute numbers, ChatGPT recorded 5.35 billion visits in February, while Gemini pulled in 2.11 billion. Grok came in at 298.5 million visits, Claude at 290.3 million, DeepSeek at 246.4 million, and Perplexity at 153.8 million. Microsoft's Copilot stagnated at 1.1 percent market share, though that only reflects the web version. Microsoft's actual share of the enterprise market is likely much higher.
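The share and visit figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch using only the Similarweb numbers quoted in this article; the helper name `share_change` is illustrative.

```python
# Share of global AI web traffic in percent: (twelve months earlier, February 2026)
SHARES = {"ChatGPT": (75.7, 61.7), "Gemini": (5.7, 24.4)}

# Visits in February 2026, in billions
VISITS = {"ChatGPT": 5.35, "Gemini": 2.11}


def share_change(before, after):
    """Return (change in percentage points, growth factor)."""
    return after - before, after / before


for name, (before, after) in SHARES.items():
    delta, factor = share_change(before, after)
    print(f"{name}: {delta:+.1f} points, x{factor:.2f}")
# ChatGPT: -14.0 points, x0.82
# Gemini: +18.7 points, x4.28

# Gemini relative to ChatGPT, measured by share and by visits
print(f"by share:  {SHARES['Gemini'][1] / SHARES['ChatGPT'][1]:.2f}")
print(f"by visits: {VISITS['Gemini'] / VISITS['ChatGPT']:.2f}")
```

Both ratios come out at roughly 0.39, so the visit counts and the percentage shares tell the same story: Gemini now draws about two fifths of ChatGPT's web traffic.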

Google's new Ask Maps lets you search for places in plain language using Gemini AI

Google has introduced "Ask Maps," a conversational feature powered by its Gemini models. Users can ask questions in plain language, like "Is there a public tennis court with lights on that I can play at tonight?" or "My phone is dying — where can I charge it without having to wait in a long line for coffee?" The feature taps into data from more than 300 million locations and reviews from over 500 million contributors.

Results show up on a personalized map based on past searches and saved places. Users can book tables, save or share locations, and jump into navigation directly. Ask Maps is rolling out first in the US and India on Android and iOS, with a desktop version on the way.

Google also announced "Immersive Navigation," a revamped turn-by-turn system with a 3D view of surroundings, including buildings, overpasses, and lane markings. Gemini models generate the visuals by analyzing Street View and aerial imagery.

Immersive Navigation launches first in the US and will expand to more iOS and Android devices, CarPlay, Android Auto, and cars with Google built-in over the coming months.

OpenAI is reportedly planning to integrate its video AI Sora into ChatGPT

OpenAI is reportedly planning to fold its video AI Sora directly into ChatGPT. So far, Sora has only been available as a standalone mobile and web app. OpenAI originally pitched it as a viral hit and potential TikTok alternative, a strategy that seemed to work early on, partly thanks to massive copyright infringement.

That momentum didn't last. According to The Information, the app has slid from No. 1 to No. 165 in the Apple App Store since launching last fall. CEO Sam Altman reportedly admitted internally that hardly anyone was sharing videos publicly. Rolling Sora into ChatGPT might fix that: with around 920 million weekly active users, the move would naturally drive more video generation. The standalone app will stick around for now, The Information reports.

Google already offers video generation in Gemini, though with tight capacity limits and only for paying subscribers. OpenAI will likely go a similar route: the company is strapped for compute, burns through cash supporting the roughly 95 percent of ChatGPT accounts that are free, and video generation is especially resource-hungry.