
Matthias Bastian

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.
OpenAI is retiring GPT-4o and three other legacy models tomorrow, likely for good

OpenAI is dropping several older AI models from ChatGPT on February 13, 2026: GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini. The models will stick around in the API for now. The company says it comes down to usage: only 0.1 percent of users still pick GPT-4o on any given day.

There's a reason OpenAI is being so careful about GPT-4o specifically: the model has a complicated past. OpenAI already killed it once back in August 2025, only to bring it back for paying subscribers after users pushed back hard. Some people had grown genuinely attached to the model, which was known for its agreeable, people-pleasing communication style. OpenAI addresses this head-on at the end of the post:

We know that losing access to GPT‑4o will feel frustrating for some users, and we didn’t make this decision lightly. Retiring models is never easy, but it allows us to focus on improving the models most people use today.

OpenAI

OpenAI points to GPT-5.1 and GPT-5.2 as improved successors that incorporate feedback from GPT-4o users. People can now tweak ChatGPT's tone and style, things like warmth and enthusiasm. But that probably won't be enough for the GPT-4o faithful.

Google Deepmind upgrades Gemini 3 Deep Think for complex science and engineering tasks

Google Deepmind has upgraded its specialized thinking mode "Gemini 3 Deep Think" and made it available through the Gemini app and as an API via a Vertex AI early access program. The upgrade targets complex tasks in science, research, and engineering.

Google AI Ultra subscribers can access Deep Think through the Gemini app, while developers and researchers can sign up separately for the API program. According to Google Deepmind, the model tops several major benchmarks:

| Benchmark | Deep Think | Claude Opus 4.6 | GPT-5.2 | Gemini 3 Pro Preview |
| --- | --- | --- | --- | --- |
| ARC-AGI-2 (logical reasoning) | 84.6% | 68.8% | 52.9% | 31.1% |
| Humanity's Last Exam (academic reasoning) | 48.4% | 40.0% | 34.5% | 37.5% |
| MMMU-Pro (multimodal reasoning) | 81.5% | 73.9% | 79.5% | 81.0% |
| Codeforces (coding/algorithms, Elo) | 3,455 | 2,352 | - | 2,512 |

While Deep Think dominates in logic and coding, the gap narrows significantly on MMMU-Pro: it scored 81.5 percent, barely ahead of Gemini 3 Pro Preview at 81.0 percent. This suggests the thinking upgrades focus heavily on abstract reasoning rather than visual processing. Deep Think also achieved gold medal-level results at the 2025 Physics and Chemistry Olympiads. Examples of the model in scientific use can be found here.
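The margins described above can be recomputed directly from the reported figures. The following is a minimal Python sketch that uses only the numbers from the table; the dictionary layout and the function name `lead_over_runner_up` are illustrative choices, not part of any published tooling.

```python
# Scores as reported by Google Deepmind in the table; None marks a missing entry.
SCORES = {
    "ARC-AGI-2": {
        "Deep Think": 84.6, "Claude Opus 4.6": 68.8,
        "GPT-5.2": 52.9, "Gemini 3 Pro Preview": 31.1,
    },
    "Humanity's Last Exam": {
        "Deep Think": 48.4, "Claude Opus 4.6": 40.0,
        "GPT-5.2": 34.5, "Gemini 3 Pro Preview": 37.5,
    },
    "MMMU-Pro": {
        "Deep Think": 81.5, "Claude Opus 4.6": 73.9,
        "GPT-5.2": 79.5, "Gemini 3 Pro Preview": 81.0,
    },
    "Codeforces (Elo)": {
        "Deep Think": 3455, "Claude Opus 4.6": 2352,
        "GPT-5.2": None, "Gemini 3 Pro Preview": 2512,
    },
}


def lead_over_runner_up(benchmark, model="Deep Think"):
    """Return (runner_up_name, margin) for `model` on `benchmark`."""
    scores = SCORES[benchmark]
    # Ignore the model itself and any competitor without a reported score.
    others = {
        name: value
        for name, value in scores.items()
        if name != model and value is not None
    }
    runner_up = max(others, key=others.get)
    return runner_up, round(scores[model] - others[runner_up], 1)


for name in SCORES:
    rival, margin = lead_over_runner_up(name)
    print(f"{name}: +{margin} over {rival}")
```

On these figures, the lead shrinks from 15.8 percentage points on ARC-AGI-2 to 0.5 points on MMMU-Pro, which is the narrowing the paragraph describes.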

OpenAI reportedly uses a "special version" of ChatGPT to hunt down internal leakers by scanning Slack and email

OpenAI uses a "special version" of ChatGPT to track down internal information leaks. That's according to a report from The Information, citing a person familiar with the matter. When a news article about internal operations surfaces, OpenAI's security team feeds the text into this custom ChatGPT version, which has access to internal documents as well as employees' Slack and email messages.

The system then suggests possible sources of the leak by identifying files or communication channels that contain the published information and showing who had access to them. It's unclear whether OpenAI has actually caught anyone using this method.
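How OpenAI's system works internally is not public. As a rough illustration of the general idea only (match published text against stored documents, then list who could read the matches), here is a toy Python sketch. The word-trigram overlap measure, the function names, and the sample data are all assumptions made for this example; they are not taken from the report.

```python
import re


def shingles(text, size=3):
    """Split text into a set of overlapping word tuples (word n-grams)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def rank_sources(article, documents, size=3):
    """Return (name, overlap_share, readers) rows, highest overlap first.

    overlap_share is the fraction of the article's n-grams found in the document.
    """
    target = shingles(article, size)
    ranked = []
    for name, doc in documents.items():
        shared = target & shingles(doc["text"], size)
        if shared:
            ranked.append((name, len(shared) / len(target), sorted(doc["readers"])))
    return sorted(ranked, key=lambda row: row[1], reverse=True)


# Hypothetical sample data: two internal sources with their access lists.
article = "The team plans to ship the new model in early spring"
documents = {
    "roadmap.doc": {
        "text": "Reminder: the team plans to ship the new model in early spring, pending review.",
        "readers": {"ana", "ben"},
    },
    "lunch-channel": {
        "text": "Who wants to ship lunch orders early today?",
        "readers": {"ana", "ben", "cy"},
    },
}

for name, share, readers in rank_sources(article, documents):
    print(f"{name}: {share:.0%} of the article's phrases, readable by {readers}")
```

A match like this only narrows down where the information was stored and who could see it; it does not show who passed it on.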

What exactly makes this version special isn't known, but there's a clue: OpenAI engineers recently presented the architecture of an internal AI agent that could serve this purpose. It's designed to let employees run complex data analysis using natural language and has access to institutional knowledge stored in Slack messages, Google Docs, and more.

OpenAI researcher quit over ads because she doesn't trust her former employer to keep its own promises

OpenAI wants to put ads in ChatGPT and former researcher Zoe Hitzig says that’s a dangerous move. She spent two years at the company and doesn’t believe OpenAI can resist the temptation to exploit its users’ most personal conversations.

OpenAI upgrades Responses API with features built specifically for long-running AI agents

OpenAI is adding new capabilities to its Responses API that are built specifically for long-running AI agents. The update brings three major features: server-side compaction that keeps agent sessions going for hours without blowing past context limits, controlled internet access for OpenAI-hosted containers so they can install libraries and run scripts, and "skills": reusable bundles of instructions, scripts, and files that agents can pull in and execute on demand.

Skills work as a middle layer between system prompts and tools. Instead of stuffing long workflows into every prompt, developers can package them as versioned bundles that only kick in when needed. They ship as ZIP files, support versioning, and work in both hosted and local containers through the API. OpenAI recommends building skills like small command-line programs and pinning specific versions in production.
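To make the bundle idea concrete, here is a minimal Python sketch that packs a small skill into a ZIP archive in memory. The file names (`SKILL.md`, `scripts/summarize.py`), the skill name, and the convention of putting the version into the top-level folder name are assumptions made for illustration. The sketch does not show how a bundle is uploaded or how the API itself tracks versions.

```python
import io
import zipfile


def build_skill_bundle(name, version, files):
    """Pack `files` (relative path -> text) into an in-memory ZIP.

    Everything is placed under a "<name>-<version>/" folder so the archive
    itself records which version it contains (a convention assumed here).
    """
    buffer = io.BytesIO()
    root = f"{name}-{version}"
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, text in sorted(files.items()):
            archive.writestr(f"{root}/{path}", text)
    return buffer.getvalue()


# Hypothetical skill contents: short instructions plus a small CLI-style script.
files = {
    "SKILL.md": "Count the lines of a CSV file. Run scripts/summarize.py <path>.\n",
    "scripts/summarize.py": "import sys\nprint(sum(1 for _ in open(sys.argv[1])))\n",
}
bundle = build_skill_bundle("csv-summary", "1.2.0", files)

# Re-open the archive to confirm what it contains.
with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
    print(archive.namelist())
```

Keeping the script a small command-line program with explicit arguments follows the recommendation above: the agent only needs to know how to call it, not how it works inside.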

OpenAI says ChatGPT update improves response style and quality

OpenAI released an update for GPT-5.2 Instant in ChatGPT and the API on February 10, 2026. The company says the update improves response style and quality, with more measured, contextually appropriate tone and clearer answers to advice and how-to questions that place the most important information up front. CEO Sam Altman addressed the scope of the changes: "Not a huge change, but hopefully you find it a little better."

The update targets the "Instant" variant, the model without reasoning steps. In the API, developers can access it via "gpt-5.2-chat-latest". In ChatGPT, users need to switch to "Instant" in the model picker. The model also kicks in automatically when GPT-5's router determines a reasoning model isn't necessary, or when users have run out of credits for heavier models, something that happens especially often on the free tier.
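For developers, selecting the Instant variant comes down to passing the model name in the request. Below is a minimal Python sketch that builds such a request with the standard library but deliberately does not send it. It assumes the Responses API endpoint and its basic `model`/`input` body; the prompt and the placeholder key are made up for the example.

```python
import json
import urllib.request

# Assumption: the call goes to the Responses API; the article only names the model.
API_URL = "https://api.openai.com/v1/responses"


def build_request(prompt, api_key, model="gpt-5.2-chat-latest"):
    """Build (but do not send) a POST request that targets the given model."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key and prompt, for illustration only.
request = build_request("How do I descale a kettle?", api_key="sk-placeholder")
print(request.get_method(), request.full_url)
print(json.loads(request.data))

# Sending is left out on purpose; urllib.request.urlopen(request) would make the call.
```

Because "gpt-5.2-chat-latest" is a moving alias, the model behind it can change with updates like this one without any change to the request.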

Anthropic brings its AI agent software Claude Cowork to Windows

After launching on macOS, Anthropic's AI assistant Cowork is now available for Windows users. The Windows version includes the full feature set from the macOS release: file access, multi-step task execution, plugins, and MCP connectors for integrating external services. Users can also set up global and folder-specific instructions that Claude follows in every session.

Cowork on Windows is currently in Research Preview, an early testing phase. The feature is available to all paying Claude subscribers at claude.com/cowork.

Anyone who installs the system and gives it access to their files—especially sensitive or private data—should be aware of the cybersecurity risks. Generative AI can be exploited through adversarial prompts (prompt injections), among other attack vectors. This is exactly what happened to Cowork shortly after its launch.

Half of xAI's co-founders have now left Elon Musk's AI startup

Jimmy Ba is the latest co-founder to leave xAI, and like the five who left before him, he’s full of praise for the company and predicts massive AI breakthroughs ahead. Yet somehow, half of xAI’s twelve founding members have still walked out the door.

OpenAI's Deep Research now runs on GPT-5.2 and lets users search specific websites

OpenAI has upgraded Deep Research in ChatGPT. The feature now runs on the new GPT-5.2 model, as OpenAI announced on X. Users can now connect apps to ChatGPT and, potentially the most useful addition, restrict searches to specific websites. While a search is running, users can follow its progress in real time, interrupt it with questions, or add new sources. Results can now be displayed as full-screen reports.

Until now, Deep Research, which launched in 2025, ran on the o3 and o4-mini models. OpenAI considers it the first "AI agent" in ChatGPT, since the system independently kicks off multi-stage web searches based on the user's query before generating a response.

That said, even web searches don't protect against generative AI errors, and the longer the generated text, the higher the risk of mistakes. In everyday use, targeted search queries with capable reasoning models are often more reliable. Web search significantly reduces hallucination rates overall, but doesn't eliminate them.