
OpenAI hires Anthropic's Dylan Scandinaro to lead AI safety as "extremely powerful models" loom

OpenAI has filled its "Head of Preparedness" position with Dylan Scandinaro, who previously worked on AI safety at competitor Anthropic. CEO Sam Altman announced the hire on X, calling Scandinaro "by far the best candidate" for the role. With OpenAI working on "extremely powerful models," Altman said strong safety measures are essential.

In his own post, Scandinaro acknowledged the technology's major potential benefits, but also its "risks of extreme and even irrecoverable harm." OpenAI recently disclosed that a new coding model received a "high" risk rating in cybersecurity evaluations.

There’s a lot of work to do, and not much time to do it!

Dylan Scandinaro

Scandinaro's Anthropic background adds an interesting layer. The company was founded by former OpenAI employees concerned about OpenAI's product focus and what they saw as insufficient safety measures, and has since become known as one of the more safety-conscious AI developers. Altman says he plans to work with Scandinaro to implement changes across the company.

A new platform lets AI agents pay humans to do the real-world work they can't

On Rentahuman.ai, AI agents can hire people for real-world tasks, from holding signs to picking up packages. It sounds absurd, but it shows what happens when language models stop just talking and start taking action.

Gemini models dominate new AI rankings for strategic board games

Google's Gemini models are outperforming the competition in board game benchmarks. Google DeepMind and Kaggle have expanded their "Game Arena" platform with two new games: Werewolf and Poker. The platform tests AI models across strategic games that measure different cognitive abilities: chess evaluates logical thinking, Werewolf tests social skills like communication and detecting deception, and Poker assesses how models handle risk and incomplete information.

These games provide objective ways to measure skills like planning and decision-making under uncertainty. Gemini 3 Pro and Gemini 3 Flash currently hold the top spots in all rankings. The Werewolf benchmark also serves double duty as security research: it tests whether models can detect manipulation without any real-world consequences. According to Google DeepMind CEO Demis Hassabis, the AI industry needs more rigorous tests to properly evaluate the latest models.

Firefox users will soon be able to block all generative AI features in one place

Mozilla is rolling out new AI settings with Firefox 148 on February 24. Users will be able to manage all the browser's generative AI features from a single location, or turn them off entirely, the company announced in a blog post.

The new settings cover translations, automatic image descriptions in PDFs, AI-powered tab grouping, link previews, and a chatbot in the sidebar. The chatbot supports services like Anthropic Claude, ChatGPT, Microsoft Copilot, Google Gemini, and Le Chat Mistral.

For users who want nothing to do with AI features, a single toggle blocks all AI extensions. Once enabled, no pop-ups or notifications about current or future AI features will appear. The settings persist through updates. Users who want to try the feature early can find it in Firefox Nightly.

OpenAI launches Codex app for macOS to manage multiple AI agents

OpenAI has released the Codex app for macOS, letting developers control multiple AI agents simultaneously and run tasks in parallel. According to OpenAI, it's easier to use than a terminal, making it accessible to more developers. Users can manage agents asynchronously across projects, automate recurring tasks, and connect agents to external tools via "skills." They can also review and correct work without losing context.

The Codex Mac app is available for ChatGPT Plus, Pro, Business, Enterprise, and Edu accounts. OpenAI is also doubling usage limits for paid plans. The app integrates with the CLI, IDE extension, and cloud through a single account. Free and Go users can try it for a limited time. The launch is likely a response to Claude Code's success with knowledge workers and growing demand for agentic systems (see Claude Cowork) that handle more complex tasks than standard chatbots.

Former OpenAI researcher says current AI models can't learn from mistakes, calling it a barrier to AGI

Jerry Tworek, one of the minds behind OpenAI's reasoning models, sees a fundamental problem with current AI: it can't learn from mistakes. "If they fail, you get kind of hopeless pretty quickly," Tworek says in the Unsupervised Learning podcast. "There isn't a very good mechanism for a model to update its beliefs and its internal knowledge based on failure."

The researcher, who worked on OpenAI's reasoning models like o1 and o3, recently left OpenAI to tackle this problem. "Unless we get models that can work themselves through difficulties and get unstuck on solving a problem, I don't think I would call it AGI," he explains, describing AI training as a "fundamentally fragile process." Human learning, by contrast, is robust and self-stabilizing. "Intelligence always finds a way," Tworek says.

Other scientists have described this fragility in detail. Apple researchers recently showed that reasoning models can suffer a "reasoning collapse" when faced with problems outside of the patterns they learned in training.

OpenClaw (formerly Clawdbot) and Moltbook let attackers walk through the front door

How secure are AI agents? Not very, it turns out. OpenClaw’s system prompts can be extracted with a single attempt. Moltbook’s database was publicly accessible—including API keys that could let anyone impersonate users like Andrej Karpathy.