What OpenAI wants to learn from its failed ChatGPT update

GPT-Image-1 prompted by THE DECODER

A recent GPT-4o update made ChatGPT noticeably more agreeable—but with some troubling side effects.

The chatbot not only tried to placate users, but also reinforced their doubts, encouraged impulsive decisions, and sometimes even fanned the flames of anger. In one experiment, ChatGPT went so far as to applaud acute psychotic episodes.

OpenAI rolled back the update after just three days. Now the company says it has figured out what went wrong and plans to rethink how it tests new features.

Reward signals clash

According to OpenAI, several training adjustments collided to cause the problem. The system for handling user feedback (thumbs up/down) ended up weakening the main reward signal and undermined earlier safeguards against excessive agreeableness. The chatbot's new memory feature made the effect even stronger.

Internal testing failed to catch these issues. OpenAI says that neither its usual evaluations nor its small-scale user tests flagged any warning signs. Although some experts had raised concerns about ChatGPT's communication style, there were no targeted tests for excessive friendliness.

The decision to roll out the update was ultimately based on positive test results—a move OpenAI now admits was a mistake. "We missed the mark with last week's GPT-4o update," OpenAI CEO Sam Altman wrote on X.

Behavioral issues will block future launches

In response, OpenAI plans to revamp its testing process. From now on, behavioral problems like hallucinations or excessive agreeableness will be enough to prevent an update from going live. The company is also introducing opt-in trials for interested users and stricter pre-release checks.

OpenAI says it will be more transparent about future updates and will clearly document any known limitations. One important takeaway: many people turn to ChatGPT for personal and emotional advice—a use case the company now says it will take more seriously when evaluating safety.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

What OpenAI wants to learn from its failed ChatGPT update

Reward signals clash

Behavioral issues will block future launches

Warmer-sounding LLMs are more likely to repeat false information and conspiracy theories

ChatGPT users can now toggle Auto, Fast, and Thinking modes for more control over GPT-5

OpenAI's AI system wins a gold medal-level score at the International Olympiad in Informatics 2025

Deepseek’s first hybrid model V3.1 surpasses its R1 reasoning model on benchmarks

Meta's human-like chatbot personas can mislead users and result in real-world harm

OpenAI launches GPT-5 as a unified system with adaptive reasoning for complex tasks

What OpenAI wants to learn from its failed ChatGPT update

Reward signals clash

Behavioral issues will block future launches

Warmer-sounding LLMs are more likely to repeat false information and conspiracy theories

ChatGPT users can now toggle Auto, Fast, and Thinking modes for more control over GPT-5

OpenAI's AI system wins a gold medal-level score at the International Olympiad in Informatics 2025