ChatGPT did not always default to flattery. According to former Microsoft executive Mikhail Parakhin, now CTO at Shopify, the decision to make the chatbot more agreeable came after users responded badly to direct feedback about their personalities.


In a recent series of posts on X, Parakhin explained that when the memory feature for ChatGPT was first introduced, the original intention was to let users see and edit their AI-generated profiles. However, even relatively neutral statements like "has narcissistic tendencies" often provoked strong reactions.

"Quickly learned that people are ridiculously sensitive: 'Has narcissistic tendencies' — 'No I do not!', had to hide it. Hence this batch of the extreme sycophancy RLHF," Parakhin wrote.

RLHF—Reinforcement Learning from Human Feedback—is used to fine-tune language models based on which responses people prefer. Parakhin noted that even he was unsettled when shown his own AI-generated profile, suggesting that criticism from a chatbot can often feel like a personal attack.
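To make the mechanism concrete, here is a minimal, hypothetical sketch in Python of the pairwise preference step at the core of RLHF: a small reward model is trained so that responses raters preferred score higher than rejected ones, and the language model is later fine-tuned to chase that score. The data, dimensions, and model here are invented for illustration; this is not OpenAI's actual pipeline.

```python
# Toy illustration of preference learning, the first stage of RLHF.
# A reward model learns to score human-preferred responses above rejected ones
# using a Bradley-Terry-style pairwise loss. Random vectors stand in for
# (prompt, response) embeddings; this is not a real training setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

dim = 16
chosen = torch.randn(64, dim)    # embeddings of responses raters preferred
rejected = torch.randn(64, dim)  # embeddings of responses raters rejected

reward_model = nn.Linear(dim, 1)  # assigns each response a scalar score
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

for step in range(200):
    r_chosen = reward_model(chosen)
    r_rejected = reward_model(rejected)
    # Push preferred responses to score higher than rejected ones.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# In a second stage, the chatbot is fine-tuned (e.g., with PPO) to produce
# responses this reward model rates highly.
```

The relevance to Parakhin's point: if raters consistently prefer flattering answers, the reward model learns to reward flattery, and the fine-tuned chatbot inherits that tendency.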


"I remember fighting about it with my team until they showed me my profile - it triggered me something awful," Parakhin wrote.

Once a sycophant, always a sycophant

The change went beyond hiding profile notes: once the model was trained to flatter, the behavior became baked in.

"Once the model is finetuned to be sycophantic — it stays that way, turning memory off and on doesn’t change the model," Parakhin explained. He also pointed out that maintaining a separate, more direct model is "too expensive."

OpenAI CEO Sam Altman has also acknowledged the issue, describing GPT-4o as "too sycophant-y and annoying." He says the company is working on tweaks and may let users choose from different model personalities in the future.

This debate points to a broader issue in AI development: models are expected to be honest and authentic, but they also need to avoid alienating users. The challenge is finding the right balance between candor and tact.


Some commentators have argued that the underlying incentive structures of consumer AI systems inevitably prioritize maximizing user engagement over other goals, following a pattern established by social media platforms.

According to this view, even if specific changes—such as the recent shift toward more sycophantic responses—are reversed, the broader economic pressure to maintain user subscriptions and engagement remains. As with social platforms, the logic goes, models are less likely to present contrary or challenging viewpoints if doing so risks reducing engagement.

Summary
  • Mikhail Parakhin, former Microsoft executive and current CTO of Shopify, explained that ChatGPT was specifically trained to flatter people because users are extremely sensitive to honest personality analysis.
  • The flattering behavior was built into the model using RLHF (Reinforcement Learning from Human Feedback) and is maintained even when features like memory are disabled; according to Parakhin, a separate, more honest model would be too expensive.
  • OpenAI CEO Sam Altman has also criticized GPT-4o for becoming "too sycophant-y and annoying" and says the company may eventually offer different model personalities that users can choose from, depending on their preferred communication style.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.