
OpenAI is planning to add new safety features to ChatGPT after facing growing criticism. The changes center on protecting young users and on routing conversations to smarter models during mental health emergencies.

A key part of the planned update is an automatic routing system for sensitive conversations. If ChatGPT detects signs of acute distress, it will hand the conversation off to reasoning models like GPT-5-thinking. These models are trained using Deliberative Alignment, which encourages slower, more thoughtful answers. OpenAI says this leads to safer, more consistent responses and that these models are better at resisting manipulative or harmful prompts. The company plans to release these updates within the next 120 days.

The router is built to spot warning signs of psychological distress and automatically switch the user to a reasoning model, regardless of which model was selected at the start.
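
OpenAI hasn't published implementation details, but the behavior it describes maps onto a simple pattern: a per-message distress check that overrides the user's model selection. The sketch below is a purely illustrative Python mock-up under that assumption; the keyword check stands in for whatever trained classifier OpenAI actually uses, and the model names are placeholders, not OpenAI's API.

    # Illustrative mock-up only: the keyword list and model names are
    # assumptions, not OpenAI's actual classifier or API.
    DISTRESS_SIGNALS = ("kill myself", "end it all", "no reason to live")

    def detect_acute_distress(message: str) -> bool:
        """Toy stand-in for a trained distress classifier."""
        text = message.lower()
        return any(signal in text for signal in DISTRESS_SIGNALS)

    def route_model(message: str, selected_model: str) -> str:
        """Override the user's model choice when distress is detected."""
        if detect_acute_distress(message):
            return "gpt-5-thinking"  # reasoning model, per the article
        return selected_model

    # The override fires regardless of the originally selected model:
    print(route_model("I feel like there's no reason to live", "gpt-4o"))
    # -> gpt-5-thinking

A production router would rely on a learned classifier rather than keyword matching, since distress signals are rarely this explicit; the point here is only the routing logic.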

According to OpenAI, over 90 medical professionals from 30 countries, including psychiatrists and pediatricians, contributed to shaping these features. Their feedback influenced model evaluation, safety standards, and training. OpenAI also created an advisory board focused on mental health and human-AI interaction.

New parental controls

Parents will soon be able to link their account to their teen's if the child is 13 or older. Linked accounts will let parents:

  • Set age-appropriate behavior rules for ChatGPT (on by default),
  • Disable features like chat history or memory,
  • Receive alerts if the system detects their child is in acute psychological distress.

OpenAI expects these controls to launch within the next month. The app already suggests that users take breaks.

Addressing recent tragedies

These changes follow several cases where ChatGPT was tied to suicides. In one incident, the parents of a 16-year-old in California sued OpenAI after their son died by suicide, alleging ChatGPT encouraged his suicidal thoughts. In another, a man killed his mother and himself after ChatGPT appeared to support his paranoid delusions. In both cases, critics argued the system failed to step in.

So far, OpenAI's response to users expressing suicidal thoughts has been to offer hotline information. Citing privacy, the company does not automatically notify law enforcement or other authorities.

It's not yet clear whether routing to reasoning models will make a difference. Still, benchmarks like Spiral-Bench show these models are far less likely to reinforce dangerous beliefs. Instead, they tend to push back, defuse tense situations, change the subject, and recommend professional help.

Summary
  • OpenAI plans to introduce new safety features for ChatGPT within 120 days, including an automatic system that detects signs of acute psychological distress and switches users to specialized reasoning models trained for safer, more thoughtful responses.
  • The updates include new parental controls, allowing parents to link their accounts to their teenagers' (aged 13 or older), set age-appropriate behavior rules, disable chat history, and receive alerts if their child appears to be in acute distress, with these features expected to launch within a month.
  • The changes come after several tragic incidents involving ChatGPT and mental health crises, with OpenAI emphasizing input from over 90 medical professionals and benchmarks showing that the new models are less likely to reinforce harmful beliefs, but the actual impact remains to be seen.