OpenAI has reportedly compressed safety testing for its latest AI model, GPT-4 Omni, to just one week.
The Washington Post reports that some employees criticized the company for prioritizing speed over thoroughness. According to three sources familiar with the matter, members of the safety team felt pressured to accelerate the new catastrophic risk testing protocol to meet the May launch date set by leadership. "We basically failed at the process," one anonymous source said.
OpenAI had even invited employees to a launch celebration party before testing began. "They planned the launch after-party prior to knowing if it was safe to launch," an insider revealed.
OpenAI spokeswoman Lindsey Held stated the company "didn’t cut corners on our safety process, though we recognize the launch was stressful for our teams." She added that OpenAI conducted "extensive internal and external" testing to meet regulatory obligations.
An unnamed representative of the preparedness team acknowledged that all of the testing was completed, but on a compressed timeline. OpenAI is now "rethinking our whole way of doing it," the representative said, adding that the approach taken for Omni was "just not the best way to do it."
Is OpenAI reckless, or is AI safety overhyped?
OpenAI's rush is also evident in the delayed release of GPT-4 Omni's voice functionality, now slated for fall due to ongoing safety tests. This follows confusing communications that led many users to expect voice capabilities immediately upon Omni's launch.
Several high-ranking safety researchers have also left OpenAI recently, some openly criticizing the company's safety practices. William Saunders, who departed in February 2024, said in a podcast that OpenAI had become more of a product company, adding he "didn't want to end up working on the Titanic of AI."
The big picture points to one of two conclusions: either OpenAI is acting recklessly and negligently, accepting societal risks for the sake of commercial success.
Or management believes that safety concerns about today's generative AI are exaggerated, that it is entirely unclear whether or when AGI will emerge, and that AI safety is therefore overrated and serves primarily as a marketing tool.
A past example shows that this marketing approach can work. In 2019, OpenAI declared its GPT-2 model too dangerous to release publicly, which brought the company massive attention. Just a few months later, two students replicated GPT-2's performance. Compared to today's freely available models, GPT-2's capabilities look laughably modest.