OpenAI has reportedly compressed safety testing for its latest AI model, GPT-4 Omni, to just one week.
The Washington Post reports that some employees criticized the company for prioritizing speed over thoroughness. According to three sources familiar with the matter, members of the safety team felt pressured to accelerate the new catastrophic risk testing protocol to meet the May launch date set by leadership. "We basically failed at the process," one anonymous source said.
OpenAI had even invited employees to a launch celebration party before testing began. "They planned the launch after-party prior to knowing if it was safe to launch," an insider revealed.
OpenAI spokeswoman Lindsey Held stated the company "didn’t cut corners on our safety process, though we recognize the launch was stressful for our teams." She added that OpenAI conducted "extensive internal and external" testing to meet regulatory obligations.
An unnamed representative of the preparedness team acknowledged that all of the testing was completed, but on a compressed timeline. OpenAI is now "rethinking our whole way of doing it," the representative said, adding that the approach taken for Omni was "just not the best way to do it."
Is OpenAI reckless, or is AI safety overhyped?
OpenAI's rush is also evident in the delayed release of GPT-4 Omni's voice functionality, now slated for fall due to ongoing safety tests. This follows confusing communications that led many users to expect voice capabilities immediately upon Omni's launch.
Several high-ranking safety researchers have also left OpenAI recently, some openly criticizing the company's safety practices. William Saunders, who departed in February 2024, said in a podcast that OpenAI had become more of a product company, adding he "didn't want to end up working on the Titanic of AI."
The big picture points to one of two conclusions: either OpenAI is acting recklessly and negligently, accepting societal risks for the sake of commercial success.
Or management believes that safety concerns about today's generative AI are exaggerated, that it is entirely unclear whether or when AGI will emerge, and that AI safety is therefore overrated and serves primarily as a marketing tool.
A past example shows that this marketing approach can work. In 2019, OpenAI declared its GPT-2 model too dangerous to release publicly, which brought the company massive attention. Just a few months later, two students replicated GPT-2's performance. Compared to today's freely available models, GPT-2's capabilities look laughably modest.