Several key AI safety researchers, including team leaders, have left OpenAI in recent months amid a reported loss of trust in CEO Sam Altman, according to sources cited by Vox.
Since November, at least seven employees working on AI safety in the context of a potential AGI have departed OpenAI. Among them are former Chief Scientist Ilya Sutskever and Jan Leike, who co-led the company's Superalignment Team, a group tasked with ensuring a future superintelligence remains aligned with human goals.
Safety researchers Cullen O'Keefe, Daniel Kokotajlo, and William Saunders have also resigned, while Superalignment Team members Leopold Aschenbrenner and Pavel Izmailov were dismissed over alleged leaks, Vox reports.
Leike has publicly criticized OpenAI for not taking AGI safety seriously enough, a notable move given that former employees are generally barred from speaking out by restrictive confidentiality agreements. According to Vox's insider sources, these agreements threaten departing employees with the loss of potentially millions of dollars in shares if they violate the terms.
OpenAI does not dispute the existence of these clauses, but says no shares have been canceled for current or former employees who declined to sign a confidentiality or non-disclosure agreement upon leaving the company.
It's unclear whether and how Leike is affected by these clauses, and whether he took a personal risk by going public with his criticism. He will be replaced as head of the Superalignment Team by OpenAI co-founder John Schulman.
Alleged loss of trust in OpenAI CEO Sam Altman among AI safety researchers
AI safety researcher Daniel Kokotajlo, who did not sign the restrictive exit documents when he left, voiced criticism similar to Leike's in comments to Vox. Because he signed confidentiality and non-disclosure agreements when he was hired, he is still not allowed to say everything.
Kokotajlo left OpenAI in April 2024, saying he gave up a large sum of money, "at least" 85 percent of his family's net worth, to avoid signing a non-disparagement agreement.
"I joined with substantial hope that OpenAI would rise to the occasion and behave more responsibly as they got closer to achieving AGI. It slowly became clear to many of us that this would not happen. I gradually lost trust in OpenAI leadership and their ability to responsibly handle AGI, so I quit."
Daniel Kokotajlo via Vox
Another insider described "trust collapsing bit by bit, like dominoes falling one by one" when it comes to AI safety at OpenAI, and accused Altman of being manipulative and of behaving in ways that contradict his public statements on AI safety.
Altman's November dismissal by the board, which said he had not been "consistently candid," was allegedly influenced by debates over AI safety. However, an investigation by a law firm brought in after Altman's return exonerated him and deemed the dismissal unjustified.
OpenAI staff largely backed Altman after his dismissal, accelerating his return and leaving the board with little choice but to relent or risk the company's collapse or a mass migration to Microsoft. That broad support suggests there was no widespread dissatisfaction with Altman's leadership.
If OpenAI is indeed failing to address AGI safety research adequately, it could be due to pressure from competitors to prioritize speed over safety or a shift in management's view on the feasibility of near-term AGI development.
Altman has recently communicated more cautiously on AGI, stating that scaling existing systems is insufficient and that new technologies are needed. In mid-January, he said AGI was conceivable in the "reasonably close-ish future."