OpenAI establishes a new "Preparedness Team" to guard against potentially catastrophic risks from frontier AI models.
In a new blog post, OpenAI acknowledges the potential benefits of frontier AI models, which could far exceed the capabilities of existing systems. But it also recognizes the serious risks these models pose.
OpenAI's stated goal is to develop AGI: at a minimum, a machine with human-level intelligence that can rapidly acquire new knowledge and generalize across domains.
To address these risks, OpenAI aims to answer questions such as:
- How dangerous is the misuse of frontier AI systems today and in the future?
- How can we develop a robust framework for monitoring, evaluating, predicting, and protecting against the dangerous capabilities of frontier AI systems?
- If frontier AI model weights were stolen, how could malicious actors use them?
The answers to these questions could help ensure the safety of advanced AI systems. The new team comes after OpenAI, along with other leading labs, made voluntary commitments to advance safety and trust in AI through the industry organization Frontier Model Forum.
The announcement of the new safety team precedes the first AI Safety Summit, to be held in the UK in early November.
OpenAI's new Preparedness Team
The Preparedness Team, led by Aleksander Madry, will focus on capability assessment, evaluation, and internal red teaming of frontier models.
Its mission spans multiple risk categories, including individualized persuasion, cybersecurity, chemical, biological, radiological, and nuclear (CBRN) threats, and autonomous replication and adaptation (ARA).
The team will also work to develop and maintain a Risk-Informed Development Policy (RDP) to establish a governance structure for accountability and oversight throughout the development process.
The RDP is intended to complement and extend OpenAI's existing work on risk mitigation and to contribute to the safety and alignment of new highly capable systems before and after deployment.
In addition to building the team, OpenAI is launching an AI Preparedness Challenge focused on preventing catastrophic misuse. The challenge will award $25,000 in API credits to each of up to ten top submissions, and OpenAI will publish innovative ideas and contributions from the entries. The lab will also recruit candidates for the Preparedness Team from among the top challenge applicants.