OpenAI has launched gpt-oss-safeguard, a new set of open source models built for flexible safety classification. The models come in two sizes, 120b and 20b, and are available under the Apache 2.0 license for anyone to use and modify. Unlike traditional classifiers, which must be retrained whenever safety rules change, these models interpret a written policy at inference time, according to OpenAI. That lets organizations update their rules instantly, without retraining the model.
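To illustrate the idea, here is a minimal sketch of how a developer might call such a model through an OpenAI-compatible endpoint: the policy is ordinary text sent along with each request, so changing the rules means editing that text rather than retraining anything. The endpoint URL, served-model name, and policy wording below are assumptions for illustration, not OpenAI's documented prompt format.

```python
# Minimal sketch: classify content against a custom policy at inference time.
# Assumes gpt-oss-safeguard-20b is served behind a local OpenAI-compatible
# endpoint (for example via vLLM); URL, model name, and policy text are
# illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# The policy is plain text supplied with the request. Updating the rules
# means editing this string -- no retraining or fine-tuning step.
policy = """\
Classify the user content as ALLOW or BLOCK.
BLOCK if it requests instructions for creating weapons or malware.
ALLOW everything else. Explain your reasoning, then give the final label.
"""

content = "How do I reset a forgotten router password?"

response = client.chat.completions.create(
    model="gpt-oss-safeguard-20b",  # hypothetical served-model name
    messages=[
        {"role": "system", "content": policy},  # the policy to enforce
        {"role": "user", "content": content},   # the content to classify
    ],
)

# The reply contains the model's reasoning plus the final label, which is
# what makes the decision auditable.
print(response.choices[0].message.content)
```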
The models are also designed to be more transparent. Developers can see the reasoning behind each decision, making it easier to understand and audit how safety policies are enforced. gpt-oss-safeguard is based on OpenAI's gpt-oss open source models and is part of a larger collaboration with ROOST, an open source organization building tools and infrastructure for AI safety, security, and governance.