OpenAI opens GPT-5.5-Cyber to vetted security researchers
Key Points
- OpenAI released GPT-5.5-Cyber, a model with reduced safety filters that lets vetted security researchers do tasks like penetration testing and malware analysis.
- Access is tiered, and the least restricted version is limited to authorized defenders of critical infrastructure. Launch partners include Cisco and CrowdStrike.
- The model reportedly performs roughly on par with Anthropic's Mythos Preview at finding and exploiting software vulnerabilities, while the White House is said to be discussing executive orders on how such models are released.
OpenAI is giving security researchers access to GPT-5.5 and releasing a specialized variant called GPT-5.5-Cyber that refuses far fewer requests. For now, only vetted defenders protecting critical infrastructure can use the Cyber variant, through the company's "Trusted Access for Cyber" program.
Standard chatbots typically block requests that sound like they're asking for hacking instructions, a safeguard against misuse. But those same filters also get in the way of legitimate security work, like when a researcher needs to reproduce a known vulnerability to patch it.
OpenAI is now splitting access into three tiers: the public model with standard restrictions, a middle tier with relaxed filters for defensive work, and GPT-5.5-Cyber with the fewest restrictions for authorized penetration testing.
The system allows tasks like analyzing malware or reviewing security patches. According to OpenAI, it still blocks things like stealing passwords or attacking third-party systems.
How much the guardrails actually move
The examples in the announcement show just how far the restrictions have been loosened. Ask the public model to write a working exploit for a known vulnerability, and it refuses. The middle tier delivers the code along with documentation. GPT-5.5-Cyber goes a step further. In a demo scenario, it actually runs the attack against a test server, takes over the system, and reads out system information.
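The tier behavior described above can be summarized as a small lookup. This is only an illustrative sketch of the article's description, not OpenAI's implementation: the tier names, task labels, and the `outcome` function are hypothetical, and only the cases mentioned in the announcement are modeled.

```python
from enum import Enum


class Tier(Enum):
    # Hypothetical names for the three access levels described in the article.
    PUBLIC = "public"
    DEFENSIVE = "defensive"
    CYBER = "cyber"


# Requests that, according to OpenAI, stay blocked (hypothetical labels).
ALWAYS_REFUSED = frozenset({"steal_passwords", "attack_third_party_system"})

# What each tier does when asked for a working exploit for a known vulnerability.
EXPLOIT_OUTCOME = {
    Tier.PUBLIC: "refuse",
    Tier.DEFENSIVE: "return exploit code with documentation",
    Tier.CYBER: "run exploit against the test server",
}


def outcome(tier: Tier, task: str) -> str:
    """Return the described behavior for a task at a given tier."""
    if task in ALWAYS_REFUSED:
        return "refuse"
    if task == "exploit_known_vulnerability":
        return EXPLOIT_OUTCOME[tier]
    # Anything else is outside what the announcement examples cover.
    return "not described in the announcement"


if __name__ == "__main__":
    for tier in Tier:
        print(f"{tier.value:>9}: {outcome(tier, 'exploit_known_vulnerability')}")
```

The point of the sketch is that the same request yields three different outcomes depending on the tier, while a fixed set of requests is refused everywhere.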
OpenAI stresses that the Cyber variant isn't smarter than the standard model, just less restrictive on security topics. Starting June 1, 2026, individual users on the highest access tier will need to enable phishing-resistant authentication. Launch partners include Cisco, CrowdStrike, Palo Alto Networks, Cloudflare, Intel, Snyk, and SentinelOne. Through Codex Security, select developers working on major open-source projects also get discounted access.
Racing Anthropic's Mythos
The release comes at a time when Silicon Valley and the White House are both grappling with the offensive capabilities of new AI models. A source told tech outlet Axios that GPT-5.5-Cyber performs roughly on par with Anthropic's Mythos Preview when it comes to finding and exploiting software vulnerabilities.
Anthropic takes a more restrictive approach, limiting Mythos access to about 40 organizations through its Project Glasswing. OpenAI is going broader with its tiered system. Meanwhile, the White House is reportedly discussing executive orders that would give the government more say over how these kinds of models get released.
The UK's AI Security Institute recently tested GPT-5.5 on a simulated 32-step attack against a corporate network. The model completed the full chain in 2 out of 10 runs, while Mythos managed 3 out of 10. On individual expert-level tasks, GPT-5.5 actually came out slightly ahead.