Anthropic tests its "next-generation system for AI safety mitigations"

Anthropic is expanding its bug bounty program to test its "next-generation system for AI safety mitigations." The program focuses on identifying and defending against "universal jailbreak attacks." Anthropic is prioritizing critical vulnerabilities in high-risk areas such as chemical, biological, radiological, and nuclear (CBRN) threats and cybersecurity. Participants get early access to Anthropic's latest safety system before its public release, and their task is to find vulnerabilities or ways to bypass its safety measures. Anthropic is offering rewards of up to $15,000 for newly discovered universal jailbreak attacks.
