Date: 2025-10-05
AI in practice
Matthias Bastian

Reasoning models like Claude Sonnet 4.5 are getting better at spotting security flaws

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.

Anthropic sees growing potential for language models in cybersecurity. The company cites results from the CyberGym leaderboard: Claude Sonnet 4 uncovers new software vulnerabilities about 2 percent of the time, while Sonnet 4.5 increases that rate to 5 percent. In repeated tests, Sonnet 4.5 finds new vulnerabilities in more than a third of projects.

Image: Anthropic

In a recent DARPA AI Cyber Challenge, Anthropic notes that teams used large language models like Claude "to build 'cyber reasoning systems' that examined millions of lines of code for vulnerabilities to patch." Anthropic calls this a possible "inflection point for AI’s impact on cybersecurity."

Sources
Anthropic