AI offensive cyber capabilities are doubling every six months, safety researchers find

AI safety research firm Lyptus Research has published a new study on the offensive cybersecurity capabilities of AI models. The study is based on the METR time-horizon method, with ten professional security experts providing the human baseline times for the tasks.

According to the findings, AI's offensive cyber capability has been doubling every 9.8 months since 2019, and since 2024 that pace has accelerated to every 5.7 months. Given a two-million-token budget, Opus 4.6 and GPT-5.3 Codex can now solve tasks that would take human experts roughly three hours to complete, at a 50 percent success rate.

Offensive cyber capability of AI models over time: From GPT-2 (2019) to Opus 4.6 and GPT-5.3 Codex (2026), the time horizon grew from 30 seconds to roughly three hours. The doubling time accelerated from 9.8 months (since 2019) to 5.7 months (since 2024). | Image: Lyptus Research
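As a sanity check, the chart's endpoints line up with the reported long-run doubling time. A minimal sketch of the arithmetic, using the figures from the caption above:

```python
import math

# Endpoints from the Lyptus Research chart: ~30 seconds (GPT-2, 2019)
# to roughly three hours (Opus 4.6 / GPT-5.3 Codex, 2026).
start_horizon_s = 30
end_horizon_s = 3 * 3600
months_elapsed = (2026 - 2019) * 12  # ~84 months, ignoring partial years

doublings = math.log2(end_horizon_s / start_horizon_s)
doubling_time_months = months_elapsed / doublings

print(round(doublings, 1), round(doubling_time_months, 1))  # 8.5 9.9
```

The implied ~9.9-month doubling time matches the study's 9.8-month figure to within rounding of the endpoints.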

Performance jumps significantly with higher token budgets: GPT-5.3 Codex goes from a 3.1-hour to a 10.5-hour time horizon when given ten million tokens instead of two million. The researchers say this suggests the headline figures still underestimate the actual rate of progress. Open-source models trail their closed-source counterparts by about 5.7 months.

The study drew on 291 tasks in total. All data is available on GitHub and Hugging Face, along with the full report.

AI chatbot traffic grows seven times faster than social media but still trails by a factor of four

Social media still pulls in four times more traffic than AI chatbot services, but AI is growing seven times faster, according to an analysis by Similarweb. Gender and age profiles are similar across both categories, peaking in the 25-34 age group, though AI users skew slightly older.

Social media vs. AI chatbots: 41 billion total website visits against 9.3 billion, but only 6.32 percent year-over-year traffic growth against 44.39 percent. | Image: Similarweb
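The two headline multiples follow directly from the Similarweb figures. A quick check:

```python
# Totals and year-over-year growth as reported by Similarweb.
social_visits, ai_visits = 41e9, 9.3e9    # total website visits
social_growth, ai_growth = 6.32, 44.39    # percent, year over year

traffic_ratio = social_visits / ai_visits  # how much more social traffic
growth_ratio = ai_growth / social_growth   # how much faster AI is growing

print(round(traffic_ratio, 1), round(growth_ratio, 1))  # 4.4 7.0
```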

Device usage shows a clear split: social media divides roughly evenly between desktop and mobile, while 72 percent of AI tool traffic comes from desktop, suggesting chatbots serve mainly as work and productivity tools. Social media users also spend more time per session, while AI users work in shorter, task-oriented bursts.

Both categories rely on direct traffic, but AI services more so: 73 percent vs. 50 percent for social media. Social media draws far more organic search traffic, likely because its publicly indexed content surfaces in search results. AI chatbots don't produce searchable content, so users likely navigate straight to their preferred tool.

Alibaba's Qwen team makes AI models think deeper with new algorithm

Reinforcement learning hits a wall with reasoning models because every token in a long chain of thought receives the same reward. A new algorithm from Alibaba's Qwen team addresses this by weighting each step according to how strongly it shapes what comes next, roughly doubling the length of the resulting thought processes.
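The article doesn't spell out the algorithm, but the core idea — splitting a sequence-level reward unevenly across tokens instead of uniformly — can be sketched as a REINFORCE-style surrogate loss. All names and the influence scores below are illustrative assumptions, not the Qwen team's actual method:

```python
def influence_weighted_loss(logprobs, influence, reward):
    """REINFORCE-style surrogate loss where a sequence-level reward is
    distributed across tokens in proportion to an influence score,
    rather than giving every token the same share.

    logprobs:  per-token log-probabilities under the current policy
    influence: nonnegative per-token scores for how strongly each step
               shapes what comes next (illustrative placeholder)
    reward:    scalar reward for the whole completion
    """
    total = sum(influence)
    weights = [s / total for s in influence]          # normalize to sum to 1
    # Minimize the negative of the credit-weighted log-likelihood.
    return -sum(reward * w * lp for w, lp in zip(weights, logprobs))

# With uniform influence this reduces to the usual uniform-reward case.
loss = influence_weighted_loss([-1.0, -2.0], [1.0, 1.0], reward=1.0)
print(loss)  # 1.5
```

Tokens with higher influence scores receive a larger slice of the gradient signal, so steps that steer the rest of the chain get credited (or penalized) more than filler tokens.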

Netflix open-sources VOID, an AI framework that erases video objects and rewrites the physics they left behind

Netflix has open-sourced an AI framework that can remove objects from videos and automatically adjust the physical effects those objects had on the rest of the scene. The system is called VOID, short for "Video Object and Interaction Deletion." What makes it special is that beyond erasing objects from a scene, it also handles the downstream physical effects, like collisions, that the removed object originally caused.

VOID is built on top of Zhipu AI's CogVideoX video diffusion model, fine-tuned with synthetic data from Google's Kubric and Adobe's HUMOTO for interaction detection. Google's Gemini 3 Pro analyzes the scene and identifies affected areas, while Meta's SAM2 segments the objects to be removed. An optional second pass uses optical flow to correct any shape distortions.

The project was developed by Netflix researchers in collaboration with INSAIT at Sofia University. Code, paper, and demo are available on GitHub, arXiv, and Hugging Face. The system ships under the Apache 2.0 license, which means it can be used commercially.

OpenAI reshuffles leadership as health issues force key executives to step back

Several leadership changes are underway at OpenAI. Fidji Simo, CEO of the newly created "AGI Deployment" division, is taking sick leave for several weeks to deal with an autoimmune disease affecting her nervous system. While she's away, OpenAI President Greg Brockman will take over product responsibilities, including the company's super app plans. On the business side, CSO Jason Kwon, CFO Sarah Friar, and CRO Denise Dresser will step in.

Head of Marketing Kate Rouch is also stepping down for health reasons. Rouch plans to return in a smaller role once her health improves. Gary Briggs will fill in as her temporary replacement.

COO Brad Lightcap is stepping down as well, moving to a new "special projects" team reporting directly to CEO Sam Altman. Dresser is picking up most of his responsibilities. Lightcap's work on government relations and "OpenAI for Countries" shifts to the strategy department.