ExploitBench benchmark shows frontier AI models can stage Chrome exploit chains against vulnerable V8 builds
Technical Analysis
Summary
Bugcrowd’s ExploitBench shows that frontier AI models can progress through staged Google Chrome exploit chains, raising the risk of faster AI-assisted exploit development. In head-to-head runs against a vulnerable V8 build, Claude Mythos outscored GPT-5.5: it reached the top tier on 21 of 41 vulnerabilities and averaged 9.90 out of 16, versus an average of 5.51 for GPT-5.5. The benchmark grades five exploitation tiers, up to arbitrary code execution, rather than recording a simple crash/no-crash result, which gives a clearer signal of offensive capability. The findings suggest models are closing the gap with elite human researchers and could shorten the time from flaw discovery to usable exploit.
Related Happenings
Anthropic Project Glasswing expands Claude Mythos Preview access
Security Tool/Service
First: 03.06.2026 12:30
Last: 03.06.2026 12:30
Sources 1
About this happening:
**Anthropic** expanded **Project Glasswing** on **June 2**, extending **Claude Mythos Preview** to **150 additional organizations** and widening a security-focused AI program used...
Google AI Threat Defense launch adds autonomous AI-attack detection and remediation for enterprises
Security Tool/Service
First: 28.05.2026 12:55
Last: 28.05.2026 12:55
Sources 1
About this happening:
Google Cloud launched **Google AI Threat Defense**, an **always-on autonomous** security platform aimed at stopping **AI-powered cyberattacks** across enterprise environments. The...
Microsoft MDASH enters limited private preview for AI-driven vulnerability discovery at scale
Security Tool/Service
First: 13.05.2026 16:46
Last: 13.05.2026 16:46
Sources 1
About this happening:
Microsoft's **MDASH** has entered **limited private preview**, adding a new **AI-driven vulnerability discovery** service that can validate and prove exploitable defects at scale....
Prominent cybercrime threat actors AI-assisted zero-day exploitation campaign
Campaign
First: 11.05.2026 16:00
Last: 11.05.2026 16:00
Sources 1
About this happening:
An **AI-assisted zero-day exploitation campaign** was planned by **prominent cybercrime threat actors**, but the effort was **disrupted before deployment** and did not reach its i...
Google GTIG analysis of adversary AI use for exploit development and attack orchestration
Technical Analysis
First: 11.05.2026 16:00
Last: 11.05.2026 16:00
Sources 1
About this happening:
**Google Threat Intelligence Group** published findings showing **adversaries using AI** for **exploit development** and **attack orchestration**, signaling that model-assisted tr...
Timeline
- 04.06.2026 16:00 · 2 articles
ExploitBench shows Claude Mythos outperforms GPT-5.5 on Chrome exploit chains
Technical Analysis Update
Bugcrowd presented the first findings of ExploitBench at Infosecurity Europe 2026, describing an independent graded benchmark launched in May 2026 with Carnegie Mellon University experts and Chrome vulnerability researchers. The benchmark scores staged exploitation outcomes against a vulnerable V8 build, up to arbitrary code execution. In the presented runs, Anthropic’s Claude Mythos averaged 9.90 out of 16 and reached the highest tier on 21 of 41 vulnerabilities, while OpenAI’s GPT-5.5 averaged 5.51 and reached the top tier in two cases. Brumley said Mythos could exploit a Chrome one-day vulnerability about 50% of the time, while cautioning that the results apply to a sophisticated browser target and should not be generalized to all web applications.
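The reported figures can be cross-checked with quick arithmetic. The Python sketch below is illustrative only: it uses just the numbers quoted above, the names (`reported`, `summarize`) are hypothetical, and it does not reproduce ExploitBench's scoring method, which the presented findings do not detail. It assumes GPT-5.5 was scored on the same 41 vulnerabilities, which the phrase "head-to-head" suggests but the source does not state outright.

```python
# Figures as quoted in the presented ExploitBench runs.
MAX_SCORE = 16     # maximum score per vulnerability, as reported
TOTAL_VULNS = 41   # vulnerabilities in the presented runs (assumed shared by both models)

reported = {
    "Claude Mythos": {"avg_score": 9.90, "top_tier_count": 21},
    "GPT-5.5": {"avg_score": 5.51, "top_tier_count": 2},
}


def summarize(avg_score: float, top_tier_count: int) -> dict:
    """Convert the raw reported figures into comparable fractions."""
    return {
        # Average score as a fraction of the 16-point maximum.
        "avg_fraction_of_max": avg_score / MAX_SCORE,
        # Share of vulnerabilities on which the top tier was reached.
        "top_tier_rate": top_tier_count / TOTAL_VULNS,
    }


summary = {model: summarize(**figures) for model, figures in reported.items()}

for model, stats in summary.items():
    print(
        f"{model}: {stats['avg_fraction_of_max']:.1%} of max score, "
        f"top tier on {stats['top_tier_rate']:.1%} of vulnerabilities"
    )
```

The top-tier rate for Claude Mythos, 21 of 41, works out to just over half, which is consistent with the "about 50% of the time" figure attributed to Brumley.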
Sources
- Infosecurity Europe: Mythos Outperforms GPT5.5 on Google Chrome Vulnerability Exploits, Says New Benchmark — www.infosecurity-magazine.com — 04.06.2026 16:00