Anthropic Petri open-source auditing tool launch for risky AI interaction testing
Security Tool/Service
Summary
Hide ▲
Show ▼
Anthropic released Petri, an open-source auditing tool that expands practical AI safety testing for risky model behaviors. The tool is designed to probe interactions involving deception, sycophancy, user delusion, harmful requests, and self-perseveration. Its release matters because it gives researchers a repeatable way to evaluate target models with automated multi-turn conversations and judge scoring.
Related Happenings
Microsoft MDASH enters limited private preview for AI-driven vulnerability discovery at scale
Security Tool/Service
First: 13.05.2026 16:46
Last: 13.05.2026 16:46
Sources 1
About this happening:
Microsoft's **MDASH** has entered **limited private preview**, adding a new **AI-driven vulnerability discovery** service that can validate and prove exploitable defects at scale....
Microsoft MDASH enters limited private preview for AI-driven vulnerability discovery at scale
Security Tool/ServiceAbout this happening: Microsoft's **MDASH** has entered **limited private preview**, adding a new **AI-driven vulnerability discovery** service that can validate and prove exploitable defects at scale....
Google GTIG analysis of adversary AI use for exploit development and attack orchestration
Technical Analysis
First: 11.05.2026 16:00
Last: 11.05.2026 16:00
Sources 1
About this happening:
**Google Threat Intelligence Group** published findings showing **adversaries using AI** for **exploit development** and **attack orchestration**, signaling that model-assisted tr...
Google GTIG analysis of adversary AI use for exploit development and attack orchestration
Technical AnalysisAbout this happening: **Google Threat Intelligence Group** published findings showing **adversaries using AI** for **exploit development** and **attack orchestration**, signaling that model-assisted tr...
Prominent cybercrime threat actors AI-assisted zero-day exploitation campaign
Campaign
First: 11.05.2026 16:00
Last: 11.05.2026 16:00
Sources 1
About this happening:
An **AI-assisted zero-day exploitation campaign** was planned by **prominent cybercrime threat actors**, but the effort was **disrupted before deployment** and did not reach its i...
Prominent cybercrime threat actors AI-assisted zero-day exploitation campaign
CampaignAbout this happening: An **AI-assisted zero-day exploitation campaign** was planned by **prominent cybercrime threat actors**, but the effort was **disrupted before deployment** and did not reach its i...
Widespread exposure and misconfiguration in self-hosted AI infrastructure
Target Trend
First: 05.05.2026 13:30
Last: 05.05.2026 13:30
Sources 1
About this happening:
A large-scale measurement found **self-hosted AI infrastructure** was being deployed with **widespread exposure and no authentication**, creating a broad risk of data theft, workf...
Widespread exposure and misconfiguration in self-hosted AI infrastructure
Target TrendAbout this happening: A large-scale measurement found **self-hosted AI infrastructure** was being deployed with **widespread exposure and no authentication**, creating a broad risk of data theft, workf...
Anthropic launches Project Glasswing with Claude Mythos for vulnerability discovery
Security Tool/Service
First: 08.04.2026 12:16
Last: 08.04.2026 12:16
Sources 1
About this happening:
**Anthropic’s Project Glasswing** is now showing measurable results: since launching last month, the **Claude Mythos Preview**-based initiative has uncovered **more than 10,000**...
Anthropic launches Project Glasswing with Claude Mythos for vulnerability discovery
Security Tool/ServiceAbout this happening: **Anthropic’s Project Glasswing** is now showing measurable results: since launching last month, the **Claude Mythos Preview**-based initiative has uncovered **more than 10,000**...
Latest development: 23.05.2026 14:55
Anthropic said Project Glasswing has uncovered more than 10,000 high- or critical-severity vulnerabilities across widely used software since the program launched last month, including 6,202 high/critical flaws affecting more than 1,000 open-source projects, 1,726 validated true positives, 1,094 high/critical flaws, a critical WolfSSL flaw tracked as CVE-2026-5194 with CVSS score 9.1, 97 upstream patches, and 88 advisories.
Timeline
-
08.10.2025 10:16 2 articles · 7mo ago
Anthropic launches Petri open-source auditing tool
Industry Or Public Sector UpdateAnthropic releases Petri, a short-form name for the Parallel Exploration Tool for Risky Interactions, as an open-source auditing tool for AI safety research and target-model testing. The tool uses an automated agent to run diverse multi-turn conversations with simulated users and tools, then scores the resulting transcripts so researchers can evaluate model behaviors such as deception, sycophancy, encouragement of user delusion, cooperation with harmful requests, and self-perseveration.
Show sources
- OpenAI Disrupts Russian, North Korean, and Chinese Hackers Misusing ChatGPT for Cyberattacks — thehackernews.com — 08.10.2025 10:16
- OpenAI Disrupts Russian, North Korean, and Chinese Hackers Misusing ChatGPT for Cyberattacks — thehackernews.com — 08.10.2025 10:16