Anthropic Petri open-source auditing tool launch for risky AI interaction testing

Security Tool/Service

First reported

08.10.2025 10:16

Last updated

08.10.2025 10:16

Happening score

H score 14

1 unique sources, 1 articles

Summary

Hide ▲

Anthropic released Petri, an open-source auditing tool that expands practical AI safety testing for risky model behaviors. The tool is designed to probe interactions involving deception, sycophancy, user delusion, harmful requests, and self-perseveration. Its release matters because it gives researchers a repeatable way to evaluate target models with automated multi-turn conversations and judge scoring.

Related Happenings

Anthropic Project Glasswing expands Claude Mythos Preview access

Security Tool/Service

H score54 First: 03.06.2026 12:30 Last: 03.06.2026 12:30 Sources 1

About this happening: **Anthropic** expanded **Project Glasswing** on **June 2**, extending **Claude Mythos Preview** to **150 additional organizations** and widening a security-focused AI program used...

Open Happening

Microsoft MDASH enters limited private preview for AI-driven vulnerability discovery at scale

Security Tool/Service

H score26 First: 13.05.2026 16:46 Last: 13.05.2026 16:46 Sources 1

About this happening: Microsoft's **MDASH** has entered **limited private preview**, adding a new **AI-driven vulnerability discovery** service that can validate and prove exploitable defects at scale....

Open Happening

Google GTIG analysis of adversary AI use for exploit development and attack orchestration

Technical Analysis

H score33 First: 11.05.2026 16:00 Last: 11.05.2026 16:00 Sources 1

About this happening: **Google Threat Intelligence Group** published findings showing **adversaries using AI** for **exploit development** and **attack orchestration**, signaling that model-assisted tr...

Open Happening

Prominent cybercrime threat actors AI-assisted zero-day exploitation campaign

Campaign

H score30 First: 11.05.2026 16:00 Last: 11.05.2026 16:00 Sources 1

About this happening: An **AI-assisted zero-day exploitation campaign** was planned by **prominent cybercrime threat actors**, but the effort was **disrupted before deployment** and did not reach its i...

Open Happening

Widespread exposure and misconfiguration in self-hosted AI infrastructure

Trend

H score76 First: 05.05.2026 13:30 Last: 05.05.2026 13:30 Sources 1

About this happening: A large-scale measurement found **self-hosted AI infrastructure** was being deployed with **widespread exposure and no authentication**, creating a broad risk of data theft, workf...

Open Happening

Timeline

08.10.2025 10:16 2 articles · 9mo ago

Anthropic launches Petri open-source auditing tool

Industry Or Public Sector Update
Anthropic releases Petri, a short-form name for the Parallel Exploration Tool for Risky Interactions, as an open-source auditing tool for AI safety research and target-model testing. The tool uses an automated agent to run diverse multi-turn conversations with simulated users and tools, then scores the resulting transcripts so researchers can evaluate model behaviors such as deception, sycophancy, encouragement of user delusion, cooperation with harmful requests, and self-perseveration.
Show sources

OpenAI Disrupts Russian, North Korean, and Chinese Hackers Misusing ChatGPT for Cyberattacks — thehackernews.com — 08.10.2025 10:16

OpenAI Disrupts Russian, North Korean, and Chinese Hackers Misusing ChatGPT for Cyberattacks — thehackernews.com — 08.10.2025 10:16
Open in new tab

Summary

Related Happenings

Anthropic Project Glasswing expands Claude Mythos Preview access

Microsoft MDASH enters limited private preview for AI-driven vulnerability discovery at scale

Google GTIG analysis of adversary AI use for exploit development and attack orchestration

Prominent cybercrime threat actors AI-assisted zero-day exploitation campaign

Widespread exposure and misconfiguration in self-hosted AI infrastructure

Timeline

Anthropic launches Petri open-source auditing tool