Cisco AI Defense analysis of adaptive multi-turn adversarial weakness in open-weight LLMs

Technical Analysis

First reported

06.11.2025 17:00

Last updated

06.11.2025 17:00

Happening score

H score 22

1 unique sources, 1 articles

Summary

Hide ▲

Cisco AI Defense found that open-weight LLMs remain vulnerable to adaptive multi-turn adversarial attacks, creating a real risk that iterative prompting can bypass safety controls. In testing across 1,000+ prompts per model and 499 simulated conversations, attack styles such as Crescendo, Role-Play, and Refusal Reframe drove success rates above 90% against most defenses. The most critical failure modes were malicious code generation and data exfiltration. The findings point to a need for multi-turn testing, runtime guardrails, and continuous monitoring in production deployments.

Related Happenings

Cisco findings on multi-turn guardrail bypass in major LLMs

Technical Analysis

H score16 First: 27.05.2026 16:00 Last: 27.05.2026 16:00 Sources 1

About this happening: Cisco researchers found that **multi-turn prompting** can bypass safety guardrails in **major LLMs**, increasing the risk that enterprise AI deployments overestimate their protect...

Open Happening

Timeline

06.11.2025 17:00 2 articles · 8mo ago

Cisco AI Defense reports multi-turn adversarial weakness in open-weight LLMs

Technical Analysis Update
Cisco AI Defense reported that open-weight large language models remain highly vulnerable to adaptive multi-turn adversarial attacks, with persistent multi-step conversations surpassing 90% success against most tested defenses despite strong single-turn results. The analysis covered over 1,000 prompts per model and 499 simulated conversations, identified the highest failure rates across 15 sub-threat categories within 102 total threat types, and recommended strict system prompts, model-agnostic runtime guardrails, regular AI red-teaming, and limits on automated external integrations.
Show sources

Multi-Turn Attacks Expose Weaknesses in Open-Weight LLM Models — www.infosecurity-magazine.com — 06.11.2025 17:00

Multi-Turn Attacks Expose Weaknesses in Open-Weight LLM Models — www.infosecurity-magazine.com — 06.11.2025 17:00
Open in new tab

Summary

Related Happenings

Cisco findings on multi-turn guardrail bypass in major LLMs

Timeline

Cisco AI Defense reports multi-turn adversarial weakness in open-weight LLMs