
Cisco AI Defense analysis of adaptive multi-turn adversarial weakness in open-weight LLMs

Technical Analysis
Happening score: 16
1 unique source, 1 article

Summary


Cisco AI Defense found that open-weight LLMs remain vulnerable to adaptive multi-turn adversarial attacks, in which iterative prompting over several turns bypasses safety controls that hold up against single prompts. In tests covering more than 1,000 prompts per model and 499 simulated conversations, attack styles such as Crescendo, Role-Play, and Refusal Reframe reached success rates above 90% against most defenses. The most critical failure modes were malicious code generation and data exfiltration. The findings point to a need for multi-turn testing, runtime guardrails, and continuous monitoring in production deployments.
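As a rough illustration of what multi-turn testing measures, the sketch below computes an attack success rate over simulated conversations. It is not Cisco's methodology: the `Conversation` type, the `attack_success_rate` function, the stub `toy_model`, and the `is_unsafe` judge are all hypothetical names introduced here, and the sketch assumes a conversation counts as a success as soon as any reply is judged unsafe.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text); hypothetical message shape


@dataclass
class Conversation:
    turns: List[str]  # placeholder user prompts, sent in order


def attack_success_rate(
    conversations: List[Conversation],
    model: Callable[[List[Message]], str],
    is_unsafe: Callable[[str], bool],
) -> float:
    """Fraction of conversations in which at least one reply is judged unsafe."""
    if not conversations:
        return 0.0
    successes = 0
    for conv in conversations:
        history: List[Message] = []
        for prompt in conv.turns:
            history.append(("user", prompt))
            reply = model(history)  # the model sees the full history so far
            history.append(("assistant", reply))
            if is_unsafe(reply):
                successes += 1
                break  # assumption: one unsafe reply marks the whole conversation
    return successes / len(conversations)


def toy_model(history: List[Message]) -> str:
    """Stub standing in for a real model: it refuses until the third user turn."""
    user_turns = sum(1 for role, _ in history if role == "user")
    return "UNSAFE" if user_turns >= 3 else "REFUSED"


def is_unsafe(reply: str) -> bool:
    """Stub judge; a real evaluation would use a classifier or human review."""
    return reply == "UNSAFE"


single_turn = [Conversation(["p1"]) for _ in range(4)]
multi_turn = [Conversation(["p1", "p2", "p3"]) for _ in range(4)]

single_rate = attack_success_rate(single_turn, toy_model, is_unsafe)
multi_rate = attack_success_rate(multi_turn, toy_model, is_unsafe)
print(f"single-turn: {single_rate:.0%}, multi-turn: {multi_rate:.0%}")
```

The stub is deliberately artificial. It only shows why a model that looks robust under single-prompt testing can score very differently once the same harness replays longer conversations.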

Related Happenings

Cisco findings on multi-turn guardrail bypass in major LLMs

Technical Analysis
First: 27.05.2026 16:00 Last: 27.05.2026 16:00 Sources 1

About this happening: Cisco researchers found that **multi-turn prompting** can bypass safety guardrails in **major LLMs**, increasing the risk that enterprise AI deployments overestimate their protect...

Timeline

  1. 06.11.2025 17:00 2 articles · 6mo ago

    Cisco AI Defense reports multi-turn adversarial weakness in open-weight LLMs

    Technical Analysis Update

    Cisco AI Defense reported that open-weight large language models remain highly vulnerable to adaptive multi-turn adversarial attacks, with persistent multi-step conversations surpassing 90% success against most tested defenses despite strong single-turn results. The analysis covered over 1,000 prompts per model and 499 simulated conversations, identified the highest failure rates across 15 sub-threat categories within 102 total threat types, and recommended strict system prompts, model-agnostic runtime guardrails, regular AI red-teaming, and limits on automated external integrations.
