Cisco AI Defense analysis of adaptive multi-turn adversarial weakness in open-weight LLMs
Technical Analysis
Summary
Cisco AI Defense found that open-weight LLMs remain vulnerable to adaptive multi-turn adversarial attacks, in which iterative prompting over a conversation bypasses safety controls that hold up against single prompts. In testing that covered more than 1,000 prompts per model and 499 simulated conversations, attack styles such as Crescendo, Role-Play, and Refusal Reframe pushed success rates above 90% against most tested defenses. The most critical failure modes were malicious code generation and data exfiltration. The findings point to a need for multi-turn testing, runtime guardrails, and continuous monitoring in production deployments.
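To make the single-turn versus multi-turn gap concrete, here is a minimal sketch of how an attack success rate could be measured in both settings. It is not Cisco's harness: `toy_model`, `toy_judge`, and the rule that the toy model gives in on the third user turn are hypothetical stand-ins chosen only for illustration.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions, not from the report):
# a model maps the conversation so far to a reply, and a judge
# decides whether a reply counts as a successful attack.
Model = Callable[[List[str]], str]
Judge = Callable[[str], bool]


def attack_success_rate(outcomes: List[bool]) -> float:
    """Fraction of attempts in which the judge flagged the reply."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def run_single_turn(model: Model, judge: Judge, prompts: List[str]) -> List[bool]:
    """Send each prompt in a fresh conversation and judge the one reply."""
    return [judge(model([prompt])) for prompt in prompts]


def run_multi_turn(model: Model, judge: Judge,
                   conversations: List[List[str]]) -> List[bool]:
    """Replay each scripted conversation turn by turn, keeping history.

    A conversation counts as a success for the attacker as soon as any
    reply is flagged by the judge.
    """
    outcomes = []
    for turns in conversations:
        history: List[str] = []
        broken = False
        for turn in turns:
            history.append(turn)
            reply = model(history)
            history.append(reply)
            if judge(reply):
                broken = True
                break
        outcomes.append(broken)
    return outcomes


def toy_model(history: List[str]) -> str:
    """Toy stand-in: refuses at first, complies on the third user turn."""
    user_turns = (len(history) + 1) // 2  # history ends with a user message
    return "UNSAFE: complied" if user_turns >= 3 else "I can't help with that."


def toy_judge(reply: str) -> bool:
    """Toy stand-in for a harmful-output classifier."""
    return reply.startswith("UNSAFE")


single = attack_success_rate(run_single_turn(toy_model, toy_judge, ["p1", "p2"]))
multi = attack_success_rate(
    run_multi_turn(toy_model, toy_judge, [["a", "b", "c"], ["a"]])
)
print(f"single-turn ASR={single:.2f}, multi-turn ASR={multi:.2f}")
```

In a real evaluation the toy functions would be replaced by calls to the model under test and a harmful-output classifier; the point of the sketch is only that the multi-turn loop carries history forward, which single-prompt testing never exercises.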
Related Happenings
Cisco findings on multi-turn guardrail bypass in major LLMs
Technical Analysis
First: 27.05.2026 16:00
Last: 27.05.2026 16:00
Sources 1
About this happening:
Cisco researchers found that **multi-turn prompting** can bypass safety guardrails in **major LLMs**, increasing the risk that enterprise AI deployments overestimate their protect...
Timeline
- 06.11.2025 17:00 · 2 articles · 6mo ago
Cisco AI Defense reports multi-turn adversarial weakness in open-weight LLMs
Technical Analysis Update
Cisco AI Defense reported that open-weight large language models remain highly vulnerable to adaptive multi-turn adversarial attacks, with persistent multi-step conversations surpassing 90% success against most tested defenses despite strong single-turn results. The analysis covered over 1,000 prompts per model and 499 simulated conversations, identified the highest failure rates across 15 sub-threat categories within 102 total threat types, and recommended strict system prompts, model-agnostic runtime guardrails, regular AI red-teaming, and limits on automated external integrations.
Sources
- Multi-Turn Attacks Expose Weaknesses in Open-Weight LLM Models — www.infosecurity-magazine.com — 06.11.2025 17:00