
Adaptive Multi-Turn Attacks Bypass Defenses in Open-Weight LLMs

1 unique source, 1 article

Summary


Open-weight large language models (LLMs) remain vulnerable to adaptive multi-turn adversarial attacks, even when they hold up against single-turn attempts. By manipulating a model across a persistent, multi-step conversation, these attacks achieve success rates above 90% against most tested defenses. Researchers from Cisco AI Defense identified 15 critical sub-threat categories, including malicious code generation, data exfiltration, and ethical boundary violations. The findings underscore the need for stronger protection against iterative manipulation: strict system prompts, runtime guardrails, and regular AI red-teaming assessments.
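
To illustrate the iterative pattern the report describes, the sketch below shows a minimal multi-turn probe paired with a simple runtime guardrail check. It is illustrative only: model_call and guardrail_check are hypothetical placeholders, not Cisco AI Defense tooling, and a real setup would swap in an actual model client and policy engine.

```python
# Minimal sketch of a multi-turn red-teaming probe with a runtime guardrail.
# model_call() and guardrail_check() are hypothetical placeholders.

from typing import Dict, List

Message = Dict[str, str]

def model_call(history: List[Message]) -> str:
    """Stand-in for an open-weight LLM chat-completion call."""
    return "..."  # replace with a real inference call

def guardrail_check(text: str) -> bool:
    """Toy runtime guardrail: True if the output appears to violate policy."""
    blocked_markers = ["exfiltrate", "malware payload"]  # illustrative only
    return any(marker in text.lower() for marker in blocked_markers)

def multi_turn_probe(turns: List[str], system_prompt: str) -> List[Message]:
    """Replay an adaptive multi-turn conversation, stopping when a guardrail trips."""
    history: List[Message] = [{"role": "system", "content": system_prompt}]
    for user_turn in turns:
        history.append({"role": "user", "content": user_turn})
        reply = model_call(history)
        if guardrail_check(reply):
            history.append({"role": "assistant", "content": "[blocked by guardrail]"})
            break
        history.append({"role": "assistant", "content": reply})
    return history
```

The key point is that each attacker turn adapts to the previous reply, which is precisely what a filter evaluating single prompts in isolation misses.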

Timeline

  1. 06.11.2025 17:00 · 1 article

    Cisco AI Defense Report on Multi-Turn Attacks Against Open-Weight LLMs

    A new report from Cisco AI Defense reveals that open-weight LLMs are highly vulnerable to adaptive multi-turn adversarial attacks. These attacks achieve success rates above 90%, bypassing traditional safety filters. The study identified 15 critical sub-threat categories and recommended enhanced security measures to mitigate risks. The findings highlight the need for continuous monitoring and threat-specific mitigations to protect against data breaches and malicious manipulation of model outputs.


Information Snippets