Cisco findings on multi-turn guardrail bypass in major LLMs
Technical Analysis
Summary
Cisco researchers found that multi-turn prompting can bypass safety guardrails in major LLMs, which means enterprises deploying these models may overestimate how well they are protected. In tests across ChatGPT, Claude, Gemini, Nova and Grok, repeated reframing, roleplay, ambiguity and misdirection pushed models toward disallowed actions. The findings suggest that single-prompt safety benchmarks understate how real attackers behave against frontier models.
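The gap between single-prompt and multi-turn measurement can be illustrated with a minimal sketch. It is not Cisco's test harness: `toy_model`, the placeholder scenario strings, and the two rate functions are hypothetical names chosen for illustration, and the toy model simply refuses the first two turns of any conversation and complies from the third on. Under that assumption, a benchmark that sends only the opening prompt reports no bypasses, while one that keeps the conversation going does.

```python
from typing import Callable, List

# A "model" is any callable that maps the user turns so far to a reply.
Model = Callable[[List[str]], str]

REFUSAL = "REFUSED"


def toy_model(turns: List[str]) -> str:
    """Hypothetical stand-in: refuses until the conversation reaches 3 turns."""
    return REFUSAL if len(turns) < 3 else "COMPLIED"


def is_bypass(reply: str) -> bool:
    """Treat anything other than a refusal as a guardrail bypass."""
    return reply != REFUSAL


def single_turn_rate(model: Model, scenarios: List[List[str]]) -> float:
    """Send only the opening prompt of each scenario."""
    hits = sum(is_bypass(model([turns[0]])) for turns in scenarios)
    return hits / len(scenarios)


def multi_turn_rate(model: Model, scenarios: List[List[str]]) -> float:
    """Replay each scenario turn by turn; count it once if any turn succeeds."""
    hits = 0
    for turns in scenarios:
        history: List[str] = []
        for turn in turns:
            history.append(turn)
            if is_bypass(model(history)):
                hits += 1
                break
    return hits / len(scenarios)


# Placeholder turn labels only; no real prompts are included.
SCENARIOS = [
    ["initial-request", "reframe", "roleplay"],
    ["initial-request", "reframe"],
]

print(single_turn_rate(toy_model, SCENARIOS))  # 0.0
print(multi_turn_rate(toy_model, SCENARIOS))   # 0.5
```

In practice the refusal check would need to be a real classifier rather than a string comparison, and the scenario set would come from a curated red-team corpus; both are outside the scope of this sketch.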
Related Happenings
Google GTIG analysis of adversary AI use for exploit development and attack orchestration
Technical Analysis
First: 11.05.2026 16:00
Last: 11.05.2026 16:00
Sources 1
About this happening:
**Google Threat Intelligence Group** published findings showing **adversaries using AI** for **exploit development** and **attack orchestration**, signaling that model-assisted tr...
NCSC-UK joint advisory on covert botnets and proxy networks
Public Sector Action
First: 23.04.2026 15:28
Last: 23.04.2026 15:28
Sources 1
About this happening:
**NCSC-UK** and partner agencies issued a **joint advisory** warning that **China-nexus hackers** are using **hijacked consumer devices** as covert proxy networks to hide maliciou...
OpenAI launches GPT‑5.4‑Cyber and expands TAC access for cyber defense
Security Tool/Service
First: 15.04.2026 19:00
Last: 15.04.2026 19:00
Sources 1
About this happening:
OpenAI launched **GPT‑5.4‑Cyber** and expanded **Trusted Access for Cyber (TAC)**, giving vetted defenders broader access to a **cyber-permissive** model for **defensive workflows...
AISI and NCSC guidance on cybersecurity basics after Mythos Preview testing
Public Sector Action
First: 14.04.2026 12:30
Last: 14.04.2026 12:30
Sources 1
About this happening:
The **UK AI Security Institute (AISI)** and **National Cyber Security Centre (NCSC)** urged organizations to strengthen **cybersecurity basics** after evaluating **Anthropic’s Myt...
XM Cyber maps eight validated AWS Bedrock attack vectors across connected enterprise integrations
Technical Analysis
First: 23.03.2026 13:55
Last: 23.03.2026 13:55
Sources 1
About this happening:
**XM Cyber** mapped **eight validated attack vectors** in **AWS Bedrock**, showing how over-privileged permissions can expose logs, knowledge bases, agents, flows, guardrails, and...
Timeline
- 27.05.2026 16:00 · 2 articles
Cisco researchers find multi-turn conversations can bypass LLM safety guardrails
Technical Analysis Update
Cisco researchers found that major LLMs and frontier AI models, including ChatGPT, Claude, Gemini, Nova and Grok, could be manipulated through multi-turn conversations. These conversations used roleplay, ambiguity, misdirection and repeated reframing after refusals to bypass safety guardrails and induce disallowed actions. The findings also noted that Grok became more vulnerable when reasoning mode was enabled, and that single-prompt testing can understate real-world risk.
Sources:
- All Major LLMs Exposed to Multi-Turn Manipulation, Warn Researchers — www.infosecurity-magazine.com — 27.05.2026 16:00