Cisco findings on multi-turn guardrail bypass in major LLMs

Technical Analysis
Happening score: 16
1 unique source, 1 article

Summary

Cisco researchers found that multi-turn prompting can bypass safety guardrails in major LLMs, which raises the risk that enterprises deploying AI overestimate how well they are protected. In tests across ChatGPT, Claude, Gemini, Nova and Grok, repeated reframing, roleplay, ambiguity and misdirection could push models toward disallowed actions. The findings suggest that single-prompt safety benchmarks understate real-world attacker behavior against frontier models.

Related Happenings

Google GTIG analysis of adversary AI use for exploit development and attack orchestration

Technical Analysis
First: 11.05.2026 16:00 · Last: 11.05.2026 16:00 · Sources: 1

About this happening: **Google Threat Intelligence Group** published findings showing **adversaries using AI** for **exploit development** and **attack orchestration**, signaling that model-assisted tr...

NCSC-UK joint advisory on covert botnets and proxy networks

Public Sector Action
First: 23.04.2026 15:28 · Last: 23.04.2026 15:28 · Sources: 1

About this happening: **NCSC-UK** and partner agencies issued a **joint advisory** warning that **China-nexus hackers** are using **hijacked consumer devices** as covert proxy networks to hide maliciou...

OpenAI launches GPT‑5.4‑Cyber and expands TAC access for cyber defense

Security Tool/Service
First: 15.04.2026 19:00 · Last: 15.04.2026 19:00 · Sources: 1

About this happening: OpenAI launched **GPT‑5.4‑Cyber** and expanded **Trusted Access for Cyber (TAC)**, giving vetted defenders broader access to a **cyber-permissive** model for **defensive workflows...

AISI and NCSC guidance on cybersecurity basics after Mythos Preview testing

Public Sector Action
First: 14.04.2026 12:30 · Last: 14.04.2026 12:30 · Sources: 1

About this happening: The **UK AI Security Institute (AISI)** and **National Cyber Security Centre (NCSC)** urged organizations to strengthen **cybersecurity basics** after evaluating **Anthropic’s Myt...

XM Cyber maps eight validated AWS Bedrock attack vectors across connected enterprise integrations

Technical Analysis
First: 23.03.2026 13:55 · Last: 23.03.2026 13:55 · Sources: 1

About this happening: **XM Cyber** mapped **eight validated attack vectors** in **AWS Bedrock**, showing how over-privileged permissions can expose logs, knowledge bases, agents, flows, guardrails, and...

Timeline

  1. 27.05.2026 16:00 · 2 articles

    Cisco researchers find multi-turn conversations can bypass LLM safety guardrails

    Technical Analysis Update

    Cisco researchers found that major frontier LLMs, including ChatGPT, Claude, Gemini, Nova and Grok, could be manipulated through multi-turn conversations that used roleplay, ambiguity, misdirection and repeated reframing after refusals, allowing users to bypass safety guardrails and induce disallowed actions. The findings also noted that Grok became more vulnerable when reasoning mode was enabled, and that single-prompt testing can understate real-world risk.