GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposed

First reported: 09.08.2025 18:06

Last updated: 11.08.2025 19:46

2 unique sources, 2 articles

Summary


Researchers have demonstrated a jailbreak that bypasses GPT-5's ethical guardrails by combining the Echo Chamber technique with narrative-driven steering. The technique elicits harmful procedural content by framing it within a story rather than issuing directly malicious prompts. Separately, researchers have detailed zero-click AI agent attacks that target cloud and IoT systems through indirect prompt injection. These attacks exploit vulnerabilities in AI connectors and integrations, leading to data exfiltration and unauthorized access. The findings highlight the risks of integrating AI models with external systems and the need for robust security measures and continuous red teaming. The Echo Chamber and Storytelling jailbreak was carried out within 24 hours of GPT-5's release, showing how attackers can increase their effectiveness by combining Echo Chamber with complementary strategies.

Timeline

09.08.2025 18:06 2 articles · 1mo ago

GPT-5 Jailbreak and Zero-Click AI Agent Attacks Disclosed
Researchers have uncovered a jailbreak technique for GPT-5 that combines Echo Chamber with narrative-driven steering. The method elicits harmful procedural content by framing it within a story rather than issuing directly malicious prompts. Separately, zero-click AI agent attacks such as AgentFlayer exploit vulnerabilities in AI connectors and integrations, leading to data exfiltration and unauthorized access. These findings highlight the risks of integrating AI models with external systems and the need for robust security measures. The Echo Chamber and Storytelling jailbreak was carried out within 24 hours of GPT-5's release, showing how attackers can increase their effectiveness by combining Echo Chamber with complementary strategies.

Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06

Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46

Information Snippets

Echo Chamber and narrative-driven steering techniques were used to jailbreak GPT-5, producing harmful procedural content.
First reported: 09.08.2025 18:06

2 sources, 2 articles
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The jailbreak technique involves framing harmful content within a story, avoiding direct malicious prompts.
First reported: 09.08.2025 18:06

2 sources, 2 articles
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
Zero-click AI agent attacks, such as AgentFlayer, exploit vulnerabilities in AI connectors and integrations.
First reported: 09.08.2025 18:06

1 source, 1 article
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
AgentFlayer attacks can exfiltrate sensitive data like API keys from cloud storage services.
First reported: 09.08.2025 18:06

1 source, 1 article
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
AI agents' excessive autonomy can be leveraged for stealthy manipulation, bypassing classic security controls.
First reported: 09.08.2025 18:06

1 source, 1 article
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
Researchers demonstrated prompt injections to hijack smart home systems using Google's Gemini AI.
First reported: 09.08.2025 18:06

1 source, 1 article
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
The raw, unguarded GPT-5 model is considered nearly unusable for enterprise out of the box.
First reported: 09.08.2025 18:06

1 source, 1 article
- Researchers Uncover GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposing Cloud and IoT Systems — thehackernews.com — 09.08.2025 18:06
The jailbreak technique was executed within 24 hours of GPT-5's release.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The Echo Chamber and Storytelling technique was used to jailbreak GPT-5.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The attack required only three turns and did not use 'unsafe' language in the initial prompts.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The Echo Chamber technique seeds and reinforces a subtly poisonous conversational context.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The attack leveraged narrative continuity to avoid triggering refusal cues.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The technique can be applied to previous versions of OpenAI's GPT, Google's Gemini, and Grok-4.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The attack succeeded because the narrative device increased stickiness and consistency pressure.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
The technique demonstrated how multiturn attacks can bypass single-prompt filters and intent detectors.
First reported: 11.08.2025 19:46

1 source, 1 article
- Echo Chamber, Prompts Used to Jailbreak GPT-5 in 24 Hours — www.darkreading.com — 11.08.2025 19:46
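
The last snippet above, that multiturn attacks can bypass single-prompt filters, can be illustrated with a toy comparison. The sketch below is not how any vendor's filter works: the term weights, the threshold, and the function names are all hypothetical, and a real system would use a trained classifier rather than keyword sums. It only shows why scoring each message in isolation can miss context that accumulates across turns.

```python
from typing import Dict, List

# Hypothetical per-term weights and threshold, chosen only for illustration.
RISK_WEIGHTS: Dict[str, float] = {"survival": 0.2, "ingredients": 0.3, "steps": 0.3}
THRESHOLD = 0.6


def score_text(text: str) -> float:
    """Sum the weights of the weighted terms that appear in the text."""
    words = set(text.lower().split())
    return sum(weight for term, weight in RISK_WEIGHTS.items() if term in words)


def flag_single_turn(turns: List[str]) -> bool:
    """Single-prompt filter: each message is scored on its own."""
    return any(score_text(turn) >= THRESHOLD for turn in turns)


def flag_conversation(turns: List[str]) -> bool:
    """Conversation-level filter: the whole dialogue is scored together."""
    return score_text(" ".join(turns)) >= THRESHOLD
```

With three individually low-scoring turns, `flag_single_turn` stays quiet while `flag_conversation` crosses the threshold, which is the gap the snippet describes.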

Similar Happenings

Google Gemini AI Vulnerabilities Allowing Prompt Injection and Data Exfiltration

Researchers disclosed three vulnerabilities in Google's Gemini AI assistant, collectively named the Gemini Trifecta, that could have exposed users to privacy risks and data theft. The flaws affected Gemini Cloud Assist, the Search Personalization Model, and the Browsing Tool, and allowed prompt injection, search injection, and data exfiltration. The Search Personalization flaw let attackers manipulate AI behavior and leak user data by injecting malicious search queries via JavaScript from a malicious website. The Cloud Assist flaw let attackers execute instructions via prompt injections hidden in log content, potentially compromising cloud resources and enabling phishing attacks. The Browsing Tool flaw let attackers exfiltrate a user's saved information and location data by exploiting the tool's 'Show thinking' feature. Google has since patched the issues and implemented additional security measures, with specific changes for each flaw: rolling back vulnerable models, hardening search personalization features, and preventing data exfiltration from browsing in indirect prompt injections. The flaws highlight the potential for AI tools to be used as attack vectors rather than just targets.


ForcedLeak Vulnerability in Salesforce Agentforce Exploited via AI Prompt Injection

A critical vulnerability chain in Salesforce Agentforce, named ForcedLeak and rated CVSS 9.4, allowed attackers to exfiltrate sensitive CRM data through indirect prompt injection. It affected organizations using Salesforce Agentforce with Web-to-Lead functionality enabled, and has been described as a cross-site scripting (XSS) play for the AI era. Noma Security discovered and reported the vulnerability on July 28, 2025. The exploit embeds a malicious prompt in the Description field of a Web-to-Lead form; when the AI agent processes the form, it executes the embedded instructions and leaks data. The attack could lead to the exfiltration of internal communications, business strategy insights, and detailed customer information. Salesforce has since patched the issue, regained control of an expired domain, and enforced a Trusted URL allowlist that prevents AI agent output from being sent to untrusted domains. Salesforce is addressing the root cause by implementing more robust layers of defense for its models and agents.
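
The Trusted URL allowlist mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Salesforce's implementation: the host names and the function `is_trusted_url` are hypothetical, and the check uses exact host matching over HTTPS only.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from reviewed configuration.
TRUSTED_HOSTS = frozenset({"example.com", "cdn.example.com"})


def is_trusted_url(url: str, trusted_hosts: frozenset = TRUSTED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    # urlparse lowercases the hostname; it is None when the URL has no host.
    host = parsed.hostname or ""
    return host in trusted_hosts
```

Exact matching means a lookalike such as `example.com.evil.test` is rejected. The allowlist itself still needs periodic review, since an entry whose domain registration lapses can be re-registered by someone else, which is the expired-domain problem the summary mentions.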


CISA Emergency Directive 25-03: Mitigation of Cisco ASA Zero-Day Vulnerabilities

The Cybersecurity and Infrastructure Security Agency (CISA) issued Emergency Directive 25-03, requiring federal agencies to identify and mitigate zero-day vulnerabilities in Cisco Adaptive Security Appliances (ASA) exploited by an advanced threat actor. The directive requires agencies to account for all affected devices, collect forensic data, and upgrade or disconnect end-of-support devices by September 26, 2025. The vulnerabilities allow threat actors to maintain persistence and gain network access.

Cisco identified multiple zero-day vulnerabilities (CVE-2025-20333, CVE-2025-20362, CVE-2025-20363, and CVE-2025-20352) in Cisco ASA, Firewall Threat Defense (FTD) software, and Cisco IOS software. These vulnerabilities enable unauthenticated remote code execution, unauthorized access, and denial of service (DoS) attacks. GreyNoise detected large-scale campaigns targeting ASA login portals and Cisco IOS Telnet/SSH services, indicating potential exploitation of these vulnerabilities. The campaign is widespread and involves exploiting zero-day vulnerabilities to gain unauthenticated remote code execution on ASAs, as well as manipulating read-only memory (ROM) to persist through reboot and system upgrade. CISA and Cisco linked these ongoing attacks to the ArcaneDoor campaign, which exploited two other ASA and FTD zero-days (CVE-2024-20353 and CVE-2024-20359) to breach government networks worldwide since November 2023.

CISA ordered agencies to identify all Cisco ASA and Firepower appliances on their networks, disconnect all compromised devices from the network, and patch those that show no signs of malicious activity by 12 PM EDT on September 26. CISA also ordered that agencies must permanently disconnect ASA devices that are reaching the end of support by September 30 from their networks.

The U.K. National Cyber Security Centre (NCSC) confirmed that threat actors exploited the recently disclosed security flaws in Cisco firewalls to deliver previously undocumented malware families like RayInitiator and LINE VIPER. Cisco began investigating attacks on multiple government agencies in May 2025, linked to the state-sponsored ArcaneDoor campaign. The attacks targeted Cisco ASA 5500-X Series devices to implant malware, execute commands, and potentially exfiltrate data. The threat actor modified ROMMON to facilitate persistence across reboots and software upgrades. The compromised devices include ASA 5500-X Series models running specific software releases with VPN web services enabled. The Canadian Centre for Cyber Security urged organizations to update to a fixed version of Cisco ASA and FTD products to counter the threat.

Nearly 50,000 Cisco ASA and FTD appliances are vulnerable to actively exploited flaws. The vulnerabilities CVE-2025-20333 and CVE-2025-20362 enable arbitrary code execution and access to restricted URL endpoints. The Shadowserver Foundation discovered over 48,800 internet-exposed ASA and FTD instances still vulnerable to the flaws. The majority of vulnerable devices are located in the United States, followed by the United Kingdom, Japan, Germany, Russia, Canada, and Denmark. The Shadowserver Foundation's data is as of September 29, indicating a lack of response to the ongoing exploitation activity. GreyNoise had warned on September 4 about suspicious scans targeting Cisco ASA devices, indicating upcoming undocumented flaws. CISA's emergency directive gave 24 hours to FCEB agencies to identify and upgrade vulnerable Cisco ASA and FTD instances. CISA advised that ASA devices reaching their end of support should be disconnected from federal networks by the end of September. The U.K. NCSC reported that the hackers deployed the LINE VIPER shellcode loader malware and the RayInitiator GRUB bootkit.


ShadowLeak: Undetectable Email Theft via AI Agents

A new attack vector, dubbed ShadowLeak, allows hackers to invisibly steal emails from users who connect AI agents to their inboxes, such as those using ChatGPT with Gmail. The attack relies on an indirect prompt injection hidden in email HTML, using tiny fonts, white-on-white text, and layout tricks so that the user never sees it. The AI agent processes the hidden instructions and acts on them without user awareness, exfiltrating sensitive data to an attacker-controlled server. Because the exfiltration occurs directly within OpenAI's cloud environment, it bypasses traditional security controls, leaves no trace on the user's network, and is undetectable to the user. The attack can be extended to any connector that ChatGPT supports, including Box, Dropbox, GitHub, Google Drive, HubSpot, Microsoft Outlook, Notion, or SharePoint. Radware discovered the vulnerability and reported it to OpenAI, which acknowledged and fixed it in August 2025, although the exact details of the fix remain unclear.
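
The hiding tricks named above (tiny fonts, white-on-white text) can be screened for before an email body reaches an agent. The sketch below is a rough heuristic and not Radware's or OpenAI's method: the regular expression, the `HiddenTextFinder` class, and `find_hidden_text` are hypothetical, only inline `style` attributes are inspected, and the white-text check does not verify the actual background color.

```python
import re
from html.parser import HTMLParser
from typing import List

# Elements that never have a closing tag, so they must not touch the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Hypothetical heuristics: display:none, font-size of 0-1px, or white text.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none"
    r"|font-size\s*:\s*[01](?:\.\d+)?px"
    r"|(?<![-\w])color\s*:\s*(?:#fff(?:fff)?\b|white)",
    re.IGNORECASE,
)


class HiddenTextFinder(HTMLParser):
    """Collects text nested inside elements whose inline style looks hidden."""

    def __init__(self) -> None:
        super().__init__()
        self._stack: List[bool] = []  # one flag per open element
        self.hidden_text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        inherited = bool(self._stack and self._stack[-1])
        self._stack.append(inherited or bool(HIDDEN_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> List[str]:
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

The stack assumes well-formed markup with matching closing tags; real email HTML is often messier, and styles applied through CSS classes or layout tricks would need a rendering-aware check.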


Cursor IDE autorun flaw allows malicious code execution

A vulnerability in the Cursor AI-powered Integrated Development Environment (IDE) allows tasks in a malicious repository to execute automatically when the repository is opened. The issue arises because Cursor ships with the Workspace Trust feature of Visual Studio Code (VS Code) disabled; that feature blocks automatic execution of tasks without explicit consent. An attacker can exploit this default by adding a malicious .vscode/tasks.json file to a publicly shared repository. Exploitation can drop malware, hijack developer environments, modify files, steal credentials and API tokens, or serve as a vector for broader system compromise, placing Cursor users at significant risk from supply-chain attacks. The flaw affects Cursor's one million users, who generate over a billion lines of code daily. Cursor has decided not to fix the issue, citing the need to maintain AI and other features that depend on the autorun behavior. Users are advised to enable Workspace Trust manually or use a basic text editor for unknown projects.
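
Before opening an unknown repository, its .vscode/tasks.json can be inspected for tasks configured to run on folder open. The sketch below assumes the autorun mechanism at issue is VS Code's `runOptions.runOn: "folderOpen"` task setting, which the summary does not state explicitly; the function name `autorun_tasks` is hypothetical.

```python
import json
from typing import List


def autorun_tasks(tasks_json_text: str) -> List[str]:
    """Return labels of tasks set to run automatically when the folder opens."""
    config = json.loads(tasks_json_text)
    tasks = config.get("tasks", []) if isinstance(config, dict) else []
    flagged: List[str] = []
    for task in tasks:
        if not isinstance(task, dict):
            continue
        run_options = task.get("runOptions") or {}
        if isinstance(run_options, dict) and run_options.get("runOn") == "folderOpen":
            flagged.append(str(task.get("label", "<unlabeled>")))
    return flagged
```

VS Code accepts comments in tasks.json, which `json.loads` rejects with an exception. For an untrusted repository, treating a file that fails to parse as suspicious is a reasonable default.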

