Track cybersecurity events as they unfold. Sourced timelines, daily updates. Fast, privacy‑respecting. No ads, no tracking.

GPT-5 Jailbreak and Zero-Click AI Agent Attacks Exposed

2 unique sources, 2 articles

Summary

Researchers have demonstrated a jailbreak that bypasses GPT-5's ethical guardrails by combining the Echo Chamber technique with narrative-driven steering. Framing the request within a story lets the attack elicit harmful procedural content without any directly malicious prompt. The jailbreak was carried out within 24 hours of GPT-5's release, demonstrating how attackers can make Echo Chamber more effective by pairing it with complementary strategies. Separately, researchers detailed zero-click AI agent attacks that target cloud and IoT systems through indirect prompt injection. These attacks exploit vulnerabilities in AI connectors and integrations, leading to data exfiltration and unauthorized access. Together, the findings highlight the risks of integrating AI models with external systems and the need for robust security measures and continuous red teaming.

Timeline

  1. 09.08.2025 18:06 2 articles · 1mo ago

    GPT-5 Jailbreak and Zero-Click AI Agent Attacks Disclosed

    Researchers have uncovered a jailbreak technique for GPT-5 that combines Echo Chamber with narrative-driven steering. The method elicits harmful procedural content by framing it within a story, avoiding directly malicious prompts. It was carried out within 24 hours of GPT-5's release, demonstrating how attackers can make Echo Chamber more effective by pairing it with complementary strategies. Additionally, zero-click AI agent attacks such as AgentFlayer exploit vulnerabilities in AI connectors and integrations, leading to data exfiltration and unauthorized access. These findings highlight the risks of integrating AI models with external systems and the need for robust security measures.

Similar Happenings

ForcedLeak Vulnerability in Salesforce Agentforce Exploited via AI Prompt Injection

A critical vulnerability chain in Salesforce Agentforce, named ForcedLeak, allowed attackers to exfiltrate sensitive CRM data through indirect prompt injection. Rated CVSS 9.4 and described as a cross-site scripting (XSS) play for the AI era, the flaw affected organizations using Salesforce Agentforce with Web-to-Lead functionality enabled. Noma Security discovered and reported the vulnerability on July 28, 2025. The exploit embedded malicious instructions in the Description field of a Web-to-Lead form; when the AI agent processed the submission, it carried out those instructions and leaked data. The attack could lead to the exfiltration of internal communications, business strategy insights, and detailed customer information. Salesforce has since patched the issue, regained control of an expired domain, and enforced a Trusted URL allowlist that prevents AI agent output from being sent to untrusted domains. Salesforce says it is also addressing the root cause by implementing more robust layers of defense for its models and agents.
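
The allowlist idea described above can be illustrated with a minimal Python sketch. This is not Salesforce's implementation; the host names, the `TRUSTED_HOSTS` set, and both function names are hypothetical, and the sketch assumes exact host matching so that lookalike domains are rejected.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from configuration.
TRUSTED_HOSTS = {"example.com", "cdn.example.com"}

# Rough pattern for http(s) URLs embedded in free text or HTML attributes.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def is_trusted(url: str, trusted_hosts: set = TRUSTED_HOSTS) -> bool:
    """Return True only if the URL's host exactly matches an allowlisted host."""
    host = urlparse(url).hostname
    return host is not None and host.lower() in trusted_hosts


def untrusted_urls(agent_output: str, trusted_hosts: set = TRUSTED_HOSTS) -> list:
    """List every URL in the agent output whose host is not on the allowlist."""
    return [
        url
        for url in URL_PATTERN.findall(agent_output)
        if not is_trusted(url, trusted_hosts)
    ]
```

A caller could refuse to render or send agent output whenever `untrusted_urls(...)` returns a non-empty list. Exact matching is a deliberate choice here: a suffix check would let a host such as `example.com.evil.test` slip through.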

CISA Emergency Directive 25-03: Mitigation of Cisco ASA Zero-Day Vulnerabilities

The Cybersecurity and Infrastructure Security Agency (CISA) issued Emergency Directive 25-03, requiring federal agencies to identify and mitigate zero-day vulnerabilities in Cisco Adaptive Security Appliances (ASA) exploited by an advanced threat actor. The directive requires agencies to account for all affected devices, collect forensic data, and upgrade or disconnect end-of-support devices by September 26, 2025. Specifically, CISA ordered agencies to identify all Cisco ASA and Firepower appliances on their networks, disconnect all compromised devices, and patch those that show no signs of malicious activity by 12 PM EDT on September 26. Agencies must also permanently disconnect ASA devices that reach end of support by September 30. Cisco identified multiple zero-day vulnerabilities (CVE-2025-20333, CVE-2025-20362, CVE-2025-20363, and CVE-2025-20352) in Cisco ASA, Firewall Threat Defense (FTD) software, and Cisco IOS software. These vulnerabilities enable unauthenticated remote code execution, unauthorized access, and denial-of-service (DoS) attacks, and allow threat actors to maintain persistence and gain network access. GreyNoise detected large-scale campaigns targeting ASA login portals and Cisco IOS Telnet/SSH services, indicating potential exploitation of these vulnerabilities. The campaign is widespread: attackers exploit the zero-days to gain unauthenticated remote code execution on ASAs and manipulate read-only memory (ROM) to persist through reboots and system upgrades. CISA and Cisco linked the ongoing attacks to the ArcaneDoor campaign, which has exploited two other ASA and FTD zero-days (CVE-2024-20353 and CVE-2024-20359) to breach government networks worldwide since November 2023. The U.K. National Cyber Security Centre (NCSC) confirmed that threat actors exploited the recently disclosed security flaws in Cisco firewalls to deliver previously undocumented malware families such as RayInitiator and LINE VIPER. Cisco began investigating attacks on multiple government agencies in May 2025, linked to the state-sponsored ArcaneDoor campaign. The attacks targeted Cisco ASA 5500-X Series devices running specific software releases with VPN web services enabled, with the aim of implanting malware, executing commands, and potentially exfiltrating data. The threat actor modified ROMMON to persist across reboots and software upgrades. The Canadian Centre for Cyber Security urged organizations to update to a fixed version of Cisco ASA and FTD products to counter the threat.

ShadowLeak: Undetectable Email Theft via AI Agents

A new attack vector, dubbed ShadowLeak, lets attackers invisibly steal emails from users who connect AI agents to their inboxes, such as those using ChatGPT with Gmail. The attack relies on an indirect prompt injection hidden in email HTML, using techniques like tiny fonts, white-on-white text, and layout tricks so that the user never sees it. The AI agent processes the hidden instructions and acts on them without the user's awareness, exfiltrating sensitive data to an attacker-controlled server. Because the exfiltration occurs directly within OpenAI's cloud environment, it bypasses traditional security controls, is undetectable to the user, and leaves no trace on the user's network. The attack can be extended to any connector that ChatGPT supports, including Box, Dropbox, GitHub, Google Drive, HubSpot, Microsoft Outlook, Notion, or SharePoint. Radware discovered the vulnerability and reported it to OpenAI, which acknowledged and fixed the issue in August 2025; the exact details of the fix remain unclear.
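
As a rough illustration of the hiding techniques mentioned above (tiny fonts and white-on-white text), the following Python sketch flags text inside elements whose inline style suggests it is invisible. It is a heuristic under stated assumptions, not Radware's or OpenAI's method: the style patterns, the `find_hidden_text` name, and the reliance on inline `style` attributes only (no CSS classes or stylesheets) are all choices made for this example.

```python
import re
from html.parser import HTMLParser

# Heuristic inline-style patterns that commonly hide text from a human reader.
# Illustrative assumptions only; this is far from a complete list.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|font-size\s*:\s*[01](?:\.\d+)?(?:px|pt)?\s*(?:;|$)"
    r"|(?<![-\w])color\s*:\s*(?:#fff(?:fff)?|white)\b",
    re.IGNORECASE,
)

# Elements that never have a closing tag, so they must not affect the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collect text nested inside elements styled to be invisible."""

    def __init__(self) -> None:
        super().__init__()
        self._stack = []        # one bool per open element: is it hidden?
        self.hidden_text = []   # text fragments found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        inherited = bool(self._stack) and self._stack[-1]
        self._stack.append(inherited or bool(HIDDEN_STYLE.search(style)))

    def handle_endtag(self, tag):
        # Assumes reasonably balanced markup; unclosed tags skew the result.
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list:
    """Return text fragments that appear inside visually hidden elements."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

A mail pipeline could call `find_hidden_text` on the HTML body and quarantine or strip messages that return any fragments before an agent reads them. Treating white text as hidden assumes a white background, so this check will produce false positives on dark-themed emails.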

Cursor IDE autorun flaw allows malicious code execution

A vulnerability in the Cursor AI-powered integrated development environment (IDE) allows tasks in a malicious repository to run automatically as soon as the repository is opened. The issue arises because Cursor ships with the Workspace Trust feature from Visual Studio Code (VS Code) disabled; that feature blocks automatic execution of tasks without explicit consent. An attacker can exploit this default by adding a malicious .vscode/tasks.json file to a publicly shared repository. Exploitation can drop malware, hijack developer environments, leak credentials and API tokens, modify files, or serve as a vector for broader system compromise, placing Cursor users at significant risk from supply-chain attacks. The flaw affects Cursor's one million users, who generate over a billion lines of code daily. Cursor has decided not to fix the issue, citing the need to maintain AI and other features that depend on the autorun behavior. Users are advised to enable Workspace Trust manually or use a basic text editor for unknown projects.
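
Before opening an unfamiliar repository, the .vscode/tasks.json file mentioned above can be inspected for tasks set to run on folder open. The Python sketch below is one way to do that; the `autorun_tasks` function name is hypothetical, and the sketch assumes the file is strict JSON (VS Code also accepts comments, which this parser does not handle and instead flags for manual review).

```python
import json
from pathlib import Path


def autorun_tasks(repo_root: str) -> list:
    """Return labels of tasks configured to run when the folder is opened."""
    tasks_file = Path(repo_root) / ".vscode" / "tasks.json"
    if not tasks_file.is_file():
        return []
    try:
        config = json.loads(tasks_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # The file may contain comments or be malformed; do not guess.
        return ["<unparseable tasks.json: review manually>"]
    if not isinstance(config, dict):
        return []

    flagged = []
    for task in config.get("tasks", []):
        if not isinstance(task, dict):
            continue
        run_options = task.get("runOptions")
        # VS Code runs a task automatically when runOptions.runOn is "folderOpen".
        if isinstance(run_options, dict) and run_options.get("runOn") == "folderOpen":
            flagged.append(str(task.get("label", task.get("command", "<unnamed>"))))
    return flagged
```

Running `autorun_tasks("path/to/cloned/repo")` on a fresh clone and getting a non-empty list is a signal to read the task's command before opening the folder in an editor that does not enforce Workspace Trust.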

Misconfigured Docker APIs Exploited in TOR-Based Cryptojacking Campaign

A new variant of a TOR-based cryptojacking campaign targets exposed Docker APIs. The attackers launch a new container based on the Alpine Docker image and mount the host file system into it. They then run a Base64-encoded payload that downloads a shell script downloader from a .onion domain. The script installs tools for reconnaissance and for communication with a command-and-control (C2) server. The malware scans for open Docker API services on port 2375 and propagates the infection to those machines. The attack chain also targets additional ports (23 and 9222) and uses known default credentials to brute-force logins. On compromised hosts, the attackers block external access to port 2375 using available firewall utilities and install persistent SSH access. The malware includes dormant logic that could later support credential theft, browser session hijacking, remote file download, and distributed denial-of-service (DDoS) attacks, and the campaign may aim to establish a complex botnet. The campaign highlights the importance of securing Docker APIs and limiting the exposure of services to the internet.
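
Administrators can check whether their own hosts expose the ports named above with a simple TCP connection test. The Python sketch below is a minimal example of that check; the function names and the default timeout are assumptions made for illustration, and an open port only shows that something is listening, not that it is an unauthenticated Docker API.

```python
import socket

# Ports named in the campaign description: Docker API, Telnet, Chrome remote debugging.
CAMPAIGN_PORTS = (2375, 23, 9222)


def is_port_open(host: str, port: int = 2375, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposed_ports(host: str, ports=CAMPAIGN_PORTS, timeout: float = 2.0) -> list:
    """Return the subset of ports on host that accept TCP connections."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

Calling `exposed_ports` with the public address of a host you administer, from a machine outside its network, gives a quick view of what an internet scanner would see. Any reachable port 2375 warrants a closer look, since that is the port the malware scans for.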