CyberHappenings logo

Track cybersecurity events as they unfold. Sourced timelines. Filter, sort, and browse. Fast, privacy‑respecting. No invasive ads, no tracking.

Lies-in-the-Loop Attack Exploits AI Coding Agents

First reported
Last updated
1 unique sources, 1 articles

Summary

Hide ▲

A new attack vector called 'lies-in-the-loop' (LITL) exploits AI coding agents to deceive users into granting permissions for dangerous actions. The attack manipulates AI agents into presenting seemingly safe contexts, leveraging human trust and fallibility. This technique was demonstrated on Anthropic's Claude Code, showing potential for software supply chain attacks. The LITL attack exploits the intersection of agentic tooling and human fallibility, targeting AI agents that rely on human-in-the-loop interactions for safety and security approvals. The attack can be applied to any AI agent that uses human-in-the-loop mechanisms. The researchers from Checkmarx Zero demonstrated the attack by convincing Claude Code to run arbitrary commands, including a command injection that could lead to a software supply chain attack. The attack highlights the risks of prompt injection and the need for vigilance in reviewing AI-generated prompts.

Timeline

  1. 15.09.2025 12:11 1 articles · 17d ago

    Lies-in-the-Loop Attack Demonstrated on Anthropic's Claude Code

    Researchers from Checkmarx Zero demonstrated the 'lies-in-the-loop' (LITL) attack on Anthropic's Claude Code, an AI code assistant. The attack exploits the trust between humans and AI agents, using prompt injection to deceive users into granting dangerous permissions. The researchers successfully ran arbitrary commands and submitted malicious npm packages to GitHub repositories, highlighting the potential for software supply chain attacks. The attack underscores the need for vigilance in reviewing AI-generated prompts and careful management of AI agent adoption.

    Show sources

Information Snippets

  • The 'lies-in-the-loop' (LITL) attack targets AI coding agents to deceive users into granting dangerous permissions.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The attack exploits the trust between humans and AI agents, manipulating them to present fake, safe contexts.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The LITL attack was demonstrated on Anthropic's Claude Code, an AI code assistant known for its safety considerations.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The attack involves prompt injection, where malicious commands are hidden within long responses to deceive users.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The researchers successfully demonstrated the attack by running arbitrary commands and submitting malicious npm packages to GitHub repositories.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The attack highlights the risks of prompt injection and the need for careful review of AI-generated prompts.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • Organizations are increasingly adopting AI agents, with 79% using AI-assisted coding agents in some workflows.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The security of AI agents remains a concern, especially in software development workflows.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources
  • The researchers recommend practicing suspicion with AI agents and external content, as well as careful management of AI agent adoption.

    First reported: 15.09.2025 12:11
    1 source, 1 article
    Show sources

Similar Happenings

ForcedLeak Vulnerability in Salesforce Agentforce Exploited via AI Prompt Injection

A critical vulnerability in Salesforce Agentforce, named ForcedLeak, allowed attackers to exfiltrate sensitive CRM data through indirect prompt injection. The flaw affected organizations using Salesforce Agentforce with Web-to-Lead functionality enabled. The vulnerability was discovered and reported by Noma Security on July 28, 2025. Salesforce has since patched the issue and implemented additional security measures, including regaining control of an expired domain and preventing AI agent output from being sent to untrusted domains. The exploit involved manipulating the Description field in Web-to-Lead forms to execute malicious instructions, leading to data leakage. Salesforce has enforced a Trusted URL allowlist to mitigate the risk of similar attacks in the future. The ForcedLeak vulnerability is a critical vulnerability chain with a CVSS score of 9.4, described as a cross-site scripting (XSS) play for the AI era. The exploit involves embedding a malicious prompt in a Web-to-Lead form, which the AI agent processes, leading to data leakage. The attack could potentially lead to the exfiltration of internal communications, business strategy insights, and detailed customer information. Salesforce is addressing the root cause of the vulnerability by implementing more robust layers of defense for their models and agents.

Critical deserialization flaw in GoAnywhere MFT (CVE-2025-10035) patched

Fortra has disclosed and patched a critical deserialization vulnerability (CVE-2025-10035) in GoAnywhere Managed File Transfer (MFT) software. This flaw, rated 10.0 on the CVSS scale, allows for arbitrary command execution if the system is publicly accessible over the internet. The vulnerability was actively exploited in the wild as early as September 10, 2025, a week before public disclosure. Fortra has released patches in versions 7.8.4 and 7.6.3. The flaw impacts the same license code path as the earlier CVE-2023-0669, which was widely exploited by multiple ransomware and APT groups in 2023, including LockBit. The vulnerability was discovered during a security check on September 11, 2025. Fortra advised customers to review configurations immediately and remove public access from the Admin Console. The Shadowserver Foundation is monitoring over 470 GoAnywhere MFT instances, but the number of patched instances is unknown. The flaw is highly dependent on systems being externally exposed to the internet. The exploitation sequence involved creating a backdoor account and uploading additional payloads, originating from an IP address flagged for brute-force attacks.