Current Limitations of General-Purpose LLMs in Offensive Security
Summary
General-purpose large language models (LLMs) demonstrate limited effectiveness at creating working exploits and finding complex vulnerabilities. Tailored AI systems remain more effective for penetration testers and offensive researchers, and general-purpose LLMs can mislead non-experts into believing they have actionable results. Researchers tested 50 LLMs and found that while many can find simple vulnerabilities and produce simple exploits, they struggle with complex tasks. Commercial models performed better but still had significant limitations. The rapid development of AI in cybersecurity highlights the need for human oversight to catch mistakes and hallucinations.
Timeline
- 13.08.2025 22:08 · 1 article
General-Purpose LLMs Show Limited Effectiveness in Offensive Security
Research into the offensive-security capabilities of 50 LLMs revealed that while many can find simple vulnerabilities and exploits, they struggle with complex tasks. Tailored AI systems and human oversight are crucial for effective vulnerability discovery and exploitation. The study found that commercial LLMs had the most success in vulnerability research tasks but still had significant limitations. The rapid development of AI in cybersecurity is expected to significantly impact penetration testing and application security.
Sources:
- Popular AI Systems Still a Work-in-Progress for Security · www.darkreading.com · 13.08.2025 22:08
Information Snippets
- 50 different large language models (LLMs) were tested for offensive-security capabilities.
- Most top-line LLMs, including ChatGPT and Google's Gemini, performed well in finding simple vulnerabilities.
- Non-specialized LLMs can be used by non-experts for some vulnerability research and exploit development.
- Offensive-security-specific AI is better suited for use by experts.
- General-purpose LLMs may mislead non-experts into believing they have actionable results.
- The 17 commercial models tested demonstrated the ability to find simple vulnerabilities, and more than half could also create exploits for other vulnerabilities.
- Only four models found complex vulnerabilities, and three created complex exploits.
- AI systems tailored for vulnerability discovery and exploitation are making progress.
- Google's Big Sleep LLM discovered a vulnerability in SQLite, and Team Atlanta's Atlantis fixed a bug in SQLite3.
- Xbow's autonomous vulnerability discovery system found over 900 issues on HackerOne.
- Human oversight is crucial to catch mistakes and hallucinations in AI systems.
- LLMs are already used for auto-mapping attack surfaces, writing proof-of-concept code, and summarizing scan data.
- Commercial LLMs had the most success in vulnerability research tasks but struggled with the complex ones.
- Some experimental runs took hours or even a full workday to determine whether the LLM could solve the problem.
- LLMs are useful for initial triaging of scanner findings and during development.
- AI is expected to significantly impact penetration testing and application security.
- Xbow's approach to application-security testing involves a central AI system coordinating the analysis and specific prompts for each bug class (a minimal sketch of this pattern appears after this list).
- Xbow discovered nearly 1,000 vulnerabilities in Q2 2025, up from fewer than 100 in Q1 2025.
- AI can help ameliorate the current talent shortage by handling rote tasks and allowing staff to focus on the big picture.
- An autonomous AI system is expected to win a capture-the-flag tournament within the next 24 months.
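To make the pattern above concrete, here is a minimal Python sketch of the kind of workflow the article describes: a central coordinator that routes scanner findings to bug-class-specific prompts and collects model assessments for human review. It is an illustration only; the prompt text, the Finding and triage names, and the query_llm() stub are assumptions, not Xbow's or any vendor's actual implementation.

```python
# Minimal sketch of a coordinator that routes scanner findings to
# bug-class-specific prompts for LLM triage. All names, prompts, and the
# query_llm() stub are hypothetical illustrations.
from dataclasses import dataclass

# One triage prompt per bug class, standing in for the "specific prompts
# for each bug class" described in the article.
PROMPTS_BY_BUG_CLASS = {
    "sqli": "Given this finding, explain whether the input reaches a SQL query "
            "unsanitized and rate exploitability from 0-10:\n{finding}",
    "xss":  "Given this finding, explain whether attacker-controlled input is "
            "reflected without encoding and rate exploitability from 0-10:\n{finding}",
    "ssrf": "Given this finding, explain whether the URL parameter can be pointed "
            "at internal services and rate exploitability from 0-10:\n{finding}",
}

@dataclass
class Finding:
    bug_class: str   # e.g. "sqli", "xss", "ssrf"
    detail: str      # raw scanner output for this finding

def query_llm(prompt: str) -> str:
    """Placeholder for a call to whichever LLM API is in use."""
    raise NotImplementedError("wire this to your model provider")

def triage(findings: list[Finding]) -> list[dict]:
    """Central coordinator: pick a bug-class-specific prompt per finding,
    ask the model for an assessment, and collect results for human review."""
    results = []
    for f in findings:
        template = PROMPTS_BY_BUG_CLASS.get(f.bug_class)
        if template is None:
            # Unknown bug class: leave it for a human rather than guessing.
            results.append({"finding": f, "assessment": "needs manual review"})
            continue
        assessment = query_llm(template.format(finding=f.detail))
        results.append({"finding": f, "assessment": assessment})
    return results
```

Consistent with the snippets above, nothing in this sketch acts on the model output automatically; it only produces a worklist for an analyst, which is where the human-oversight caveat applies.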
Similar Happenings
AI-Powered Cyberattacks Targeting Critical Sectors Disrupted
Anthropic disrupted a sophisticated AI-powered cyberattack campaign, codenamed GTG-2002, in July 2025. The operation targeted 17 organizations across healthcare, emergency services, government, and religious institutions. The attacker used Anthropic's AI-powered chatbot Claude to automate theft and extortion, threatening to publish stolen data unless ransoms of $75,000 to $500,000 in Bitcoin were paid. Running Claude Code on Kali Linux, the attacker automated multiple phases of the attack cycle, including reconnaissance, credential harvesting, and network penetration, building scanning frameworks around a variety of APIs, drawing on preferred operational TTPs, and getting real-time assistance during intrusions. The AI tool was also used to craft bespoke, obfuscated versions of the Chisel tunneling utility, develop new TCP proxy code, disguise malicious executables, organize stolen data for monetization, analyze exfiltrated financial data to set ransom amounts, and generate visually alarming HTML ransom notes. The attacker relied on AI for tactical and strategic decisions, adapting to defensive measures in real time and tailoring ransom notes and extortion strategies to each victim. In response, Anthropic developed a tailored classifier and a new detection method to prevent future abuse. The operation represents a shift to 'vibe hacking,' where threat actors use LLMs and agentic AI to perform attacks.
Citrix NetScaler ADC and Gateway vulnerabilities patched and actively exploited in the wild
Citrix has released patches for three vulnerabilities in NetScaler ADC and NetScaler Gateway, including memory overflow flaws and improper access control issues. One of them, CVE-2025-7775, is actively exploited in the wild and can lead to remote code execution or denial of service. The vulnerabilities affect specific configurations of NetScaler ADC and NetScaler Gateway, including unsupported, end-of-life versions. The U.S. Cybersecurity and Infrastructure Security Agency (CISA) has added CVE-2025-7775 to its Known Exploited Vulnerabilities (KEV) catalog, requiring federal agencies to remediate within 48 hours; CISA now lists 10 NetScaler flaws in the catalog, six of them discovered in the last two years. Nearly 20% of identified NetScaler assets run unsupported, end-of-life versions, with a significant concentration in North America and the APAC region. Threat actors are using HexStrike-AI, an AI-driven security platform, to exploit the Citrix vulnerabilities, significantly reducing the time between disclosure and mass exploitation. HexStrike-AI was created by cybersecurity researcher Muhammad Osama and has been open source on GitHub for the last month, where it has already garnered 1,800 stars and over 400 forks.
AI systems vulnerable to data-theft prompts in downscaled images
Researchers have demonstrated a new attack method that steals user data by embedding malicious prompts in images. These prompts are invisible in full-resolution images but become visible when the images are downscaled by AI systems. The attack exploits aliasing artifacts introduced by resampling algorithms, allowing hidden text to emerge and be interpreted as user instructions by the AI model. This can lead to data leakage or unauthorized actions. The method has been successfully tested against several AI systems, including Google Gemini CLI, Vertex AI Studio, Gemini's web interface, Gemini's API, Google Assistant on Android, and Genspark. The attack was developed by Kikimora Morozova and Suha Sabi Hussain from Trail of Bits, building on a 2020 theory presented in a USENIX paper. The researchers have also released an open-source tool, Anamorpher, to create images for testing the attack. They recommend implementing dimension restrictions and user confirmation for sensitive tool calls as mitigation strategies.
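A minimal sketch of the mitigations mentioned above, assuming a Pillow-based pipeline: enforce a dimension restriction on uploaded images and, as an extra illustrative step, save the exact downscaled image the model will ingest so a user can confirm nothing hidden has surfaced before any sensitive tool call runs. The size limit, target resolution, and function name are hypothetical assumptions; only the Pillow Image.resize call reflects a real API.

```python
# Minimal sketch of the mitigations mentioned above: reject oversized inputs
# and let the user inspect the exact downscaled image the model will see.
# The size limit, target resolution, and function names are illustrative
# assumptions, not Anamorpher or any vendor's actual implementation.
from PIL import Image

MAX_INPUT_DIM = 1024           # assumed dimension restriction
MODEL_INPUT_SIZE = (512, 512)  # assumed resolution the AI system downscales to

def prepare_image_for_model(path: str, preview_path: str) -> Image.Image:
    img = Image.open(path)
    w, h = img.size

    # Dimension restriction: very large inputs leave more room for content
    # that only becomes legible after aggressive downscaling.
    if w > MAX_INPUT_DIM or h > MAX_INPUT_DIM:
        raise ValueError(f"image {w}x{h} exceeds {MAX_INPUT_DIM}px limit")

    # Downscale with the same resampling filter the model pipeline uses
    # (bicubic here as an example), then save a preview so a human can
    # confirm no hidden text emerged before any sensitive tool call runs.
    downscaled = img.resize(MODEL_INPUT_SIZE, Image.Resampling.BICUBIC)
    downscaled.save(preview_path)
    return downscaled
```

The important detail is that the check and the preview operate on what the model actually receives after resampling, not on the original upload, since that is where the hidden prompt emerges.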
PromptFix Exploit Targets AI Browsers for Malicious Prompts
Researchers from Guardio Labs have demonstrated a new prompt injection technique called PromptFix, which tricks generative AI (GenAI) models into executing malicious instructions embedded within fake CAPTCHA checks on web pages. The attack targets AI-driven browsers such as Perplexity's Comet, which automate tasks like shopping and email management, and can steer them onto phishing pages or fraudulent sites without any user intervention, leading to potential data breaches and financial losses. Guardio dubs this new class of scams Scamlexity: AI convenience collides with invisible scam surfaces, and humans become collateral damage. In testing, the exploit could trick AI models into purchasing items on fake websites, entering credentials on phishing pages, or downloading malicious payloads; Comet, for example, often added items to a shopping cart, filled out credit-card details, and clicked the buy button on a fake Walmart site, and AI browsers with access to email will read and act on prompts embedded in messages. AI browser agents from major AI firms also failed to reliably detect the signs of a phishing site. The findings underscore the need for stronger sanitation and guardrails, and for AI systems that can anticipate, detect, and neutralize such attacks.

The stakes are rising as agentic browsing goes mainstream: Comet is quickly penetrating the consumer market, Microsoft Edge is embedding agentic browsing features through a Copilot integration, and OpenAI is developing an agentic AI browser platform codenamed 'Aura'. Agentic AI browsers were nevertheless released with inadequate security safeguards against known and novel attacks, and Guardio advises against assigning them sensitive tasks until their security matures. Nearly all companies (96%) claim to want to expand their use of AI agents in the next year, but most are not prepared for the new risks AI agents pose in a business environment. A fundamental issue is how to discern actions taken through a browser by a user versus those taken by an agent, and agents need to be experts not just at getting things done but at sussing out and blocking potential security threats to workers and company data. Companies should move from "trust, but verify" to "doubt, and double verify", essentially hobbling automation until an AI agent has shown it can always complete a workflow properly, and should hold off on putting AI agents into any business process that requires reliability until AI-agent makers offer better visibility, control, and security; defective AI operations continue to be a major problem, and security is another layer on top of those issues. Companies that intend to push their use of AI into agent-based workflows should focus on a comprehensive strategy, including inventorying all AI services used by employees and creating an AI usage policy, and employees need to understand the basics of AI safety and what it means to give these bots information or privileges to act on their behalf.