Frontier AI dependency recommendations found to generate flawed upgrade and patch guidance
Summary
A study by Sonatype analyzing 258,000 AI-generated dependency upgrade recommendations across Maven Central, npm, PyPI, and NuGet from June to August 2025 revealed that frontier AI models—including GPT-5.2, Claude Sonnet 3.7/4.5, Claude Opus 4.6, and Gemini 2.5 Pro/3 Pro—frequently produce hallucinated or incorrect upgrade paths, security fixes, and version recommendations. Nearly 28% of recommendations from earlier models were hallucinations, while even improved frontier models introduced faulty advice, leaving critical and high-severity vulnerabilities unresolved in production environments. The issue stems from the models’ lack of real-time dependency, vulnerability, compatibility, and enterprise policy context, leading to wasted developer time, unresolved exposures, and increased technical debt. Notably, some recommendations introduced known vulnerabilities into AI tooling stacks themselves, exacerbating risk within the models’ own infrastructure.
Timeline
- 26.03.2026 16:44 · 1 article
  Frontier AI dependency recommendations found to generate flawed upgrade and patch guidance
Analysis of 258,000 AI-generated dependency upgrade recommendations across major package registries reveals consistent hallucinations and incorrect guidance from frontier models. Nearly 28% of recommendations were hallucinations in earlier models, with improved frontier models still producing faulty advice in ~6% of cases. Models frequently recommended "no change" for components, failing to remediate 800–900 critical/high-severity vulnerabilities. Grounding models with real-time dependency intelligence reduced critical and high risks by nearly 70%.
- AI-Powered Dependency Decisions Introduce, Ignore Security Bugs — www.darkreading.com — 26.03.2026 16:44
Information Snippets
- Sonatype analyzed 258,000 AI-generated dependency upgrade recommendations across Maven Central, npm, PyPI, and NuGet between June and August 2025, generated by seven frontier models from Anthropic, OpenAI, and Google.
  First reported: 26.03.2026 16:44 · 1 source, 1 article
- AI-Powered Dependency Decisions Introduce, Ignore Security Bugs — www.darkreading.com — 26.03.2026 16:44
- Nearly 28% of recommendations from earlier models were hallucinations, while improved frontier models still generated faulty or fabricated advice in approximately 1 out of every 16 recommendations.
  First reported: 26.03.2026 16:44 · 1 source, 1 article
- AI-Powered Dependency Decisions Introduce, Ignore Security Bugs — www.darkreading.com — 26.03.2026 16:44
- Frontier models recommended "no change" for roughly one-third of components, but this cautious approach failed to flag 800–900 critical and high-severity vulnerabilities that remained in production code.
  First reported: 26.03.2026 16:44 · 1 source, 1 article
- AI-Powered Dependency Decisions Introduce, Ignore Security Bugs — www.darkreading.com — 26.03.2026 16:44
- Some AI models actively introduced known vulnerabilities by recommending software versions containing bugs, including in libraries integral to AI stack operations (training, fine-tuning, orchestration, and serving).
  First reported: 26.03.2026 16:44 · 1 source, 1 article
- AI-Powered Dependency Decisions Introduce, Ignore Security Bugs — www.darkreading.com — 26.03.2026 16:44
- Grounding AI models with real-time dependency intelligence, vulnerability data, and compatibility context—such as Sonatype’s hybrid approach—reduced critical and high risks by nearly 70% compared to ungrounded frontier models.
  First reported: 26.03.2026 16:44 · 1 source, 1 article
- AI-Powered Dependency Decisions Introduce, Ignore Security Bugs — www.darkreading.com — 26.03.2026 16:44
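The grounding step described above can be sketched as a simple pre-check: before trusting an AI-suggested version, validate it against a live vulnerability source. A minimal sketch using the public OSV API (api.osv.dev); the function names and workflow are illustrative assumptions, not Sonatype's actual pipeline:

```python
# Sketch: vet an AI-suggested dependency version against live advisory data
# from the public OSV API before accepting the recommendation.
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

def known_advisories(ecosystem: str, package: str, version: str) -> list[str]:
    """Return OSV advisory IDs that affect the given package version."""
    payload = json.dumps({
        "package": {"ecosystem": ecosystem, "name": package},
        "version": version,
    }).encode()
    req = urllib.request.Request(
        OSV_QUERY_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return [v["id"] for v in json.load(resp).get("vulns", [])]

def accept_recommendation(advisories: list[str]) -> bool:
    """Accept an AI-suggested version only if no known advisories affect it."""
    return not advisories

# Usage (requires network): check a suggested PyPI version before upgrading.
#   ids = known_advisories("PyPI", "requests", "2.19.1")
#   accept_recommendation(ids)
```

Separating the network lookup from the accept/reject decision keeps the policy step testable offline and makes it easy to swap in a different vulnerability feed.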