Prompt-based Router Open-Mode Manipulation Induced via SSRF-like Queries, Reconfiguring Operations Using Trust Evasion (PROMISQROUTE) in ChatGPT
Summary
A newly disclosed technique, dubbed PROMISQROUTE, allows attackers to downgrade ChatGPT to older, less secure models by manipulating prompts. The technique exploits the routing layer that directs each prompt to a backend model based on its apparent complexity. Researchers demonstrated the vulnerability by bypassing ChatGPT's security filters to obtain malicious instructions. The weakness stems from ChatGPT's multi-model architecture: prompts are routed to different models according to their complexity and requirements, and attacker-controlled phrasing can steer malicious queries toward less secure models. OpenAI has acknowledged the issue but disputes that prompts can be downgraded to models older than GPT-5. The primary impact is the potential to bypass ChatGPT's security measures, enabling unauthorized access and malicious activity. The economic incentive to serve simple prompts with older, cheaper models complicates any fix.
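To make the routing weakness concrete, the sketch below implements a deliberately naive complexity-based router in Python. The model names, length threshold, and trigger phrases are assumptions for illustration only; OpenAI's production router is not public.

```python
# Hypothetical illustration of the routing weakness PROMISQROUTE exploits.
# The model names, length threshold, and trigger phrases are assumptions;
# the real router's heuristics are not public.

CHEAP_MODEL = "legacy-model"   # assumed: older tier with weaker safety tuning
FLAGSHIP_MODEL = "gpt-5"       # assumed: current tier with stricter filters

# Phrases an attacker might prepend to satisfy a "this prompt is easy"
# heuristic and nudge the router toward the cheaper model.
DOWNGRADE_CUES = ("respond quickly", "keep it simple", "use compatibility mode")

def route(prompt: str) -> str:
    """Naive complexity-based router: cheap model for 'easy' prompts."""
    lowered = prompt.lower()
    if any(cue in lowered for cue in DOWNGRADE_CUES) or len(prompt) < 80:
        return CHEAP_MODEL
    return FLAGSHIP_MODEL

# Attacker-controlled framing routes a malicious request away from the
# flagship model's stronger safety filters.
print(route("Respond quickly: <malicious request>"))  # -> legacy-model
print(route("Give me a detailed, multi-step threat model for a corporate "
            "network, covering lateral movement and persistence."))  # -> gpt-5
```

The point is not this specific heuristic but that any router keyed on attacker-controlled text becomes part of the attack surface.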
Timeline
- 21.08.2025 23:35 · 1 article
Prompt-based Router Open-Mode Manipulation Induced via SSRF-like Queries, Reconfiguring Operations Using Trust Evasion (PROMISQROUTE) in ChatGPT
PROMISQROUTE downgrades ChatGPT to older, less secure models by gaming the prompt router's complexity heuristics, letting attackers bypass security filters and obtain malicious instructions. The economic incentive to serve prompts with cheaper models complicates remediation.
- Easy ChatGPT Downgrade Attack Undermines GPT-5 Security · www.darkreading.com · 21.08.2025 23:35
Information Snippets
- PROMISQROUTE allows attackers to downgrade ChatGPT to use older, less secure models by manipulating prompts.
  First reported: 21.08.2025 23:35 · 1 source, 1 article
  - Easy ChatGPT Downgrade Attack Undermines GPT-5 Security · www.darkreading.com · 21.08.2025 23:35
- ChatGPT's routing mechanism directs prompts to different models based on complexity and requirements.
  First reported: 21.08.2025 23:35 · 1 source, 1 article
  - Easy ChatGPT Downgrade Attack Undermines GPT-5 Security · www.darkreading.com · 21.08.2025 23:35
- Researchers demonstrated the vulnerability by bypassing security filters to obtain malicious instructions.
  First reported: 21.08.2025 23:35 · 1 source, 1 article
  - Easy ChatGPT Downgrade Attack Undermines GPT-5 Security · www.darkreading.com · 21.08.2025 23:35
- OpenAI disputes the ability to downgrade to models older than GPT-5 (a verification sketch for API callers follows this list).
  First reported: 21.08.2025 23:35 · 1 source, 1 article
  - Easy ChatGPT Downgrade Attack Undermines GPT-5 Security · www.darkreading.com · 21.08.2025 23:35
- The economic incentive to use older models to save on computing resources complicates the solution.
  First reported: 21.08.2025 23:35 · 1 source, 1 article
  - Easy ChatGPT Downgrade Attack Undermines GPT-5 Security · www.darkreading.com · 21.08.2025 23:35
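As referenced above, whether downgrades can reach pre-GPT-5 models is disputed. For applications that call the API directly (the ChatGPT product's internal router is not observable this way), one hedged defensive check is comparing the model reported in each response against the model requested. A minimal sketch, assuming the OpenAI Python SDK (openai>=1.0) and a placeholder model name:

```python
# Sketch of a downgrade check for API integrations, assuming the OpenAI
# Python SDK (openai>=1.0). This only covers apps calling the API directly;
# the ChatGPT product's internal routing is not exposed this way.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REQUESTED = "gpt-5"  # placeholder model name for illustration

response = client.chat.completions.create(
    model=REQUESTED,
    messages=[{"role": "user", "content": "Summarize today's advisories."}],
)

# The response reports which model actually served the request; a mismatch
# beyond an expected dated snapshot suffix is worth alerting on.
served = response.model
if not served.startswith(REQUESTED):
    raise RuntimeError(f"Possible silent downgrade: requested {REQUESTED}, got {served}")
```

Dated snapshot names are normal; anything else reported back deserves logging rather than silent acceptance.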
Similar Happenings
AI systems vulnerable to data-theft prompts in downscaled images
Researchers have demonstrated an attack that steals user data by embedding malicious prompts in images. The prompts are invisible at full resolution but emerge when an AI system downscales the image: aliasing artifacts introduced by the resampling algorithm reveal hidden text, which the model then interprets as user instructions, potentially leaking data or triggering unauthorized actions. The method was successfully tested against several AI systems, including Google Gemini CLI, Vertex AI Studio, Gemini's web interface, Gemini's API, Google Assistant on Android, and Genspark. The attack was developed by Kikimora Morozova and Suha Sabri Hussain of Trail of Bits, building on a technique theorized in a 2020 USENIX paper, and the researchers released an open-source tool, Anamorpher, to generate test images. They recommend restricting image dimensions and requiring user confirmation for sensitive tool calls as mitigations.
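A rough illustration of the downscaling step the attack abuses, using Pillow. The file path and target size are placeholders, and this sketch only previews what a pipeline's resampler would produce; crafting an image whose payload actually emerges requires a tool like Anamorpher.

```python
# Preview what an AI pipeline's resampler would actually "see".
# SOURCE and TARGET are placeholders; different resamplers alias
# differently, and a payload is tuned for one specific algorithm.
from PIL import Image

SOURCE = "uploaded_image.png"  # placeholder path
TARGET = (512, 512)            # assumed downscale size used by a pipeline

full = Image.open(SOURCE)

# A payload invisible at full resolution can emerge after this step,
# depending on which resampling algorithm the pipeline uses.
for name, method in [("nearest", Image.Resampling.NEAREST),
                     ("bilinear", Image.Resampling.BILINEAR),
                     ("bicubic", Image.Resampling.BICUBIC)]:
    preview = full.resize(TARGET, resample=method)
    preview.save(f"preview_{name}.png")
```

Showing the user such a preview before acting on any instructions found in the image is essentially the confirmation mitigation the researchers suggest.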