

K2 Think AI Model Jailbroken via Partial Prompt Leaking

πŸ“° 1 unique sources, 1 articles

Summary


K2 Think, an advanced reasoning AI model released by the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and G42, was jailbroken within hours of its public release. The model exposes its reasoning process in plaintext as a transparency feature, and that visibility was turned against it: a technique known as Partial Prompt Leaking lets attackers read the safeguard rationale surfaced in the reasoning trace and use it to steer the model into performing restricted actions. Developed in the UAE, K2 Think is designed to handle complex, multi-step problems with a parameter efficiency that rivals much larger models. The jailbreak was demonstrated by Adversa AI's Alex Polyakov, who highlighted the model's susceptibility to prompt engineering attacks; the transparency intended to make the model auditable proved exploitable by malicious actors.
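The article does not publish Polyakov's actual prompts, but the mechanics of a partial-prompt-leaking attack can be sketched as a feedback loop: each refusal exposes a fragment of the model's reasoning, and the attacker folds that fragment back into the next attempt. Below is a minimal Python sketch under stated assumptions; `query_model` is a hypothetical stand-in for the K2 Think API (no real endpoint is assumed), and the "Refusing because" marker is an invented example of leaked rationale.

```python
# Minimal sketch of a partial-prompt-leaking loop. Hypothetical throughout:
# query_model() stands in for a real model API, and the "Refusing because"
# marker is an invented example of a rule fragment leaked in a reasoning trace.

def query_model(prompt: str) -> dict:
    """Hypothetical API stub: returns the model's answer plus its visible reasoning."""
    return {
        "answer": "[refused]",
        "reasoning": "Step 1: parse request.\nRefusing because the request matches a restricted topic.",
    }

def leaked_rule_fragments(reasoning: str) -> list[str]:
    """Collect lines where the model explains *why* it refused -- the leak."""
    return [line for line in reasoning.splitlines() if line.startswith("Refusing because")]

def probe(initial_prompt: str, max_rounds: int = 5) -> str | None:
    """Iteratively rephrase the prompt using whatever rationale the model exposes."""
    prompt = initial_prompt
    for _ in range(max_rounds):
        response = query_model(prompt)
        if response["answer"] != "[refused]":
            return response["answer"]  # safeguard no longer triggered
        fragments = leaked_rule_fragments(response["reasoning"])
        if not fragments:
            return None  # nothing leaked, nothing to iterate on
        # Fold the leaked rationale back into the next attempt.
        prompt = f"{initial_prompt}\n(Reworded to avoid: {fragments[0]})"
    return None
```

The point of the sketch is the feedback channel itself: a model that explains its refusals in plaintext is also explaining how to rephrase the next attempt.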

Timeline

  1. 11.09.2025 15:00 📰 1 article

    K2 Think AI Model Jailbroken via Partial Prompt Leaking

    K2 Think, an advanced reasoning AI model from MBZUAI and G42, was jailbroken within hours of its public release. The model's transparency features, which expose its reasoning process to users, were exploited to bypass its safeguards via a technique known as Partial Prompt Leaking, which lets attackers steer the model into performing restricted actions. Alex Polyakov of Adversa AI demonstrated the jailbreak, showing that the transparency intended to make K2 Think auditable is itself a significant security risk.


Information Snippets

  • K2 Think was released on September 9, 2025, by MBZUAI and G42, a UAE-based company with significant geopolitical influence.

    First reported: 11.09.2025 15:00
    πŸ“° 1 source, 1 article
  • The model is designed to be highly transparent, allowing users to see its reasoning process in plaintext.

    First reported: 11.09.2025 15:00
    πŸ“° 1 source, 1 article
  • Partial Prompt Leaking was used to jailbreak K2 Think by exploiting its transparency features.

    First reported: 11.09.2025 15:00
    πŸ“° 1 source, 1 article
  • The jailbreak was demonstrated by Alex Polyakov of Adversa AI, who showed how the model's reasoning process could be manipulated.

    First reported: 11.09.2025 15:00
    πŸ“° 1 source, 1 article
  • K2 Think's transparency was intended to make it auditable but created a new type of vulnerability.

    First reported: 11.09.2025 15:00
    πŸ“° 1 source, 1 article
  • The model's reasoning process can be manipulated to bypass its safeguards, potentially allowing it to perform restricted actions; an illustrative mitigation sketch follows this list.

    First reported: 11.09.2025 15:00
    πŸ“° 1 source, 1 article
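None of the snippets above describe how MBZUAI or G42 responded, so the following is purely an illustrative mitigation, not a documented fix: redact the safety rationale from a reasoning trace before it is shown to the user, while keeping the full trace available for internal audit. The regex patterns below are invented placeholders, not markers used by any real model.

```python
import re

# Illustrative mitigation only -- not a documented MBZUAI/G42 change. The idea:
# keep the reasoning trace auditable internally, but withhold the lines that
# reveal which safeguard fired before the trace reaches the user.

SAFETY_RATIONALE_PATTERNS = [
    re.compile(r"refusing because", re.IGNORECASE),     # invented marker
    re.compile(r"policy (rule|check)", re.IGNORECASE),  # invented marker
]

def redact_reasoning(trace: str) -> str:
    """Replace safety-rationale lines with a placeholder; pass the rest through."""
    out = []
    for line in trace.splitlines():
        if any(p.search(line) for p in SAFETY_RATIONALE_PATTERNS):
            out.append("[safety rationale withheld]")
        else:
            out.append(line)
    return "\n".join(out)

if __name__ == "__main__":
    sample = (
        "Step 1: parse request.\n"
        "Refusing because the request matches a restricted topic.\n"
        "Step 2: return refusal."
    )
    print(redact_reasoning(sample))
```

The trade-off is exactly the one this event highlights: every line withheld for safety is a line lost to auditability.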