Jailbreak vulnerability in K2 Think AI model disclosed
Summary
Hide â˛
Show âŧ
A researcher has publicly disclosed a jailbreak vulnerability in the K2 Think AI model, released by the UAE's Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and G42. The vulnerability, called Partial Prompt Leaking, allows attackers to manipulate the model's reasoning process to bypass its safeguards. K2 Think, released on September 9, 2025, is designed to be highly transparent, revealing its reasoning methods in plaintext. This transparency, intended to make the model auditable, also exposes a new type of vulnerability that can be exploited to jailbreak the model. The jailbreak was demonstrated by Adversa AI's Alex Polyakov, who showed that the model's transparency makes it easier to map and exploit than typical models. The vulnerability allows attackers to craft manipulative prompts that can bypass the model's security rules, potentially leading to unauthorized actions.
Timeline
-
11.09.2025 15:00 đ° 1 articles
Jailbreak vulnerability in K2 Think AI model disclosed
A researcher has publicly disclosed a jailbreak vulnerability in the K2 Think AI model, released by the UAE's Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and G42. The vulnerability, called Partial Prompt Leaking, allows attackers to manipulate the model's reasoning process to bypass its safeguards. The jailbreak was demonstrated by Adversa AI's Alex Polyakov, who showed that the model's transparency makes it easier to map and exploit than typical models. The vulnerability allows attackers to craft manipulative prompts that can bypass the model's security rules, potentially leading to unauthorized actions.
Show sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
Information Snippets
-
K2 Think was released on September 9, 2025, by MBZUAI and G42.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
-
K2 Think is designed to be highly transparent, revealing its reasoning methods in plaintext.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
-
The jailbreak vulnerability, called Partial Prompt Leaking, exploits the model's transparency to bypass its safeguards.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
-
Alex Polyakov demonstrated the jailbreak, showing that the model's transparency makes it easier to exploit.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
-
The vulnerability allows attackers to craft manipulative prompts that can bypass the model's security rules.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
-
K2 Think is built on 32 billion parameters, claiming reasoning, math, and coding performance comparable to larger models.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00
-
G42 is backed by Abu Dhabi's sovereign wealth and Microsoft, and run by UAE's national security chief.
First reported: 11.09.2025 15:00đ° 1 source, 1 articleShow sources
- 'K2 Think' AI Model Jailbroken Mere Hours After Release â www.darkreading.com â 11.09.2025 15:00