
K2 Think Partial Prompt Leaking security flaw

Vulnerability
Happening score: 20
1 unique source, 1 article

Summary

A weakness in K2 Think called Partial Prompt Leaking enabled a successful jailbreak, showing that the model's guardrails could be bypassed to elicit restricted instructions. The flaw mattered because the model exposed enough of its reasoning for an attacker to map the controls and iterate past them. The restricted output reportedly included responses to malware-related prompts.

Timeline

  1. 11.09.2025 15:00 2 articles · 8mo ago

    K2 Think jailbreak publicized via Partial Prompt Leaking

    Initial Disclosure

    Adversa AI's Alex Polyakov disclosed a jailbreak method against K2 Think that exploited Partial Prompt Leaking: the model exposed plaintext reasoning and refusal logic, revealing which rules blocked malicious prompts. After a few prompt iterations, he bypassed the layered safeguards and elicited restricted instructions, including guidance for creating malware.
