
AI inference frameworks unsafe ZeroMQ/pickle deserialization (multiple vulnerabilities)

Vulnerability
Happening score: 17
1 unique source, 1 article

Summary


Researchers disclosed critical remote code execution flaws in several AI inference frameworks. All of them stem from unsafe deserialization of Python pickle data received over ZeroMQ, which lets an attacker run arbitrary code on inference nodes. Affected projects include Meta's Llama framework, NVIDIA TensorRT-LLM, vLLM, SGLang, Modular Max Server, and Microsoft's Sarathi-Serve. Fixes are available for some of them, but Sarathi-Serve remains unpatched and SGLang has only incomplete fixes. The issue is especially risky because it can be triggered over unauthenticated ZMQ TCP sockets.

Timeline

  1. 14.11.2025 17:20 · 2 articles

    Oligo discloses ShadowMQ RCE flaws in AI inference frameworks

    Initial Disclosure

    Oligo Security reported critical remote code execution vulnerabilities in AI inference engines: Meta's Llama framework, NVIDIA TensorRT-LLM, Microsoft's Sarathi-Serve, vLLM, SGLang, and Modular Max Server. The root cause was the use of ZeroMQ's recv_pyobj(), which deserializes incoming data with Python pickle, on unauthenticated ZMQ TCP sockets. Oligo dubbed this pattern ShadowMQ because it spread between projects through code reuse. An attacker who can reach such a socket can send malicious data for deserialization and execute arbitrary code on the inference node. Remediation status varied: Meta's Llama framework had been patched in October 2024, NVIDIA TensorRT-LLM was fixed in version 0.18.2, Modular Max Server was fixed, vLLM switched to the V1 engine by default, SGLang had incomplete fixes, and Sarathi-Serve remained unpatched.
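
The root cause described above can be illustrated with a minimal, standard-library-only sketch. It is not taken from any of the affected projects. The class `Marker` and the helpers `safe_loads` and `decode_message` are hypothetical names introduced here for illustration, and the payload is deliberately harmless (it only calls `len`). The sketch shows that unpickling invokes a callable chosen by the sender, and then shows two common mitigations: an unpickler that refuses every global lookup, and a JSON-based message format. In pyzmq terms, the second option roughly corresponds to reading frames with `recv()` or `recv_json()` instead of `recv_pyobj()`.

```python
import io
import json
import pickle


class Marker:
    """Harmless stand-in for a sender-controlled object (hypothetical)."""

    def __reduce__(self):
        # On load, pickle calls len("abc"). A real attacker would pick a
        # dangerous callable here instead of len.
        return (len, ("abc",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any module-level name."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(data: bytes):
    """Unpickle plain containers and scalars only; reject anything callable."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


def decode_message(frame: bytes) -> dict:
    """Parse a received frame as JSON, a data-only format."""
    msg = json.loads(frame.decode("utf-8"))
    if not isinstance(msg, dict):
        raise ValueError("expected a JSON object")
    return msg
```

Changing the message format addresses only the deserialization half of the problem. The sockets described in the report were also unauthenticated and reachable over TCP, so restricting network exposure and enabling transport authentication are separate steps.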
