Vulnerability  ·  2026-06-12

vLLM CVE-2026-5497 — CVSS 7.5 Unauthenticated Denial-of-Service via Unbounded Video Frame Processing in Widely Deployed AI Inference Server

Vulnerability · High impact · Global · CVE-2026-5497
vLLM versions 0.8.0 and later are vulnerable to an Out-of-Memory Denial of Service attack in the VideoMediaIO.load_base64() method. When processing video/jpeg data URLs, the method splits the base64 data string on commas to extract JPEG frames without enforcing any frame count limit. An attacker can craft a single API request containing thousands of comma-separated base64 JPEG frames, causing the server to decode all frames into memory until it crashes. The vulnerability is reachable via the unauthenticated OpenAI-compatible chat completions API endpoint.
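The pattern described above can be illustrated with a minimal Python sketch. This is not vLLM's `VideoMediaIO.load_base64()` code and not the fix from the patch commit; the function names and the `MAX_FRAMES` value are hypothetical, chosen only to contrast splitting without a limit against a variant that counts separators before decoding anything.

```python
import base64

# Hypothetical cap for illustration only; not a vLLM default or setting.
MAX_FRAMES = 32


def split_frames_unbounded(data: str) -> list[bytes]:
    """Decode every comma-separated chunk, however many there are.

    Memory use grows with the number of chunks the client chooses to send.
    """
    return [base64.b64decode(chunk) for chunk in data.split(",")]


def split_frames_bounded(data: str, max_frames: int = MAX_FRAMES) -> list[bytes]:
    """Reject oversized inputs before any chunk is decoded."""
    # Counting commas is cheap and allocates no per-frame buffers.
    frame_count = data.count(",") + 1
    if frame_count > max_frames:
        raise ValueError(f"too many frames: {frame_count} > {max_frames}")
    return [base64.b64decode(chunk) for chunk in data.split(",")]
```

The bounded variant performs the check before the decode loop, so a rejected request costs one pass over the string rather than one decoded buffer per frame.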
The attack requires only a single unauthenticated HTTP request to the vLLM /v1/chat/completions endpoint, carrying a crafted video/jpeg data URL that contains thousands of comma-separated base64-encoded JPEG frames. No authentication is required if the API is exposed without an auth layer, which is common in self-hosted deployments.
The issue affects vLLM 0.8.0 and all later versions through at least the disclosure date. vLLM is one of the most widely deployed open-source LLM inference servers, used to host models including Llama, Mistral, and Qwen in enterprise and cloud environments.
Apply the patch from commit 58ee614 in the vLLM repository. If immediate patching is not possible: place vLLM inference endpoints behind an authenticated API gateway, apply request-size limits and input validation before video data URLs reach the vLLM process, and enable OOM monitoring to detect attack attempts.
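As a rough illustration of the interim mitigation, the sketch below shows a pre-validation check that a gateway or proxy could run before forwarding a request body to vLLM. It is an assumption-laden example, not part of vLLM or of the patch: `check_request`, `count_jpeg_frames`, `MAX_BODY_BYTES`, and `MAX_FRAMES` are hypothetical names and values, and the code assumes the video arrives as a `video_url` content part whose `url` field holds the data URL. It only recognizes the exact `data:video/jpeg;base64,` prefix, so data URLs with extra parameters would need additional handling.

```python
import json

# Hypothetical limits for illustration; tune to the deployment.
MAX_BODY_BYTES = 8 * 1024 * 1024
MAX_FRAMES = 32
PREFIX = "data:video/jpeg;base64,"


def count_jpeg_frames(url: str) -> int:
    """Return the number of comma-separated frames in a video/jpeg data URL."""
    if url[: len(PREFIX)].lower() != PREFIX:
        return 0  # not a video/jpeg data URL; nothing to count here
    return url[len(PREFIX):].count(",") + 1


def check_request(body: bytes) -> tuple[bool, str]:
    """Decide whether a chat completions request body may be forwarded."""
    if len(body) > MAX_BODY_BYTES:
        return False, "body too large"
    try:
        request = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False, "invalid JSON"
    if not isinstance(request, dict):
        return False, "unexpected payload"
    messages = request.get("messages")
    if not isinstance(messages, list):
        messages = []
    for message in messages:
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-text content carries no video parts
        for part in content:
            if not isinstance(part, dict) or part.get("type") != "video_url":
                continue
            video = part.get("video_url")
            url = video.get("url") if isinstance(video, dict) else None
            if isinstance(url, str) and count_jpeg_frames(url) > MAX_FRAMES:
                return False, "too many video frames"
    return True, "ok"
```

A gateway would call `check_request(raw_body)` and return an HTTP 4xx response when the first element is `False`. The check counts separators without decoding any base64 data, so rejecting an oversized request stays cheap.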
Sources
NVD — CVE-2026-5497 Detail
GitHub Security Advisory — GHSA-wcwg-c5fc-9vrc (vLLM OOM DoS)
vLLM Patch Commit — 58ee614