Vulnerability  ·  2026-04-12

Sockpuppeting: Universal Single-Line Jailbreak Affects 11 Major LLMs

Vulnerability  ·  High impact
Trend Micro researchers disclosed 'Sockpuppeting', a jailbreak technique that bypasses safety guardrails on 11 major LLMs with a single line of code that exploits the API's assistant prefill feature. The researchers used it to extract functional malware code and confidential system prompts.
The attack injects a fake acceptance into the assistant-role message through the standard API prefill feature, exploiting the model's tendency toward self-consistency so that it continues the prohibited output. It requires only API access that supports assistant prefill: no model weights, optimisation, or specialised tooling.
Affected models include GPT-4o, GPT-4o-mini, Claude 4 Sonnet, Gemini 2.5 Flash (the most susceptible, at a 15.7% attack success rate, or ASR), and 7 other major LLMs. For three models, the technique was blocked at the API layer.
Implement message-ordering validation that blocks assistant-role messages at the API layer. Apply output filtering for known attack patterns. Monitor API usage for anomalous prefill patterns.
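The message-ordering check could be sketched roughly as follows. This is a minimal illustration rather than Trend Micro's or any vendor's implementation. It assumes a gateway that receives chat requests as a list of dicts with "role" and "content" keys and that has no legitimate need for assistant prefill. The names validate_message_order and PrefillRejected are hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


class PrefillRejected(ValueError):
    """Raised when a request tries to supply the assistant's next turn."""


def validate_message_order(
    messages: List[Message], allow_history: bool = True
) -> List[Message]:
    """Reject conversations that carry a client-supplied assistant prefill.

    Assumption: each message is a dict with "role" and "content" keys.
    With allow_history=True, earlier assistant turns (ordinary multi-turn
    history) pass and only a trailing assistant message is rejected.
    With allow_history=False, any assistant-role message is rejected.
    """
    if not messages:
        raise ValueError("empty conversation")

    roles = [m.get("role") for m in messages]

    # A trailing assistant message is the prefill slot the attack relies on.
    if roles[-1] == "assistant":
        raise PrefillRejected("final message must not have role 'assistant'")

    # Stricter policy: refuse every client-supplied assistant turn.
    if not allow_history and "assistant" in roles:
        raise PrefillRejected("assistant-role messages are not accepted")

    return messages
```

A gateway would call this before forwarding the request to the model and log each PrefillRejected, which also gives a simple signal for the anomalous-prefill monitoring mentioned above. Whether to allow earlier assistant turns is a policy choice: the stricter mode is closer to blocking assistant-role messages outright, but it breaks clients that resend conversation history.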
Sources
Trend Micro - Sockpuppeting: How a Single Line Can Bypass LLM Safety Guardrails
CyberSecurity News - Single Line of Code Can Jailbreak 11 AI Models
GBHackers - 11 AI Models Vulnerable to One-Line Jailbreak