Technical description
Researchers from the Institute of Information Engineering (Chinese Academy of Sciences), University of Chinese Academy of Sciences, and Beijing Chaitin Technology formalized 'cross-session stored prompt injection' (SPI) as a new system-level attack class distinct from single-session prompt injection. Drawing an explicit analogy to stored XSS in web systems, SPI exploits the fact that modern agentic systems maintain persistent state — memories, filesystems, RAG stores, tool/MCP metadata, and AGENTS.md system prompts — that persists across sessions. An attacker who writes adversarial content into any long-lived agent artifact (via an ordinary interaction, document upload, or web content retrieval) causes that malicious instruction to be reincorporated into downstream agent execution contexts across future sessions, users, and tasks — long after the attacker's interaction has ended. The paper provides a formalized taxonomy, benchmark, and sandbox toolkit with quantitative attack success measurements across models, attack goals, and persistence channels.
Attack vector
Attacker writes adversarial content into persistent agent state through any available input channel (user query, document, web page, tool output). The content persists in agent memory, RAG databases, filesystem artifacts, or tool metadata. In future sessions — potentially involving different users or tasks — the agent's context construction incorporates the stored instruction, triggering malicious behaviour without any further attacker interaction. Injection and exploitation are temporally decoupled, making detection far harder than real-time injection.
Affected systems
Any agentic system with persistent cross-session state: agents using long-term memory (MemGPT-style), RAG-backed knowledge bases, shared filesystems, MCP tool metadata, or AGENTS.md-style system prompts. Multi-user agent deployments are highest risk since a single stored injection can affect all subsequent users. Tested across multiple production LLMs.
Mitigation
Architectural controls suggested: (1) provenance tagging for all content written to persistent agent state, distinguishing authoritative system prompts from user/external input; (2) access controls and integrity verification on long-term memory stores and RAG knowledge bases; (3) sanitization boundaries between what gets written to persistent state vs. what gets elevated to privileged context slots; (4) routine adversarial testing of agent memory and persistent state stores. The benchmark and sandbox toolkit released alongside the paper can be used for continuous evaluation.