Definition
An attack where malicious instructions or payloads are hidden inside the descriptions, outputs, or configuration of a tool that an AI agent is designed to trust and use. When the agent calls that tool, it unknowingly executes the attacker's commands—redirecting traffic, leaking secrets, or taking destructive actions.
Why it matters
As AI agents are connected to more business tools through the Model Context Protocol and similar frameworks, tool poisoning becomes a scalable way for attackers to hijack entire automated workflows without ever touching the AI model itself. A single poisoned tool can cascade across every agent that trusts it.