Definition
An attack where a malicious instruction is hidden inside text that an AI reads — such as a document, email, or web page — tricking the AI into ignoring its original instructions and doing what the attacker wants instead. Think of it as the AI equivalent of forging a memo from the CEO and slipping it into an employee's inbox. The AI cannot reliably tell the difference between legitimate instructions from its operators and forged ones from an attacker.
Why it matters
Any AI that reads or summarises external content — customer emails, web pages, uploaded documents — is a potential target. A successful attack can cause the AI to leak confidential data, take unauthorised actions, or spread misinformation, all without the user or operator realising it.