Indirect Prompt Injection Is Architectural, Not Deployment-Specific — Brave Demonstrates Attacks Against Cloud and Local AI Tools

Technical description

Brave Security Research published empirical demonstrations on June 8, 2026 showing that indirect prompt injection — where malicious instructions embedded in third-party content hijack an AI agent's task — works equally against cloud-hosted AI (Mozilla Tabstack) and fully on-device AI (Cotypist for macOS). In the Tabstack case, invisible text on a webpage caused the agent to abandon a summarisation task, navigate to an attacker-controlled form, populate it with the user's conversation history, and submit it. In the Cotypist case, instructions in a local document influenced autocomplete suggestions and surfaced credentials. Mozilla patched Tabstack after responsible disclosure; Cotypist requires user acceptance of suggestions but is still affected by instruction manipulation. The root cause is architectural: both systems compose trusted developer prompts with untrusted external data in a single flat context window, with no reliable boundary enforcement.

Attack vector

Attacker embeds malicious instructions in any content the AI tool is likely to ingest: webpages (hidden via white-on-white text or zero-width characters), documents, email content, tool results, or retrieved context. No direct access to the AI system is required — the payload arrives through the victim's normal workflow.

Affected systems

Any AI agent or AI-assisted tool that ingests untrusted external content (webpages, documents, emails, search results) in the same context window as system and user instructions. Demonstrated against Mozilla Tabstack (cloud) and Cotypist (on-device macOS). Previously demonstrated against Opera Neon and Perplexity Comet by the same team.

Mitigation

Architectural mitigations: strict context-window segmentation that separates instruction channels from data channels; provenance tagging; requiring explicit user confirmation before any external write (form submission, API call, file write); and treating all retrieved content as data, never as instructions. Runtime: apply prompt-injection filters to content ingested from external sources; log and inspect agent decision traces for unexpected instruction sources.

Indirect Prompt Injection Is Architectural, Not Deployment-Specific — Brave Demonstrates Attacks Against Cloud and Local AI Tools

Technical description

Attack vector

Affected systems

Mitigation

Sources