What happened
On 20 May 2026, Singapore's Cyber Security Agency (CSA), GovTech Singapore, IMDA, and Google published findings from a global-first AI Agents Sandbox conducted over approximately four months from August 2025. The sandbox tested computer-use agents across three real public-sector use cases: automated quality assurance of government digital services, AI safety testing of deployed chatbots, and social assistance application guidance. Across all use cases, the most prominent cybersecurity risk identified was indirect prompt injection — specifically, the risk that an agent could be deceived into performing unintended actions including remote code execution (RCE) through malicious content it encounters in its environment. The report also identified human oversight calibration, data protection during agent-data interaction, and third-party agent customisation as key risk themes. The report recommends risk-based human oversight (pre-approval for high-risk, post-hoc review for reversible low-risk), distributed safeguards across platform, organisation, and user layers, and controlled incremental deployment.
Why it matters
This is the first government-sponsored empirical study confirming that indirect prompt injection → RCE is a real-world production risk in agentic systems, not merely a theoretical concern. The finding carries strong practical weight: these were not red-team exercises against hardened systems but real public-sector workflows running computer-use agents. The multi-agency Singapore imprimatur (CSA + GovTech + IMDA) signals that prompt-injection defences will be an expected baseline in Singapore government AI procurement and, by extension, vendor certifications such as the AI Verify framework.
Action needed
Treat indirect prompt injection as a mandatory test case for any agentic deployment — especially computer-use agents that browse the web, read emails, or process documents from external sources. Add RCE-path prompt injection tests to pre-deployment security review checklists. Evaluate whether your agent orchestration layers separate instructional content from retrieved/external content, and whether tool-call outputs are treated as untrusted input.