Definition
A structured process, currently being negotiated between the White House and Anthropic, that defines how vulnerabilities in AI models (such as jailbreaks that bypass safety controls) should be discovered, scored, reported, and remediated, much as vulnerability disclosure works for traditional software.
Why it matters
Without agreed rules, AI labs and governments have no common procedure for handling dangerous model flaws, which leads to ad hoc shutdowns, export bans, and conflicting responses. A formal framework is the prerequisite for predictable, coordinated AI security governance.