Defense  ·  Glossary

AI red-teaming

The practice of having a dedicated team — internal or external — try to make an AI system behave badly: generating harmful content, leaking private data, being manipulated, or failing in safety-critical ways. It mirrors the cybersecurity practice of ethical hacking but is specifically adapted for AI systems.
Standard software testing does not catch AI-specific failure modes. Red-teaming before deployment is the primary way organisations discover how their AI can be abused, and regulators are increasingly expecting evidence that it has been done.
References
NIST AI Risk Management Framework (AI RMF 1.0)
Track this in the live feed See how this plays out in real AI security and governance developments.
Open the feed →