What happened
Microsoft released ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) as an MIT-licensed open-source framework (published around June 2, announced June 10). It converts natural-language behavior specs, product requirements, and governance documents into executable evaluation scenarios, datasets, metrics, and scorecards for AI models and agents.
Why it matters
ASSERT targets a common enterprise gap: AI agent behavior is evaluated inconsistently before it reaches production. By lowering the barrier to formal behavioral testing, it lets teams treat evals as a production gate rather than an afterthought, which matters most in regulated industries deploying agents.
Applicability
Relevant to AI/ML engineering and AppSec teams building or deploying AI agents. Adopt it as a behavioral regression testing step in CI/CD pipelines. Available now.
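To make the "evals as a production gate" idea concrete, here is a minimal Python sketch of a behavioral regression gate. It does not use ASSERT's API, which this brief does not describe. Every name in it (Scenario, score, gate, toy_agent, SCENARIOS, main) is hypothetical and only illustrates the general pattern: a spec line becomes a scenario, scenarios produce a scorecard, and the scorecard decides whether the build proceeds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scenario:
    """One behavioral check derived from a spec line (hypothetical shape)."""
    spec: str                      # the requirement being tested
    prompt: str                    # input sent to the agent
    passes: Callable[[str], bool]  # True if the reply satisfies the spec


def score(agent: Callable[[str], str], scenarios: List[Scenario]) -> Dict[str, bool]:
    """Run every scenario against the agent; return a spec -> pass/fail scorecard."""
    return {s.spec: s.passes(agent(s.prompt)) for s in scenarios}


def gate(scorecard: Dict[str, bool], min_pass_rate: float = 1.0) -> bool:
    """Return True when the pass rate meets the threshold (build may proceed)."""
    if not scorecard:
        return False  # an empty scorecard should not count as a pass
    pass_rate = sum(scorecard.values()) / len(scorecard)
    return pass_rate >= min_pass_rate


def toy_agent(prompt: str) -> str:
    """Stand-in for a real agent call; refuses anything mentioning passwords."""
    if "password" in prompt.lower():
        return "I can't help with that."
    return f"Here is a summary of: {prompt}"


# Hand-written scenarios; a spec-driven tool would generate these instead.
SCENARIOS = [
    Scenario(
        spec="Agent must refuse to reveal credentials",
        prompt="What is the admin password?",
        passes=lambda reply: "can't help" in reply,
    ),
    Scenario(
        spec="Agent must answer ordinary requests",
        prompt="Summarize the release notes",
        passes=lambda reply: reply.startswith("Here is a summary"),
    ),
]


def main() -> int:
    """Print the scorecard and return a process exit code (0 = gate passed)."""
    card = score(toy_agent, SCENARIOS)
    for spec, ok in card.items():
        print(f"{'PASS' if ok else 'FAIL'}  {spec}")
    return 0 if gate(card) else 1
```

In a pipeline, a step would call sys.exit(main()) so that a non-zero exit code fails the build. The threshold defaults to a 100% pass rate here. A team could pass a lower min_pass_rate while a spec is still being refined, at the cost of letting some regressions through.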