New Microsoft tool lets devs spin up AI behavior tests using text descriptions

TechCrunch AI · Jun 2, 2026, 7:02 PM

Key takeaways

AI researchers and labs have advanced by leaps and bounds in evaluating AI models for everything from safety and compliance to sycophancy and alignment.
In a bid to make that testing process simpler, Microsoft on Tuesday took the wraps off ASSERT, short for Adaptive Spec-driven Scoring for Evaluation and Regression Testing.
ASSERT can also record the paths the AI system takes, including intermediate actions and tool calls, so developers can inspect where failures happen.

AI researchers and labs have advanced by leaps and bounds in evaluating AI models for everything from safety and compliance to sycophancy and alignment. But companies and developers now appear to face a narrower need: making sure that their AI system behaves as intended for their own product or service.

In a bid to make that testing process simpler, Microsoft on Tuesday took the wraps off ASSERT, short for Adaptive Spec-driven Scoring for Evaluation and Regression Testing.

The open-source framework, Microsoft says, makes evaluating application-specific AI behavior easy by using AI to turn high-level, natural-language descriptions of goals, policies, or intended behaviors into thorough, scored tests that can be investigated.
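
To make the idea concrete, here is a minimal sketch of what a scored, description-driven behavior test can look like. It is not ASSERT's API, which this preview does not describe: every name below (`BehaviorTest`, `Trace`, `run_suite`, `refund_agent`) is hypothetical, the checks are written by hand rather than generated by AI, and the system under test is a hard-coded stand-in.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    kind: str    # "tool_call" or "message"
    detail: str  # tool name or message text


@dataclass
class Trace:
    """Recorded path of one run: intermediate steps plus the final output."""
    steps: List[Step] = field(default_factory=list)
    output: str = ""


@dataclass
class BehaviorTest:
    description: str                # plain-language statement of intended behavior
    prompt: str                     # input sent to the system under test
    check: Callable[[Trace], bool]  # hand-written here; a real tool might generate it


def refund_agent(prompt: str) -> Trace:
    """Hypothetical stand-in for an AI system; always follows the same path."""
    trace = Trace()
    trace.steps.append(Step("tool_call", "lookup_order"))
    trace.output = "Your refund request was forwarded to a human agent."
    trace.steps.append(Step("message", trace.output))
    return trace


def run_suite(
    system: Callable[[str], Trace], tests: List[BehaviorTest]
) -> Tuple[float, List[Tuple[str, bool, Trace]]]:
    """Run every test, keep its trace for inspection, and return the pass rate."""
    results = []
    for test in tests:
        trace = system(test.prompt)
        results.append((test.description, test.check(trace), trace))
    score = sum(1 for _, passed, _ in results if passed) / len(results)
    return score, results


PROMPT = "I want a refund for order 123"
TESTS = [
    BehaviorTest(
        "Looks up the order before answering",
        PROMPT,
        lambda tr: any(s.kind == "tool_call" and s.detail == "lookup_order" for s in tr.steps),
    ),
    BehaviorTest(
        "Never issues a refund on its own",
        PROMPT,
        lambda tr: all(s.detail != "issue_refund" for s in tr.steps),
    ),
    BehaviorTest(
        "Mentions the 30-day refund window",
        PROMPT,
        lambda tr: "30 days" in tr.output,
    ),
]

score, results = run_suite(refund_agent, TESTS)
for description, passed, trace in results:
    print("PASS" if passed else "FAIL", "-", description)
```

Because each result keeps its trace, a failing test can be followed back through the recorded tool calls and messages to see where the behavior diverged from the description.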

Article preview — originally published by TechCrunch AI. Full story at the source.

Aggregated and edited by the Scoop newsroom. We surface news from TechCrunch AI alongside other reporting so you can compare coverage in one place.
