AI Governance

Safety Evaluation

Systematic testing of AI models for harmful outputs, dangerous capabilities, and vulnerability to misuse. Safety evaluations assess risks before deployment.

Why It Matters

Safety evaluation is becoming mandatory for frontier models. The EU AI Act and voluntary commitments require comprehensive safety testing before release.

Example

Testing whether a model can be manipulated into providing instructions for dangerous activities, generating harmful content, or leaking private training data.

Think of it like...

Like crash testing cars before they go on sale — you need to know how the system behaves in worst-case scenarios before putting it in users' hands.

Related Terms