E — Evaluate

Execute gate reviews, run audits, and benchmark progress

Definition

Evaluate is the formal validation stage of COMPEL. It verifies that every AI system meets both its business value promise and its responsible AI obligations before production deployment — and on an ongoing basis thereafter. Evaluation in COMPEL is not a final checkbox; it is a structured, repeatable process that operates at multiple timescales.

Purpose

The purpose of Evaluate is to determine what is working and what needs adjustment. Gate E reviews occur before production deployment of each new AI system. Periodic evaluation cycles assess whether deployed systems continue to meet governance standards as models drift, data distributions shift, and regulatory requirements evolve. This is where COMPEL's alignment with ISO 42001 internal audit requirements, NIST AI RMF Measure and Manage functions, and EU AI Act conformity assessment obligations is most directly operationalized.
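One way a periodic evaluation cycle might detect that a data distribution has shifted is sketched below. It uses the Population Stability Index (PSI), a common drift statistic that COMPEL itself does not prescribe. The function names, the bin count, and the 0.2 threshold are illustrative assumptions, not values taken from the framework.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a baseline sample and a current sample of one feature.

    Bins are equal-width over the baseline range. Current values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant baseline

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def needs_reevaluation(psi: float, threshold: float = 0.2) -> bool:
    """Hypothetical trigger: flag the system once PSI reaches the threshold."""
    return psi >= threshold
```

In practice the threshold and the set of monitored features would be set per system during governance planning; the sketch only shows the shape of such a trigger.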

Key Activities

Gate E review execution — formal validation of audit evidence packs against Gate E criteria
Bias and fairness testing — structured assessment of model outputs against protected characteristics and equity criteria
Business value validation — measuring actual outcomes against success criteria defined in the Model stage
Stakeholder sign-off process — obtaining formal approval from business owners, risk owners, and oversight bodies
Regulatory conformity assessment — checking each system against applicable obligations by jurisdiction and risk class
Governance scorecard assessment — scoring organizational AI governance maturity on the COMPEL scale
Benchmarking against transformation success criteria
Internal audit execution — structured review of governance processes, controls, and documentation against ISO 42001 requirements
Re-attestation triggers and cycles — managing periodic re-certification of AI system compliance as conditions change
Risk acceptance reviews — formal evaluation and documentation of residual risks accepted by designated risk owners
Model retirement evaluation — assessing whether deployed AI systems should be decommissioned based on performance, relevance, or risk criteria
Audit preparation and support — organizing evidence and documentation for internal and external audit engagements
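As an illustration of the bias and fairness testing activity listed above, the following sketch compares positive-decision rates across groups. It assumes binary model decisions and a single protected characteristic. The metric choice (parity difference and impact ratio) and all names are assumptions for the example; COMPEL does not mandate a specific fairness metric.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def selection_rates(records: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Share of positive decisions per group.

    Each record is (group label, model decision), with decision 0 or 1.
    """
    totals: Dict[str, int] = defaultdict(int)
    positives: Dict[str, int] = defaultdict(int)
    for group, decision in records:
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}


def disparity_summary(rates: Dict[str, float]) -> Dict[str, float]:
    """Gap and ratio between the lowest and highest group selection rates."""
    lo, hi = min(rates.values()), max(rates.values())
    return {
        "parity_difference": hi - lo,
        # ratio is defined as 1.0 when no group receives positive decisions
        "impact_ratio": lo / hi if hi > 0 else 1.0,
    }


# Hypothetical sample: group A approved 6 of 10, group B approved 3 of 10
sample = [("A", 1)] * 6 + [("A", 0)] * 4 + [("B", 1)] * 3 + [("B", 0)] * 7
summary = disparity_summary(selection_rates(sample))
```

A report built from such a summary would still need the remediation actions and context that the Bias and Fairness Testing Report calls for; the numbers alone are not a pass/fail determination.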

Outputs

Gate E Decision Record — formal pass/fail determination with conditions and remediation requirements
Bias and Fairness Testing Report — documented results with remediation actions for identified disparities
Business Value Validation Report — actual vs. projected outcomes with variance analysis
Conformity Assessment Record — compliance status per system per applicable regulation
COMPEL Governance Scorecard — current maturity scores across all 18 domains
Re-attestation Records — documented evidence of periodic re-certification for each AI system against current governance standards
Risk Acceptance Register — formal log of residual risks accepted by designated risk owners with justification and review dates
Stakeholder Approval Register — signed approvals from all required business owners, risk owners, and oversight body members
Transformation Effectiveness Scorecard — composite measure of governance program effectiveness across business value, risk, and compliance dimensions
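The "actual vs. projected outcomes with variance analysis" in the Business Value Validation Report could be computed along these lines. This is a minimal sketch: the MetricResult type, its fields, and the report layout are hypothetical, and real success criteria would come from the Model stage.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class MetricResult:
    """One success criterion with its projected and measured value."""

    name: str
    projected: float
    actual: float
    higher_is_better: bool = True

    @property
    def variance(self) -> float:
        return self.actual - self.projected

    @property
    def variance_pct(self) -> float:
        if self.projected == 0:
            raise ValueError("percentage variance is undefined for a zero projection")
        return 100.0 * self.variance / abs(self.projected)

    @property
    def met(self) -> bool:
        # A cost-type metric is met when the actual value is at or below projection
        return self.variance >= 0 if self.higher_is_better else self.variance <= 0


def variance_report(results: Sequence[MetricResult]) -> List[Dict[str, object]]:
    """Flatten results into rows suitable for a validation report table."""
    return [
        {
            "metric": r.name,
            "projected": r.projected,
            "actual": r.actual,
            "variance": r.variance,
            "variance_pct": r.variance_pct,
            "met": r.met,
        }
        for r in results
    ]
```

For example, a hypothetical cost metric projected at 100.0 and measured at 80.0 with higher_is_better=False yields a variance of -20.0 and counts as met.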

Quality Gates

Audit complete with findings documented and remediation plans assigned
Gate reviews passed for all in-scope AI systems
Risk acceptance documented and approved by designated risk owners
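The three quality gates above can be expressed as a simple checklist evaluation. The sketch below assumes a hypothetical EvaluateStageStatus record; its field names are illustrative and do not come from the framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvaluateStageStatus:
    """Hypothetical snapshot of Evaluate-stage progress."""

    audit_complete: bool
    findings_without_remediation_plan: int
    systems_in_scope: List[str]
    systems_passed_gate: List[str]
    risk_acceptances_pending_approval: int


def unmet_quality_gates(status: EvaluateStageStatus) -> List[str]:
    """Return the quality gates that are not yet satisfied (empty if all pass)."""
    unmet: List[str] = []

    if not status.audit_complete or status.findings_without_remediation_plan > 0:
        unmet.append(
            "Audit complete with findings documented and remediation plans assigned"
        )

    pending = sorted(set(status.systems_in_scope) - set(status.systems_passed_gate))
    if pending:
        unmet.append(
            "Gate reviews passed for all in-scope AI systems (pending: "
            + ", ".join(pending)
            + ")"
        )

    if status.risk_acceptances_pending_approval > 0:
        unmet.append(
            "Risk acceptance documented and approved by designated risk owners"
        )

    return unmet
```

An empty result means every gate is met; any returned entry names the gate still blocking stage completion.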

Standards Alignment

ISO/IEC 42001:2023: Clause 9 (Performance evaluation), Clause 9.2 (Internal audit), Clause 9.3 (Management review)
NIST AI RMF 1.0: MEASURE (testing, benchmarking, evaluation), MANAGE (risk treatment confirmation)
EU AI Act (Regulation (EU) 2024/1689): Article 43 (Conformity assessment), Article 9 (Risk management system, including testing), Article 15 (Accuracy, robustness and cybersecurity)
IEEE 7000: Validation of ethical requirements against implemented system behavior and outcomes

Abdelalim, T. (2025). “Evaluate Stage — COMPEL AI Transformation Framework.” COMPEL by FlowRidge. https://www.compel.one/stage/evaluate