COMPEL Certification Body of Knowledge — Module 1.5: Governance, Risk, and Compliance for AI
Article 8 of 10
An artificial intelligence (AI) model is not a static asset. It is a living system that degrades, evolves, interacts with changing data environments, and influences decisions that affect people, operations, and financial outcomes every day it operates. Managing AI models as if they were traditional software — deploy once, patch occasionally, replace when obsolete — is a governance failure that produces undetected bias, silent performance degradation, and compliance exposure that compounds over time.
Model governance and lifecycle management is the discipline of maintaining visibility, control, and accountability over AI models from conception through retirement. It is where the governance framework described in Article 3: Building an AI Governance Framework meets the operational reality of running AI systems in production. It is also where the Machine Learning Operations (MLOps) practices described in Module 1.4, Article 7: MLOps — From Model to Production acquire their governance dimension — the rules, standards, and oversight mechanisms that ensure MLOps serves organizational objectives, not just technical ones.
The Model Inventory
You cannot govern what you cannot see. The model inventory — also called the model register or model catalog — is the system of record for all AI models in the organization. It is the foundational governance asset without which model risk management is impossible.
What the Model Inventory Contains
For each model, the inventory should capture:
Identity and classification:
- Unique model identifier
- Model name and version
- Risk classification (per the framework in Article 4: AI Risk Identification and Classification)
- Model type (classification, regression, natural language processing, computer vision, generative, etc.)
- Deployment status (development, validation, production, retired)
Ownership and accountability:
- Model owner (the individual accountable for the model's performance and compliance)
- Development team
- Business sponsor
- Approving authority (who approved the model for production)
Technical description:
- Algorithm type and architecture
- Training data description (with reference to data lineage documentation from Article 7: Data Governance for AI)
- Feature descriptions and feature engineering logic
- Performance metrics (accuracy, precision, recall, fairness metrics)
- Known limitations and constraints
Governance status:
- Validation status and date of last validation
- Bias testing status and results summary
- Monitoring status and alert history
- Documentation completeness assessment
- Next scheduled review date
- Regulatory applicability (which regulations apply to this model)
Lifecycle events:
- Development date
- Initial deployment date
- Retraining history (dates, reasons, data used)
- Material change history
- Incident history
- Planned retirement date (if applicable)
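The fields above can be sketched as a structured record. The following is a minimal illustration of what an inventory entry might look like in code; the field names and the `is_governed` check are illustrative assumptions, not a prescribed COMPEL schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelInventoryRecord:
    """Illustrative sketch of a model inventory record (not a mandated schema)."""
    model_id: str                            # unique model identifier
    name: str
    version: str
    risk_class: str                          # e.g. "high", "medium", "low"
    model_type: str                          # classification, regression, NLP, ...
    status: str                              # development | validation | production | retired
    owner: str                               # individual accountable for the model
    approving_authority: str                 # who approved the model for production
    last_validation_date: Optional[str] = None
    next_review_date: Optional[str] = None
    retraining_history: list = field(default_factory=list)  # (date, reason, data ref)

    def is_governed(self) -> bool:
        """A production model must have an owner, an approver, and a validation on record."""
        if self.status != "production":
            return True
        return bool(self.owner and self.approving_authority and self.last_validation_date)
```

A record-level check like `is_governed` is one way to automate the completeness monitoring discussed below: scanning the inventory for production models with missing governance fields surfaces gaps before an auditor does.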
Model Inventory Governance
The model inventory itself requires governance:
- Registration requirements — when must a model be registered? At project initiation? At development completion? Before deployment? The answer should be at project initiation, so that governance is engaged from the earliest stage.
- Update requirements — how frequently must inventory records be updated? What triggers a required update?
- Completeness monitoring — how does the organization detect unregistered models (the model equivalent of Shadow AI, described in Module 1.1, Article 6)?
- Access controls — who can view the inventory, who can update it, and who can approve changes to model classification or status?
Organizations that discover models running in production that are not in the inventory — a disturbingly common finding in AI governance assessments — have a governance gap that requires immediate remediation. An unregistered model is an ungoverned model, with unknown risk exposure.
Model Validation
Model validation is the independent assessment of a model's fitness for its intended purpose. It is the governance control that ensures models meet quality, performance, fairness, and compliance standards before they affect real decisions.
The Three Dimensions of Model Validation
The Federal Reserve's Supervisory Guidance on Model Risk Management (SR 11-7), originally published in 2011 by the Board of Governors of the Federal Reserve System and the Office of the Comptroller of the Currency (OCC), establishes three dimensions of model validation that apply broadly across industries:
Evaluation of conceptual soundness assesses whether the model's design and methodology are appropriate for its intended use. This includes:
- Is the modeling approach appropriate for the problem?
- Are the assumptions reasonable and documented?
- Are the input variables relevant and appropriate?
- Is the model specification (architecture, hyperparameters, training approach) well-justified?
- Have alternative approaches been considered?
For machine learning (ML) models, conceptual soundness evaluation must also address:
- Is the training data representative of the production environment?
- Is the feature engineering sound (no data leakage, no proxy variables for protected attributes)?
- Is the model complexity justified by the use case requirements?
- Are explainability requirements achievable with the chosen model type?
Outcomes analysis compares model predictions to actual outcomes to assess model accuracy. This includes:
- Back-testing on historical data
- Out-of-time testing on data from periods not used in training
- Comparison to benchmarks, simpler models, or expert judgment
- Segmented performance analysis across relevant subpopulations
- Fairness metric evaluation across protected classes
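Segmented performance analysis and fairness metric evaluation can be sketched with plain counting logic. The functions below compute per-group accuracy and a demographic-parity gap (the largest difference in positive-prediction rates between groups); the record format and metric choice are illustrative assumptions.

```python
def segmented_accuracy(records):
    """records: iterable of (group, y_true, y_pred). Returns accuracy per group."""
    totals, correct = {}, {}
    for group, y_true, y_pred in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (1 if y_true == y_pred else 0)
    return {g: correct[g] / totals[g] for g in totals}

def demographic_parity_gap(records):
    """Largest difference in positive-prediction rate between any two groups."""
    totals, positives = {}, {}
    for group, _, y_pred in records:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if y_pred == 1 else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)
```

In practice these computations run over back-testing or out-of-time samples; a large gap in either metric across relevant subpopulations is a validation finding, not just a statistic.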
Ongoing monitoring verifies that the model continues to perform as validated after deployment. This is covered in the monitoring section below.
Independent Validation
"Independent" in model validation means that the validators are not the same individuals who developed the model and do not report to the same management chain that benefits from the model's deployment. The level of independence required should be proportionate to risk:
- High-risk models should be validated by a dedicated Model Risk Management (MRM) function or external validators who are organizationally independent from the development team and the business sponsor
- Medium-risk models may be validated by peers from other development teams, provided they have appropriate expertise and no conflicts of interest
- Low-risk models may use structured self-validation against defined standards, subject to periodic audit sampling
The concept of "effective challenge" from SR 11-7 is central: validators must have the incentive, competence, and authority to challenge the model's development team. Validation that does not produce substantive challenges is not adding value — it is providing false assurance.
Validation Frequency
Initial validation occurs before first deployment. Subsequent validations are triggered by:
- Scheduled periodic review (annually for high-risk models, per a defined schedule for others)
- Material model changes (retraining, feature changes, algorithm changes)
- Significant performance degradation detected through monitoring
- Changes in the model's use case or deployment scope
- Regulatory or governance framework changes that alter validation requirements
- Data environment changes (new data sources, significant distribution shifts)
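These triggers can be encoded as a simple predicate that the governance workflow evaluates for each model. In the sketch below, the 365-day cycle for high-risk models follows the annual review mentioned above; the medium- and low-risk intervals are illustrative assumptions, as is the function itself.

```python
from datetime import date, timedelta

# Review intervals in days; only the high-risk annual cycle is stated in the
# text, the others are illustrative placeholders.
REVIEW_INTERVALS = {"high": 365, "medium": 730, "low": 1095}

def revalidation_due(risk_class, last_validated, today,
                     material_change=False, degradation_alert=False,
                     scope_change=False):
    """Return True if any revalidation trigger has fired for this model."""
    # Event-based triggers fire regardless of the calendar.
    if material_change or degradation_alert or scope_change:
        return True
    # Otherwise, fall back to the scheduled periodic review.
    interval = timedelta(days=REVIEW_INTERVALS[risk_class])
    return today - last_validated >= interval
```

Automating this check against the model inventory turns the trigger list from a policy statement into an enforceable control.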
Model Monitoring
A model that was valid at deployment may not remain valid. Model monitoring is the governance mechanism that detects degradation between validation events and triggers appropriate response.
What to Monitor
Performance metrics — accuracy, precision, recall, F1 score, Area Under the Curve (AUC), or whichever metrics are appropriate for the model type and use case. Monitoring should track both aggregate performance and performance segmented by key populations.
Fairness metrics — the fairness metrics specified during bias testing (demographic parity, equalized odds, etc., as described in Article 6: AI Ethics Operationalized) tracked on production data to detect emerging bias patterns.
Input data quality — completeness, distribution, and feature values of production input data, compared to the training data distribution. Significant input distribution shifts signal potential model drift.
Output distribution — the distribution of model predictions over time. Changes in output distribution may indicate model drift even before performance metrics degrade.
Stability metrics — Population Stability Index (PSI) measures shifts in the distribution of model scores or outputs, while Characteristic Stability Index (CSI) measures shifts in individual input feature distributions. Both provide early warning of drift before performance metrics visibly degrade.
Operational metrics — latency, throughput, error rates, and availability. Operational degradation may indicate infrastructure issues that affect model performance.
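The PSI mentioned above is straightforward to compute over binned distributions: for each bin, take the difference between the actual and expected proportions, multiply by the log of their ratio, and sum. The sketch below assumes pre-binned proportions; the epsilon guard for empty bins is an implementation convention, not part of the formula.

```python
import math

def psi(expected_props, actual_props, eps=1e-6):
    """Population Stability Index over matched bins.

    PSI = sum over bins of (actual - expected) * ln(actual / expected).
    expected_props: bin proportions from the reference (e.g. training) sample.
    actual_props:   bin proportions from the production sample.
    eps guards against empty bins, where the log ratio is undefined.
    """
    total = 0.0
    for e, a in zip(expected_props, actual_props):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total
```

A common rule of thumb in credit-risk practice (a convention, not a COMPEL requirement) reads PSI below 0.10 as stable, 0.10 to 0.25 as moderate shift warranting investigation, and above 0.25 as significant shift.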
Monitoring Architecture
Effective monitoring requires:
- Automated data collection from model inputs, outputs, and performance against ground truth (where available)
- Dashboard visibility providing model owners, validators, and governance teams with real-time and trend views of monitoring metrics
- Automated alerting with defined thresholds that trigger notifications when metrics cross acceptable boundaries
- Escalation procedures that define who is notified, what actions are required, and what timelines apply for different alert severities
- Integration with the model inventory so that monitoring status is visible as part of the model's governance record
Monitoring infrastructure should be integrated into the MLOps platform (Module 1.4, Article 7) so that monitoring is a byproduct of normal operations, not a separate manual activity.
Response to Monitoring Alerts
Governance must define the response protocol for monitoring alerts:
Yellow alerts (performance approaching thresholds) trigger enhanced monitoring, root cause investigation, and documentation. The model continues to operate.
Red alerts (performance exceeding thresholds) trigger immediate investigation, potential restriction of the model's scope or authority, and escalation to the model owner and governance function. Depending on severity, the model may be suspended pending revalidation.
Critical alerts (severe performance failure or bias detection) trigger immediate model suspension, incident response procedures (from Article 5: AI Risk Assessment and Mitigation), and escalation to senior governance leadership.
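The three-tier protocol above lends itself to a simple severity-to-response mapping. In the sketch below, the PSI thresholds used to classify a drift alert are illustrative assumptions; the required actions follow the protocol described in the text.

```python
def classify_drift_alert(psi_value, yellow=0.10, red=0.25, critical=0.40):
    """Classify a PSI reading against illustrative thresholds; None means no alert."""
    if psi_value >= critical:
        return "critical"
    if psi_value >= red:
        return "red"
    if psi_value >= yellow:
        return "yellow"
    return None

def alert_response(severity):
    """Map alert severity to the response protocol sketched in the text."""
    responses = {
        "yellow": {"suspend_model": False,
                   "actions": ["enhanced monitoring", "root cause investigation",
                               "document findings"]},
        "red": {"suspend_model": False,  # suspension is possible but decided case by case
                "actions": ["immediate investigation", "consider scope restriction",
                            "escalate to model owner and governance function"]},
        "critical": {"suspend_model": True,
                     "actions": ["suspend model", "invoke incident response",
                                 "escalate to senior governance leadership"]},
    }
    return responses[severity]
```

Encoding the protocol this way keeps the escalation logic auditable: the thresholds and required actions live in version-controlled configuration rather than in individual judgment at 2 a.m.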
Model Documentation Standards
Documentation is the artifact that makes governance auditable. Without documentation, governance is a verbal practice that cannot be verified, reproduced, or examined by regulators and auditors.
Model Cards
Model cards, originally proposed by researchers at Google in 2019, provide a standardized format for documenting AI models. A model card typically includes:
- Model details (name, version, type, owner, date)
- Intended use and limitations
- Training data description
- Evaluation data and results
- Fairness analysis results
- Ethical considerations
- Caveats and recommendations
Model cards serve multiple audiences: technical teams use them for model comparison and selection, governance teams use them for risk assessment and audit, and business stakeholders use them to understand model capabilities and limitations.
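A lightweight way to enforce the sections listed above is to treat the model card as structured data with a completeness check. The section names below follow this article's list; representing the card as a plain dictionary is an illustrative choice, not the format from the original model cards paper.

```python
# Required model-card sections, mirroring the list in the text.
REQUIRED_SECTIONS = [
    "model_details",
    "intended_use",
    "training_data",
    "evaluation_results",
    "fairness_analysis",
    "ethical_considerations",
    "caveats_and_recommendations",
]

def missing_sections(model_card: dict) -> list:
    """Return required model-card sections that are absent or empty."""
    return [s for s in REQUIRED_SECTIONS if not model_card.get(s)]
```

A check like this can run in the deployment pipeline so that a model with an incomplete card never reaches production, which feeds the documentation completeness assessment in the model inventory.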
Data Sheets
Data sheets for datasets, proposed by researchers at Microsoft in 2018, provide standardized documentation for training datasets:
- Motivation (why was the dataset created?)
- Composition (what does the dataset contain?)
- Collection process (how was the data collected?)
- Preprocessing (what transformations were applied?)
- Uses (what is the dataset intended for? What should it not be used for?)
- Distribution (how is the dataset distributed?)
- Maintenance (who maintains the dataset? How is it updated?)
Data sheets complement model cards by documenting the data foundation of each model — connecting to the data governance practices described in Article 7: Data Governance for AI.
Technical Documentation
Beyond model cards and data sheets, high-risk models require comprehensive technical documentation that covers:
- Model development methodology and rationale
- Feature selection and engineering documentation
- Training process documentation (hyperparameters, optimization approach, convergence criteria)
- Validation results and validation methodology
- Known limitations and conditions under which the model should not be relied upon
- Monitoring configuration and threshold justification
- Change history and retraining log
The European Union (EU) AI Act requires technical documentation for high-risk AI systems that is detailed enough for a regulatory authority to assess the system's compliance. Organizations that treat documentation as an afterthought will find this requirement expensive to satisfy retrospectively.
Model Retirement
Models have a lifecycle that ends. Retirement governance ensures that model decommissioning is orderly, documented, and does not create operational gaps.
Retirement Triggers
- The model is replaced by a successor model that has been validated and deployed
- The model's use case is discontinued
- The model cannot be maintained to required governance standards
- The model's performance has degraded beyond acceptable thresholds and remediation is not feasible
- Regulatory changes make the model's approach non-compliant
Retirement Process
- Retirement decision — documented approval by the model owner and governance function
- Successor verification — if a successor model exists, verification that it is validated, deployed, and performing as expected before the predecessor is retired
- Impact analysis — identification of all systems, processes, and stakeholders that depend on the retiring model
- Transition execution — planned cutover from the retiring model to the successor (or to a non-AI process)
- Archiving — preservation of the model, its training data (or references to it), its documentation, its validation results, and its monitoring history for regulatory retention requirements
- Decommissioning — removal of the model from production systems
- Inventory update — updating the model inventory to reflect retired status with retention of historical governance records
Retirement governance is often neglected because it is not associated with new capability delivery. This neglect creates governance risk: retired models that continue to operate because no one decommissioned them, archived models with inadequate documentation that cannot be audited, and successor models deployed without proper validation of their predecessor's retirement.
Model Risk Management as an Organizational Function
For organizations with significant AI portfolios — particularly in regulated industries — model governance requires a dedicated Model Risk Management (MRM) function. This function:
- Maintains the model inventory
- Sets model governance standards
- Conducts or oversees independent model validation
- Operates model monitoring infrastructure
- Manages the model lifecycle (development, deployment, monitoring, retirement)
- Reports on model risk posture to the AI Governance Council
- Coordinates with internal audit and external regulators on model risk matters
The MRM function must have organizational independence — it reports to risk leadership rather than to the technology or business functions whose models it governs. Without independence, MRM cannot provide the "effective challenge" that SR 11-7 requires and that sound governance demands.
The people, skills, and organizational structures required for effective MRM are addressed in Module 1.6: People, Change, and Organizational Readiness. MRM requires a blend of technical ML expertise, risk management expertise, and regulatory knowledge that is difficult to hire and expensive to develop. Investing in this talent is not optional for organizations that operate AI at scale in regulated environments.
Connecting Model Governance to the COMPEL Lifecycle
Model governance spans the entire COMPEL lifecycle:
- Calibrate (Module 1.2, Article 1) assesses model governance maturity and model portfolio risk
- Organize (Module 1.2, Article 2) establishes the MRM function, tools, and processes
- Model designs target-state model governance standards and infrastructure
- Produce deploys models within governance guardrails, with Stage Gate reviews (Module 1.2, Article 7) validating governance compliance at each checkpoint
- Evaluate (Module 1.2, Article 5) assesses model governance effectiveness through metrics, audits, and governance reviews
- Learn (Module 1.2, Article 6) captures model governance insights and evolves standards based on experience
Looking Ahead
Model governance and data governance produce the operational controls that protect the organization from AI risk. The next article addresses how to demonstrate that those controls work — audit preparedness and compliance operations that ensure the organization can satisfy regulatory inquiries, internal audits, and third-party assessments with organized, verifiable evidence.
© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.