COMPEL Certification Body of Knowledge — Module 1.5: Governance, Risk, and Compliance for AI
Article 8 of 10
An artificial intelligence (AI) model is not a static asset. It is a living system that degrades, evolves, interacts with changing data environments, and influences decisions that affect people, operations, and financial outcomes every day it operates. Managing AI models as if they were traditional software — deploy once, patch occasionally, replace when obsolete — is a governance failure that produces undetected bias, silent performance degradation, and compliance exposure that compounds over time.
Model governance and lifecycle management is the discipline of maintaining visibility, control, and accountability over AI models from conception through retirement. It is where the governance framework described in Article 3: Building an AI Governance Framework meets the operational reality of running AI systems in production. It is also where the Machine Learning Operations (MLOps) practices described in Module 1.4, Article 7: MLOps — From Model to Production acquire their governance dimension — the rules, standards, and oversight mechanisms that ensure MLOps serves organizational objectives, not just technical ones.
The Model Inventory
You cannot govern what you cannot see. The model inventory — also called the model register or model catalog — is the system of record for all AI models in the organization. It is the foundational governance asset without which model risk management is impossible.
What the Model Inventory Contains
For each model, the inventory should capture:
Identity and classification:
- Unique model identifier
- Model name and version
- Risk classification (per the framework in Article 4: AI Risk Identification and Classification)
- Model type (classification, regression, natural language processing, computer vision, generative, etc.)
- Deployment status (development, validation, production, retired)
Ownership and accountability:
- Model owner (the individual accountable for the model's performance and compliance)
- Development team
- Business sponsor
- Approving authority (who approved the model for production)
Technical description:
- Algorithm type and architecture
- Training data description (with reference to data lineage documentation from Article 7: Data Governance for AI)
- Feature descriptions and feature engineering logic
- Performance metrics (accuracy, precision, recall, fairness metrics)
- Known limitations and constraints
Governance status:
- Validation status and date of last validation
- Bias testing status and results summary
- Monitoring status and alert history
- Documentation completeness assessment
- Next scheduled review date
- Regulatory applicability (which regulations apply to this model)
Lifecycle events:
- Development date
- Initial deployment date
- Retraining history (dates, reasons, data used)
- Material change history
- Incident history
- Planned retirement date (if applicable)
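The fields above can be sketched as a structured record. The following is a minimal illustration of what an inventory entry might look like in code; the field names and the `is_governed` check are illustrative assumptions, not a prescribed COMPEL schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelInventoryRecord:
    """Illustrative sketch of a model inventory record (not a mandated schema)."""
    model_id: str                            # unique model identifier
    name: str
    version: str
    risk_class: str                          # e.g. "high", "medium", "low"
    model_type: str                          # classification, regression, NLP, ...
    status: str                              # development | validation | production | retired
    owner: str                               # individual accountable for the model
    approving_authority: str                 # who approved the model for production
    last_validation_date: Optional[str] = None
    next_review_date: Optional[str] = None
    retraining_history: list = field(default_factory=list)  # (date, reason, data ref)

    def is_governed(self) -> bool:
        """A production model must have an owner, an approver, and a validation on record."""
        if self.status != "production":
            return True
        return bool(self.owner and self.approving_authority and self.last_validation_date)
```

A record-level check like `is_governed` is one way to automate the completeness monitoring discussed below: scanning the inventory for production models with missing governance fields surfaces gaps before an auditor does.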
Model Inventory Governance
The model inventory itself requires governance:
- Registration requirements — when must a model be registered? At project initiation? At development completion? Before deployment? The answer should be at project initiation, so that governance is engaged from the earliest stage.
- Update requirements — how frequently must inventory records be updated? What triggers a required update?
- Completeness monitoring — how does the organization detect unregistered models (the model equivalent of Shadow AI, described in Module 1.1, Article 6)?
- Access controls — who can view the inventory, who can update it, and who can approve changes to model classification or status?
Organizations that discover models running in production that are not in the inventory — a disturbingly common finding in AI governance assessments — have a governance gap that requires immediate remediation. An unregistered model is an ungoverned model, with unknown risk exposure.
Model Validation
Model validation is the independent assessment of a model's fitness for its intended purpose. It is the governance control that ensures models meet quality, performance, fairness, and compliance standards before they affect real decisions.
The Three Dimensions of Model Validation
The Federal Reserve's Supervisory Guidance on Model Risk Management (SR 11-7), originally published in 2011 by the Board of Governors of the Federal Reserve System and the Office of the Comptroller of the Currency (OCC), establishes three dimensions of model validation that apply broadly across industries:
Evaluation of conceptual soundness assesses whether the model's design and methodology are appropriate for its intended use. This includes:
- Is the modeling approach appropriate for the problem?
- Are the assumptions reasonable and documented?
- Are the input variables relevant and appropriate?
- Is the model specification (architecture, hyperparameters, training approach) well-justified?
- Have alternative approaches been considered?
For machine learning (ML) models, conceptual soundness evaluation must also address:
- Is the training data representative of the production environment?
- Is the feature engineering sound (no data leakage, no proxy variables for protected attributes)?
- Is the model complexity justified by the use case requirements?
- Are explainability requirements achievable with the chosen model type?
Outcomes analysis compares model predictions to actual outcomes to assess model accuracy. This includes:
- Back-testing on historical data
- Out-of-time testing on data from periods not used in training
- Comparison to benchmarks, simpler models, or expert judgment
- Segmented performance analysis across relevant subpopulations
- Fairness metric evaluation across protected classes
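Segmented performance analysis and fairness metric evaluation can be sketched with plain counting logic. The functions below compute per-group accuracy and a demographic-parity gap (the largest difference in positive-prediction rates between groups); the record format and metric choice are illustrative assumptions.

```python
def segmented_accuracy(records):
    """records: iterable of (group, y_true, y_pred). Returns accuracy per group."""
    totals, correct = {}, {}
    for group, y_true, y_pred in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (1 if y_true == y_pred else 0)
    return {g: correct[g] / totals[g] for g in totals}

def demographic_parity_gap(records):
    """Largest difference in positive-prediction rate between any two groups."""
    totals, positives = {}, {}
    for group, _, y_pred in records:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if y_pred == 1 else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)
```

In practice these computations run over back-testing or out-of-time samples; a large gap in either metric across relevant subpopulations is a validation finding, not just a statistic.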
Ongoing monitoring verifies that the model continues to perform as validated after deployment. This is covered in the monitoring section below.
Independent Validation
"Independent" in model validation means that the validators are not the same individuals who developed the model and do not report to the same management chain that benefits from the model's deployment. The level of independence required should be proportionate to risk:
- High-risk models should be validated by a dedicated Model Risk Management (MRM) function or external validators who are organizationally independent from the development team and the business sponsor
- Medium-risk models may be validated by peers from other development teams, provided they have appropriate expertise and no conflicts of interest
- Low-risk models may use structured self-validation against defined standards, subject to periodic audit sampling
The concept of "effective challenge" from SR 11-7 is central: validators must have the incentive, competence, and authority to challenge the model's development team. Validation that does not produce substantive challenges is not adding value — it is providing false assurance.
Validation Frequency
Initial validation occurs before first deployment. Subsequent validations are triggered by:
- Scheduled periodic review (annually for high-risk models, per a defined schedule for others)
- Material model changes (retraining, feature changes, algorithm changes)
- Significant performance degradation detected through monitoring
- Changes in the model's use case or deployment scope
- Regulatory or governance framework changes that alter validation requirements
- Data environment changes (new data sources, significant distribution shifts)
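These triggers can be encoded as a simple predicate that the governance workflow evaluates for each model. In the sketch below, the 365-day cycle for high-risk models follows the annual review mentioned above; the medium- and low-risk intervals are illustrative assumptions, as is the function itself.

```python
from datetime import date, timedelta

# Review intervals in days; only the high-risk annual cycle is stated in the
# text, the others are illustrative placeholders.
REVIEW_INTERVALS = {"high": 365, "medium": 730, "low": 1095}

def revalidation_due(risk_class, last_validated, today,
                     material_change=False, degradation_alert=False,
                     scope_change=False):
    """Return True if any revalidation trigger has fired for this model."""
    # Event-based triggers fire regardless of the calendar.
    if material_change or degradation_alert or scope_change:
        return True
    # Otherwise, fall back to the scheduled periodic review.
    interval = timedelta(days=REVIEW_INTERVALS[risk_class])
    return today - last_validated >= interval
```

Automating this check against the model inventory turns the trigger list from a policy statement into an enforceable control.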
Model Monitoring
A model that was valid at deployment may not remain valid. Model monitoring is the governance mechanism that detects degradation between validation events and triggers appropriate response.
What to Monitor
Performance metrics — accuracy, precision, recall, F1 score, Area Under the Curve (AUC), or whichever metrics are appropriate for the model type and use case. Monitoring should track both aggregate performance and performance segmented by key populations.
Fairness metrics — the fairness metrics specified during bias testing (demographic parity, equalized odds, etc., as described in Article 6: AI Ethics Operationalized) tracked on production data to detect emerging bias patterns.
Input data quality — completeness, distribution, and feature values of production input data, compared to the training data distribution. Significant input distribution shifts signal potential model drift.
Output distribution — the distribution of model predictions over time. Changes in output distribution may indicate model drift even before performance metrics degrade.
Stability metrics — Population Stability Index (PSI) measures shifts in the distribution of model scores or outputs, while Characteristic Stability Index (CSI) measures shifts in individual input feature distributions. Both provide early warning of drift before performance metrics visibly degrade.
Operational metrics — latency, throughput, error rates, and availability. Operational degradation may indicate infrastructure issues that affect model performance.
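The PSI mentioned above is straightforward to compute over binned distributions: for each bin, take the difference between the actual and expected proportions, multiply by the log of their ratio, and sum. The sketch below assumes pre-binned proportions; the epsilon guard for empty bins is an implementation convention, not part of the formula.

```python
import math

def psi(expected_props, actual_props, eps=1e-6):
    """Population Stability Index over matched bins.

    PSI = sum over bins of (actual - expected) * ln(actual / expected).
    expected_props: bin proportions from the reference (e.g. training) sample.
    actual_props:   bin proportions from the production sample.
    eps guards against empty bins, where the log ratio is undefined.
    """
    total = 0.0
    for e, a in zip(expected_props, actual_props):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total
```

A common rule of thumb in credit-risk practice (a convention, not a COMPEL requirement) reads PSI below 0.10 as stable, 0.10 to 0.25 as moderate shift warranting investigation, and above 0.25 as significant shift.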
Monitoring Architecture
Effective monitoring requires:
- Automated data collection from model inputs, outputs, and performance against ground truth (where available)
- Dashboard visibility providing model owners, validators, and governance teams with real-time and trend views of monitoring metrics
- Automated alerting with defined thresholds that trigger notifications when metrics cross acceptable boundaries
- Escalation procedures that define who is notified, what actions are required, and what timelines apply for different alert severities
- Integration with the model inventory so that monitoring status is visible as part of the model's governance record
Monitoring infrastructure should be integrated into the MLOps platform (Module 1.4, Article 7) so that monitoring is a byproduct of normal operations, not a separate manual activity.
Response to Monitoring Alerts
Governance must define the response protocol for monitoring alerts:
Yellow alerts (performance approaching thresholds) trigger enhanced monitoring, root cause investigation, and documentation. The model continues to operate.
Red alerts (performance exceeding thresholds) trigger immediate investigation, potential restriction of the model's scope or authority, and escalation to the model owner and governance function. Depending on severity, the model may be suspended pending revalidation.
Critical alerts (severe performance failure or bias detection) trigger immediate model suspension, incident response procedures (from Article 5: AI Risk Assessment and Mitigation), and escalation to senior governance leadership.
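The three-tier protocol above lends itself to a simple severity-to-response mapping. In the sketch below, the PSI thresholds used to classify a drift alert are illustrative assumptions; the required actions follow the protocol described in the text.

```python
def classify_drift_alert(psi_value, yellow=0.10, red=0.25, critical=0.40):
    """Classify a PSI reading against illustrative thresholds; None means no alert."""
    if psi_value >= critical:
        return "critical"
    if psi_value >= red:
        return "red"
    if psi_value >= yellow:
        return "yellow"
    return None

def alert_response(severity):
    """Map alert severity to the response protocol sketched in the text."""
    responses = {
        "yellow": {"suspend_model": False,
                   "actions": ["enhanced monitoring", "root cause investigation",
                               "document findings"]},
        "red": {"suspend_model": False,  # suspension is possible but decided case by case
                "actions": ["immediate investigation", "consider scope restriction",
                            "escalate to model owner and governance function"]},
        "critical": {"suspend_model": True,
                     "actions": ["suspend model", "invoke incident response",
                                 "escalate to senior governance leadership"]},
    }
    return responses[severity]
```

Encoding the protocol this way keeps the escalation logic auditable: the thresholds and required actions live in version-controlled configuration rather than in individual judgment at 2 a.m.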
Model Documentation Standards
Documentation is the artifact that makes governance auditable. Without documentation, governance is a verbal practice that cannot be verified, reproduced, or examined by regulators and auditors.
Model Cards
Model cards, originally proposed by researchers at Google in 2019, provide a standardized format for documenting AI models. A model card typically includes:
- Model details (name, version, type, owner, date)
- Intended use and limitations
- Training data description
- Evaluation data and results
- Fairness analysis results
- Ethical considerations
- Caveats and recommendations
Model cards serve multiple audiences: technical teams use them for model comparison and selection, governance teams use them for risk assessment and audit, and business stakeholders use them to understand model capabilities and limitations.
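A lightweight way to enforce the sections listed above is to treat the model card as structured data with a completeness check. The section names below follow this article's list; representing the card as a plain dictionary is an illustrative choice, not the format from the original model cards paper.

```python
# Required model-card sections, mirroring the list in the text.
REQUIRED_SECTIONS = [
    "model_details",
    "intended_use",
    "training_data",
    "evaluation_results",
    "fairness_analysis",
    "ethical_considerations",
    "caveats_and_recommendations",
]

def missing_sections(model_card: dict) -> list:
    """Return required model-card sections that are absent or empty."""
    return [s for s in REQUIRED_SECTIONS if not model_card.get(s)]
```

A check like this can run in the deployment pipeline so that a model with an incomplete card never reaches production, which feeds the documentation completeness assessment in the model inventory.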
Data Sheets
Data sheets for datasets, proposed by researchers at Microsoft in 2018, provide standardized documentation for training datasets:
- Motivation (why was the dataset created?)
- Composition (what does the dataset contain?)
- Collection process (how was the data collected?)
- Preprocessing (what transformations were applied?)
- Uses (what is the dataset intended for? What should it not be used for?)
- Distribution (how is the dataset distributed?)
- Maintenance (who maintains the dataset? How is it updated?)
Data sheets complement model cards by documenting the data foundation of each model — connecting to the data governance practices described in Article 7: Data Governance for AI.
Technical Documentation
Beyond model cards and data sheets, high-risk models require comprehensive technical documentation that covers:
- Model development methodology and rationale
- Feature selection and engineering documentation
- Training process documentation (hyperparameters, optimization approach, convergence criteria)
- Validation results and validation methodology
- Known limitations and conditions under which the model should not be relied upon
- Monitoring configuration and threshold justification
- Change history and retraining log
The European Union (EU) AI Act requires technical documentation for high-risk AI systems that is detailed enough for a regulatory authority to assess the system's compliance. Organizations that treat documentation as an afterthought will find this requirement expensive to satisfy retrospectively.
Model Retirement
Models have a lifecycle that ends. Retirement governance ensures that model decommissioning is orderly, documented, and does not create operational gaps.
Retirement Triggers
- The model is replaced by a successor model that has been validated and deployed
- The model's use case is discontinued
- The model cannot be maintained to required governance standards
- The model's performance has degraded beyond acceptable thresholds and remediation is not feasible
- Regulatory changes make the model's approach non-compliant
Retirement Process
- Retirement decision — documented approval by the model owner and governance function
- Successor verification — if a successor model exists, verification that it is validated, deployed, and performing as expected before the predecessor is retired
- Impact analysis — identification of all systems, processes, and stakeholders that depend on the retiring model
- Transition execution — planned cutover from the retiring model to the successor (or to a non-AI process)
- Archiving — preservation of the model, its training data (or references to it), its documentation, its validation results, and its monitoring history for regulatory retention requirements
- Decommissioning — removal of the model from production systems
- Inventory update — updating the model inventory to reflect retired status with retention of historical governance records
Retirement governance is often neglected because it is not associated with new capability delivery. This neglect creates governance risk: retired models that continue to operate because no one decommissioned them, archived models with inadequate documentation that cannot be audited, and successor models deployed without proper validation of their predecessor's retirement.
Model Risk Management as an Organizational Function
For organizations with significant AI portfolios — particularly in regulated industries — model governance requires a dedicated Model Risk Management (MRM) function. This function:
- Maintains the model inventory
- Sets model governance standards
- Conducts or oversees independent model validation
- Operates model monitoring infrastructure
- Manages the model lifecycle (development, deployment, monitoring, retirement)
- Reports on model risk posture to the AI Governance Council
- Coordinates with internal audit and external regulators on model risk matters
The MRM function must have organizational independence — it reports to risk leadership rather than to the technology or business functions whose models it governs. Without independence, MRM cannot provide the "effective challenge" that SR 11-7 requires and that sound governance demands.
The people, skills, and organizational structures required for effective MRM are addressed in Module 1.6: People, Change, and Organizational Readiness. MRM requires a blend of technical ML expertise, risk management expertise, and regulatory knowledge that is difficult to hire and expensive to develop. Investing in this talent is not optional for organizations that operate AI at scale in regulated environments.
Connecting Model Governance to the COMPEL Lifecycle
Model governance spans the entire COMPEL lifecycle:
- Calibrate (Module 1.2, Article 1) assesses model governance maturity and model portfolio risk
- Organize (Module 1.2, Article 2) establishes the MRM function, tools, and processes
- Model designs target-state model governance standards and infrastructure
- Produce deploys models within governance guardrails, with Stage Gate reviews (Module 1.2, Article 7) validating governance compliance at each checkpoint
- Evaluate (Module 1.2, Article 5) assesses model governance effectiveness through metrics, audits, and governance reviews
- Learn (Module 1.2, Article 6) captures model governance insights and evolves standards based on experience
Looking Ahead
Model governance and data governance produce the operational controls that protect the organization from AI risk. The next article addresses how to demonstrate that those controls work — audit preparedness and compliance operations that ensure the organization can satisfy regulatory inquiries, internal audits, and third-party assessments with organized, verifiable evidence.
© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.