Model Drift

Detailed Explanation

Model drift is the degradation of an AI model's performance over time caused by changes in the statistical properties of the input data, the target variable, or the relationship between them. Data drift (also called covariate shift) occurs when the distribution of input features changes from the training distribution. Concept drift occurs when the relationship between inputs and outputs changes — the model's learned patterns become less accurate even if the input distribution remains stable. Both forms of drift cause models that performed well at deployment to produce increasingly unreliable predictions without any change to the model itself.
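The two failure modes can be seen in a toy sketch. Below, a hypothetical 1-D classifier has a learned decision boundary that is slightly off the true one (0.2 instead of 0.0) — an illustrative setup, not any specific production model. Under data drift, inputs concentrate near the model's slightly wrong boundary; under concept drift, the true boundary itself moves. Both degrade accuracy without any change to the model:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000

# Toy "model": a decision boundary fixed at training time. Assume
# training left it slightly off the true boundary (0.2 instead of 0.0).
def predict(x):
    return (x > 0.2).astype(int)

def accuracy(x, y):
    return float(np.mean(predict(x) == y))

# Training regime: inputs ~ N(0, 1), true rule is label = (x > 0).
x0 = rng.normal(0.0, 1.0, N)
acc_base = accuracy(x0, (x0 > 0).astype(int))

# Data drift (covariate shift): the input->label rule is unchanged, but
# production inputs now concentrate near the model's imperfect boundary.
x1 = rng.normal(0.1, 0.2, N)
acc_data_drift = accuracy(x1, (x1 > 0).astype(int))

# Concept drift: inputs still look like training data, but the true
# boundary has moved from 0 to 1 -- the learned rule itself is stale.
x2 = rng.normal(0.0, 1.0, N)
acc_concept_drift = accuracy(x2, (x2 > 1.0).astype(int))

print(f"baseline accuracy:   {acc_base:.3f}")
print(f"under data drift:    {acc_data_drift:.3f}")
print(f"under concept drift: {acc_concept_drift:.3f}")
```

Note the distinction the sketch makes concrete: data drift hurts because the model's existing errors become more common in production traffic, while concept drift hurts because the learned rule no longer matches reality.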

Why It Matters

Model drift is one of the most common and insidious failure modes in production AI systems. Unlike software bugs that produce immediate errors, drift degrades performance gradually — often below the threshold of casual observation — until the model is making material errors that affect business decisions, customer outcomes, or regulatory compliance. Organizations that do not monitor for drift discover performance problems only when downstream consequences become severe enough to trigger complaints, audits, or incidents.

COMPEL-Specific Usage

COMPEL addresses model drift at multiple stages. The Model stage requires drift monitoring plans as part of the production design specification (Gate M). The Produce stage configures drift detection infrastructure including statistical monitoring, performance tracking, and automated alerting. The Evaluate stage reviews drift metrics as a standard component of the governance scorecard. COMPEL's maturity model assesses drift management capability from manual periodic review (Level 2) to automated detection with governance-integrated remediation workflows (Level 5).

Related Standards & Frameworks

  • ISO/IEC 42001:2023 Annex A.5 (AI System Inventory)
  • NIST AI RMF MAP and MEASURE functions
  • IEEE 7000-2021

Common Mistakes

  • Deploying models without any drift monitoring and assuming performance will remain stable.
  • Monitoring only prediction accuracy without tracking input distribution changes that signal upcoming performance degradation.
  • Setting drift alert thresholds too loosely (missing real drift) or too tightly (creating alert fatigue).
  • Retraining models on drifted data without investigating whether the drift indicates a legitimate change in the world or a data quality issue.
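The threshold-tuning mistake above is commonly mitigated with tiered thresholds: a lower "warning" level that is logged for review and a higher "critical" level that pages someone. The sketch below uses the widely cited PSI rules of thumb (0.10 and 0.25) — illustrative heuristics, not values prescribed by COMPEL:

```python
# Hypothetical two-tier drift alerting. The 0.10 / 0.25 cutoffs follow
# the common PSI interpretation heuristic, not a COMPEL-mandated value.
def drift_status(drift_score, warn=0.10, critical=0.25):
    if drift_score >= critical:
        return "alert"   # page on-call / open a remediation ticket
    if drift_score >= warn:
        return "warn"    # log for periodic review, no page
    return "ok"

print(drift_status(0.03))  # ok
print(drift_status(0.15))  # warn
print(drift_status(0.40))  # alert
```

Separating the two tiers lets teams tighten the warning level to catch early drift without wiring every warning to a pager, which is what produces alert fatigue.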

References

  • COMPEL Framework — COMPEL Drift Management Protocol (Methodology)
  • NeurIPS — Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift (Research)
  • Google — Data Validation for Machine Learning (Technical Guide)

Frequently Asked Questions

What is the difference between data drift and concept drift?

Data drift (covariate shift) occurs when the statistical distribution of input features changes from the training data. Concept drift occurs when the relationship between inputs and outputs changes — the "rules" the model learned become less accurate. Both degrade performance, but they require different detection methods and remediation strategies.

How do you detect model drift in production?

Detection methods include statistical tests comparing production input distributions to training distributions (PSI, KS test, Jensen-Shannon divergence), monitoring prediction confidence scores, tracking downstream business metrics, and comparing model outputs against ground truth when available. COMPEL's Produce stage configures these monitoring instruments as part of production readiness.
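Of the statistical tests mentioned, PSI is simple enough to sketch in a few lines. The version below bins a feature by reference-sample deciles and compares bin frequencies — the decile binning and the 1e-6 floor are conventional implementation choices, not part of any particular standard:

```python
import numpy as np

def psi(reference, production, n_bins=10):
    """Population Stability Index: compares the production distribution
    of one feature to the reference (training) distribution, using
    quantile bins derived from the reference sample."""
    edges = np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1))
    # Clip into the reference range so every value lands in a bin
    # (np.histogram would otherwise drop out-of-range values).
    ref = np.clip(reference, edges[0], edges[-1])
    prod = np.clip(production, edges[0], edges[-1])
    expected = np.histogram(ref, bins=edges)[0] / len(reference)
    actual = np.histogram(prod, bins=edges)[0] / len(production)
    # Small floor avoids log(0) for empty bins.
    expected = np.clip(expected, 1e-6, None)
    actual = np.clip(actual, 1e-6, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))

rng = np.random.default_rng(1)
train = rng.normal(0.0, 1.0, 20_000)         # training-time feature sample
prod_stable = rng.normal(0.0, 1.0, 20_000)   # production, no shift
prod_shifted = rng.normal(1.0, 1.0, 20_000)  # production, mean shifted

psi_stable = psi(train, prod_stable)
psi_shifted = psi(train, prod_shifted)
print(f"PSI (stable):  {psi_stable:.4f}")
print(f"PSI (shifted): {psi_shifted:.4f}")
```

A common interpretation heuristic reads PSI below 0.10 as stable, 0.10–0.25 as moderate shift worth investigating, and above 0.25 as significant shift — the shifted sample above lands well past that last threshold.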

How often should models be checked for drift?

It depends on the use case and data velocity. High-frequency prediction systems (fraud detection, real-time pricing) should be monitored continuously. Lower-frequency systems (quarterly risk assessments) can use periodic monitoring aligned with data refresh cycles. COMPEL requires drift monitoring plans proportionate to the AI system's risk classification.