Drift

data drift · concept drift
MLOps

In one line

When the data or relationships a model sees in production move away from what it was trained on, quietly degrading performance.

What it actually means

There are two flavours:

  • Data drift (covariate shift): the input distribution P(x) changes, but the mapping from input to label stays the same. New device types, new geographies, or a marketing campaign bringing in a different user mix all mean you are seeing inputs you weren't trained on.
  • Concept drift: the input distribution may look the same, but the relationship P(y|x) between inputs and the right answer has changed. Fraud patterns evolve, fashion preferences shift, an exploit you used to flag is now legitimate.

You usually detect drift by monitoring input feature distributions, prediction distributions, and, when you have it, label feedback.
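In practice, monitoring an input feature often reduces to a two-sample test between a training-time window and a live window. A minimal sketch using SciPy's ks_2samp; the feature values and the significance threshold here are illustrative, not a recommendation:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Hypothetical feature windows: training-time baseline vs. recent production.
baseline = rng.normal(0.0, 1.0, 5_000)
live = rng.normal(0.4, 1.0, 5_000)  # mean has shifted in production

stat, p = ks_2samp(baseline, live)
if p < 0.01:  # illustrative alert threshold
    print(f"possible data drift: KS={stat:.3f}, p={p:.1e}")
```

Note this only catches data drift: concept drift can leave P(x) untouched, so it needs label feedback (or a proxy for it) to detect.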

Why it matters

Drift is the reason “set and forget” doesn’t work for ML in production. A model that was great at launch decays — sometimes slowly, sometimes overnight. Without monitoring you only find out when a customer complains or revenue drops. Drift alerting, retraining schedules, and shadow models are the standard mitigations.

Example

feature: avg_session_seconds
trained on: mean=180, std=60
last 7 days: mean=240, std=90  ← PSI = 0.31, alert
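PSI here is the Population Stability Index, a histogram-based distance between the training and live distributions. A sketch of how a number like the one above gets computed (the exact 0.31 depends on the real data; 10 bins and the 0.25 alert line are conventional choices, not fixed rules):

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index of `actual` against the `expected` baseline."""
    # Bin edges come from the baseline (training-time) distribution.
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    # Clip live values into the baseline range so nothing falls outside the bins.
    actual = np.clip(actual, edges[0], edges[-1])
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the fractions to avoid log(0) on empty bins.
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
train = rng.normal(180, 60, 10_000)   # avg_session_seconds at training time
recent = rng.normal(240, 90, 10_000)  # last 7 days
print(f"PSI = {psi(train, recent):.2f}")  # well above the common 0.25 alert line
```

A common rule of thumb: PSI below 0.1 is stable, 0.1 to 0.25 is worth watching, above 0.25 triggers an alert.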

You’ll hear it when

  • Setting up model monitoring (Evidently, Arize, Fiddler, WhyLabs).
  • Diagnosing a quiet drop in offline metrics or business KPIs.
  • Defining a retraining cadence.
  • Reviewing a postmortem for an ML-driven product.
  • Discussing why a model that scored 0.92 last quarter scores 0.81 today.

Related terms