Workflow 03 of 6

Drift Monitoring.

Production sampling, online evals, paging integration.

Eval suites in CI tell you what's true today on synthetic cases. Drift Monitoring tells you what's true in production right now — on real traffic, with real model versions, before users find the regression. This workflow exists because models silently degrade: provider changes, retrieval-corpus drift, prompt edits with downstream effects. The Drift workflow makes that degradation visible and pageable.

What this is

The Drift workflow is the procedure for continuously sampling production AI traffic, running the same eval rubric against it, comparing scores to the dev/CI baseline, and paging on-call when scores degrade beyond a documented threshold. It's the production-side complement to the Eval workflow.
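As a minimal sketch of the comparison at the heart of this workflow, the check below takes a dev/CI baseline score and a batch of production scores for the same metric and flags degradation when the production mean drops by more than a documented threshold. The names `DriftCheck`, `check_drift`, and `max_drop` are illustrative assumptions, not part of any specific tool, and the sketch assumes higher scores are better.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DriftCheck:
    """Result of comparing one metric in production against its dev/CI baseline."""
    metric: str
    baseline: float    # score from the dev/CI eval suite
    production: float  # mean score over sampled production traffic
    max_drop: float    # documented threshold: largest tolerated drop (hypothetical name)

    @property
    def delta(self) -> float:
        # Negative delta means production scores lower than the baseline.
        return self.production - self.baseline

    @property
    def degraded(self) -> bool:
        # Assumes higher is better; invert the sign for metrics like latency.
        return self.delta < -self.max_drop


def check_drift(metric: str, baseline: float,
                prod_scores: list[float], max_drop: float) -> DriftCheck:
    """Build a DriftCheck from raw per-request production scores."""
    if not prod_scores:
        raise ValueError("no production scores to compare")
    return DriftCheck(metric, baseline, mean(prod_scores), max_drop)


if __name__ == "__main__":
    result = check_drift("faithfulness", baseline=0.90,
                         prod_scores=[0.80, 0.78, 0.82], max_drop=0.05)
    print(result.metric, round(result.delta, 3), result.degraded)
```

Keeping the baseline, the production score, and the threshold in one record means the object that triggers a page also carries the evidence for it.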

The procedure

  1. Log every input/output. Every AI-feature request and response is logged in production with enough context to reconstruct the eval case (system prompt version, retrieved chunks, model version).
  2. Sample for ongoing eval. A random or stratified sample of production traffic enters a separate online-eval pipeline.
  3. Compare prod scores to dev/CI. The same metrics. The same thresholds. A delta between the two is the signal.
  4. Wire paging. A production score that falls below its threshold pages on-call. Severity depends on which metric degraded: a faithfulness drop is severe; latency drift is lower priority.
  5. Run online evals continuously. Not just sampled batches — continuous monitoring of the live eval surface for the most critical features.

What gets scored

Maturity dimension: Drift monitoring. See the L1 → L5 progression for this dimension.

The five questions on the readiness self-assessment that score this dimension are the five rungs of the procedure above. Yes on a question means the artifact named in that step exists on disk in your repo today.

Phase 1 · in active development

This page is a thin first cut. Full procedural documentation — including reference DeepEval suite scaffolds, golden-set curation rubrics, and the audit-evidence checklist — lands in Phase 2 of the Institute build-out.

Find out where your team's Drift workflow stands.

The free readiness self-assessment scores the Drift workflow as one of six dimensions. Five minutes. Your weakest workflow is the one most worth fixing first.

Take the assessment →