Drift Monitors

Drift monitoring reference guide

When To Monitor Drift

Models and their data change over time; this change is known as drift. Monitor drift in production to catch changes in the underlying data distribution over time, so you can identify and root-cause model issues before they impact your model's performance.
Monitoring feature and prediction drift is particularly useful as a proxy for performance monitoring when you receive delayed actuals (ground truth data).
🏃Common Questions:
🏎️ How do I track sudden drift over time?
Use a moving window of production data as your model baseline to catch sudden drift.
🐌 How do I track gradual drift over time?
Use training data as your model baseline to catch gradual drift.
🔔 My drift alerts are too noisy/not noisy enough
Raise or lower your threshold relative to the default standard deviation value to tune how often your alerts fire.
🔑 Can I monitor a few key features instead of all of them?
Create custom drift monitors based on individual features by following the 'Custom Monitors' tab in the guide below.
🔍 What are the leading indicators of performance degradation?
Measure feature and prediction drift to indicate performance degradation. Arize supports various drift metrics based on your use case.
🤖 Can I create drift monitors programmatically?
Use the GraphQL API to programmatically create drift monitors.

Drift Metrics By Use Case

Arize offers various distributional drift metrics to choose from when setting up a monitor. Each metric is tailored to a specific use case; refer to this guide to help choose the appropriate metric for various ML use cases.
  • Population Stability Index (PSI): Less influenced by sample size and produces fewer false positives than the Kolmogorov-Smirnov (KS) test or Earth Mover's Distance, making it suitable for datasets with expected fluctuations. However, PSI can be affected by the chosen binning strategy. PSI is symmetric: the value is the same whichever dataset is treated as the baseline.
  • Euclidean distance: Checks whether the centroid of the production data has moved away from the centroid of the baseline data. Used for unstructured data types.
  • Kullback-Leibler (KL) divergence: Less sensitive than metrics such as the KS statistic, so it produces fewer false positives and suits datasets with expected fluctuations. Its calculation can be influenced by the chosen binning strategy, but it is less affected by sample size. Unlike PSI, KL divergence is non-symmetric: the divergence from dataset A to B is not the same as from B to A.
  • Jensen-Shannon (JS) distance: Similar to KL divergence, with two distinct advantages: it is always finite and symmetric. It offers an interpretable score ranging from 0 (identical distributions) to 1 (completely different distributions with no overlap). Its sensitivity is moderate relative to PSI and KL and lower than KS, and its results can still be influenced by the chosen binning strategy.
  • Kolmogorov-Smirnov (KS) test: A non-parametric metric that requires no assumptions about the underlying data and no binning. This makes it sensitive enough to detect even slight differences in data distribution, including in large datasets, though that sensitivity may also produce more false positives. A smaller p-value from KS signifies a more confident drift detection.
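To make the differences between these metrics concrete, here is a minimal sketch using only the Python standard library. It is not Arize's implementation: the function names, the smoothing constant `EPS`, and the example bin counts are assumptions chosen for illustration, and the binned metrics take pre-computed bin counts as input.

```python
import math
from bisect import bisect_right

EPS = 1e-6  # assumed smoothing constant so empty bins never produce log(0)

def to_distribution(counts):
    """Turn raw bin counts into smoothed proportions that sum to 1."""
    total = sum(counts)
    smoothed = [c / total + EPS for c in counts]
    norm = sum(smoothed)
    return [s / norm for s in smoothed]

def kl_divergence(p, q):
    """KL(P || Q) in nats; swapping the arguments changes the result."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def psi(p, q):
    """Population Stability Index; symmetric in its arguments."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logs, so the result lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd_nats = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd_nats / math.log(2), 0.0))

def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs; works on raw values, no bins."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

def centroid_distance(baseline_vectors, production_vectors):
    """Euclidean distance between the mean vectors of two groups."""
    def centroid(vectors):
        return [sum(column) / len(vectors) for column in zip(*vectors)]
    return math.dist(centroid(baseline_vectors), centroid(production_vectors))

# Hypothetical bin counts for one feature in the baseline and in production.
p = to_distribution([50, 30, 20])
q = to_distribution([10, 30, 60])

print("PSI:", psi(p, q))
print("KL(baseline || production):", kl_divergence(p, q))
print("KL(production || baseline):", kl_divergence(q, p))
print("JS distance:", js_distance(p, q))
```

Running it shows the properties described above: `psi` gives the same value in both directions, the two `kl_divergence` calls differ, and `js_distance` stays between 0 and 1. Changing the bin edges that produce the counts changes all three binned metrics, while `ks_statistic` is unaffected because it never bins.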

How To Monitor Drift

Step 1: Enable Drift Monitors

Monitor how your model drifts using the drift metrics that fit your model use case.
You can enable managed drift monitors automatically and tailor them to your needs or fully customize your drift monitors.
Managed Monitors
Monitors configured by Arize with default settings for your threshold and evaluation window. These are meant to be simple to enable and understand, with sensible defaults.
Custom Monitors
Fully customizable monitors based on various dimensions such as features, tags, evaluation windows, baselines, etc.

Managed monitors are configured by Arize with default settings.

Using Managed Monitors

Use managed monitors if this is your first time monitoring your model, you want to try a new metric, or you want to simplify your setup workflow.
From the 'Setup Monitors' tab, enable the applicable drift monitors based on prediction or feature drift.
Each managed drift monitor you enable creates a separate monitor for every applicable feature, using the metric you selected. Before you enable it, Arize indicates how many monitors will be created.
Enabled monitors appear on the monitors listing page.

Using Custom Monitors

Since managed monitors create drift monitors for all applicable features with default settings, use custom monitors if you want to monitor a specific feature, tag, or model dimension that matters the most to you.
From the 'Setup Monitors' or 'Monitor Listing' tab, click 'Create Custom Monitor' to get started.
From there, select the dimension category and dimension to monitor in Step 1: Define the Metric.

Step 2: Configure Evaluation Window

An evaluation window defines the period of time your metric is calculated on (e.g. the previous 30 days). Increase this window to smooth out spiky or seasonal data. Decrease it so your monitors react faster to sudden changes.
A delay window is the gap between the evaluation time and the window of data used for the evaluation; it tells Arize how long to delay an evaluation. Change this if you have delayed actuals or predictions, so you evaluate your model on the most up-to-date data.
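The relationship between the evaluation time, the evaluation window, and the delay window can be sketched as simple date arithmetic. This is an illustrative sketch, not Arize code: `evaluation_range` is a hypothetical helper, and the timestamps are made up.

```python
from datetime import datetime, timedelta

def evaluation_range(evaluated_at, window, delay=timedelta(0)):
    """Return the (start, end) of the data used for one evaluation.

    The delay shifts the whole window back from the evaluation time;
    the window sets how much data before that point is included.
    """
    end = evaluated_at - delay
    start = end - window
    return start, end

now = datetime(2024, 1, 10, 12, 0)

# 72-hour window with no delay: the data runs right up to the evaluation time.
start, end = evaluation_range(now, window=timedelta(hours=72))

# Same window with a 24-hour delay: the most recent day is left out,
# giving late-arriving actuals or predictions time to be ingested.
delayed_start, delayed_end = evaluation_range(
    now, window=timedelta(hours=72), delay=timedelta(hours=24)
)

print(start, end)
print(delayed_start, delayed_end)
```

A longer `window` averages the metric over more data, which smooths spikes, while a longer `delay` trades freshness for completeness.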
Managed monitors create monitors for all applicable features for a given metric with preset basic configurations. Choose the metric and feature monitor you want to change, then edit that monitor's details. These settings apply to all managed monitors of the same type.

Managed Monitors Default Configurations:

  • Evaluation Window: 72 hours of production data
  • Delay Window: 0 hours
From the 'Monitors' tab, edit the monitor configurations in the 'Managed Drift Monitors' card.
For custom monitors, define the settings that go into calculating and monitoring your metric. In the monitor settings, configure the evaluation window in Step 2: Define the Data.

Custom Monitor Dimensions

  • Evaluation window (default: last 72 hours): Increase this to smooth out spikes or seasonality. Decrease this to react faster to potential incidents.
  • Evaluation delay (default: delayed by 0 seconds): The gap between the evaluation time and the window of data used for the evaluation. Use this if your predictions or actuals have an ingestion lag.
  • Model version: Filter your metric to only use certain model versions. This defaults to include all model versions.
  • Filters: You can filter using a variety of operators on any dimension in your model. The dimension can be a prediction, actual, feature, or tag.

Step 3: Configure A Model Baseline

A model baseline is the reference dataset that your current data is compared against to identify model changes, enable analysis, and find the root cause of performance degradation. A model baseline can be from any environment or time period.
Arize requires a baseline set at the model level regardless of monitor type (custom or managed). Arize automatically configures all new models with a default baseline, but you can pick a new model baseline to use across all monitors, or set a custom baseline per monitor.
By default, all monitors use a model baseline with a moving time range over your model's production data, spanning two weeks and delayed by three days.
If the default baseline doesn't suit your needs, Arize provides the flexibility to choose a baseline from either production data or pre-production data (training and validation).
This baseline will be configured on the model level and can be used for both managed and custom monitors.
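The default described above (a two-week moving range delayed by three days) can be sketched as date arithmetic. This is a hedged illustration rather than Arize's implementation: `moving_baseline_range` and `ranges_overlap` are hypothetical helpers, and treating ranges as half-open intervals is an assumption. The 72-hour evaluation window used for comparison is the managed monitor default.

```python
from datetime import datetime, timedelta

def moving_baseline_range(now, span=timedelta(weeks=2), delay=timedelta(days=3)):
    """Return the (start, end) of a moving baseline that trails the current time."""
    end = now - delay
    return end - span, end

def ranges_overlap(a, b):
    """True if two half-open (start, end) ranges share any time."""
    return a[0] < b[1] and b[0] < a[1]

now = datetime(2024, 1, 20)
baseline = moving_baseline_range(now)

# The 72-hour evaluation window ends at the current time.
evaluation = (now - timedelta(hours=72), now)

# A longer, hypothetical 96-hour evaluation window for comparison.
longer_evaluation = (now - timedelta(hours=96), now)

print(baseline)
print(ranges_overlap(baseline, evaluation))
print(ranges_overlap(baseline, longer_evaluation))
```

Under these assumptions, the three-day delay places the end of the baseline exactly where a 72-hour evaluation window begins, so the two do not share data. If you lengthen the evaluation window, consider whether the baseline delay should grow with it.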

When To Configure A Baseline With Production Data

Production baselines are helpful if your training or validation data is unavailable or unreliable as a reference point.
  • Moving Production Baseline: A dynamic baseline that moves with the current time. It allows detection of abrupt changes in feature and model distributions, avoids overlap with drift evaluation windows, and prevents issues associated with fixed, outdated baselines.
  • Fixed Production Baseline: You can choose a fixed time period in production as a baseline. This is useful if you want a fixed reference point but don't have or want to use pre-production data.
From the 'Dataset' or 'Config' tab, click on the 'Configure Baseline' button where you will be prompted to pick your baseline from production data.

When To Configure A Baseline With Pre-Production Data

Pre-production baselines use your training or validation datasets. This can be useful for models where production data is expected to be similar to the pre-production data.
  • Automatic Baseline Update: Selects the latest uploaded pre-production dataset as the baseline for training/validation. The newest dataset is determined by the most recent upload, ignoring the date range, version, or batch. Ideal for frequent model training.
  • Choose My Own Dataset: Best for users who train infrequently or who want to manually choose the baseline whenever a new pre-production dataset is uploaded. With this option, the baseline does not change until a user updates it in the model config.
From the 'Dataset' or 'Config' tab, click on the 'Configure Baseline' button where you will be prompted to pick your baseline from pre-production datasets.
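The 'Automatic Baseline Update' rule above (newest upload wins, regardless of date range, version, or batch) can be illustrated with a small sketch. The `PreProdDataset` record and its field names are hypothetical and only model the attributes the rule mentions; this is not an Arize API.

```python
from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class PreProdDataset:
    name: str
    uploaded_at: datetime  # when the dataset was uploaded
    data_end: date         # last day covered by the data itself
    version: str

def automatic_baseline(datasets):
    """Pick the most recently uploaded dataset, ignoring date range and version."""
    return max(datasets, key=lambda d: d.uploaded_at)

datasets = [
    PreProdDataset("train_v2", datetime(2024, 3, 1), date(2024, 2, 28), "v2"),
    # Uploaded later, even though it holds older data from an older version.
    PreProdDataset("train_v1_backfill", datetime(2024, 3, 5), date(2023, 12, 31), "v1"),
]

chosen = automatic_baseline(datasets)
print(chosen.name)
```

Here the backfilled `v1` dataset becomes the baseline because it was uploaded last. If that is not what you want after a backfill, 'Choose My Own Dataset' keeps the baseline fixed until you change it.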
Beyond the model baseline, you can customize individual drift monitor baselines to identify changes at the feature level using:
  • A different fixed time range: use when you expect large changes for a specific feature
  • A moving time range: use to identify fluctuating changes for a feature over time
  • Different versions: use to compare distribution changes between your current model and other versions
This can be configured for both managed and custom monitors by clicking 'Custom Baseline' in Step 2: Define the Data of the edit managed monitor or custom monitor form.
Custom baselines can use filters to isolate new, changing, or problematic features and see how different features, tags, and other dimensions affect your model's distribution.
Enable filters by clicking the 'Filter Baseline' button in Step 2: Define the Data.

Step 3: Calibrate Alerting Threshold

Arize monitors trigger an alert when your monitor crosses a threshold. You can use our dynamic automatic threshold or create a custom threshold. Thresholds trigger notifications, so you can adjust your threshold to be more or less noisy depending on your needs.
Automatic Threshold
Automatic thresholds set a dynamic value for each data point. Arize generates an auto threshold when there are at least 14 days of production data to determine a trend.
Custom Threshold
Set the threshold to any value for additional flexibility. The threshold in the monitor preview will update as you change this value, so you can backtest the threshold value against your metric history.
Learn more here about how an auto threshold value is calculated.
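As a rough intuition for how a standard-deviation-based threshold behaves, the sketch below derives a threshold from metric history and backtests it. It is an assumption-laden illustration, not Arize's actual calculation: it computes one value from the whole history (mean plus a number of standard deviations) instead of a dynamic value per data point, it assumes one metric value per day, and `auto_threshold`, `count_alerts`, and the sample PSI values are all hypothetical.

```python
from statistics import mean, stdev

MIN_POINTS = 14  # assumes one metric value per day, mirroring the 14-day minimum

def auto_threshold(history, num_std=2.0):
    """Return mean + num_std * standard deviation, or None with too little history.

    The default of 2.0 is an arbitrary choice for this sketch.
    """
    if len(history) < MIN_POINTS:
        return None
    return mean(history) + num_std * stdev(history)

def count_alerts(history, threshold):
    """Backtest: how many past values would have crossed the threshold."""
    return sum(1 for value in history if value > threshold)

# Hypothetical daily PSI values; the last two days drift upward.
daily_psi = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08,
             0.10, 0.11, 0.09, 0.12, 0.10, 0.25, 0.30]

for num_std in (1, 2, 3):
    threshold = auto_threshold(daily_psi, num_std)
    print(num_std, round(threshold, 3), count_alerts(daily_psi, threshold))
```

Fewer standard deviations give a tighter threshold that flags more of the history, and more standard deviations give a looser one. The same `count_alerts` backtest works for a custom threshold: pass any fixed value instead of the computed one.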
Managed monitors create monitors for all applicable features for a given metric with an automatic threshold. If you've had alerting issues in the past, review the threshold to make sure it is relevant to your needs.

How To Edit Managed Monitor's Threshold In Bulk

To edit all of your managed monitor auto thresholds in bulk, go to the 'Managed Drift Monitors' card in the 'Config' tab on the Monitors page and adjust the number of standard deviations used in the calculation. This changes the tolerance of the existing automatic threshold.
Note: this will override any individual managed drift monitor auto threshold config, but will not change any manual thresholds configured for monitors.

How To Edit Managed Monitor's Threshold Per Monitor

Edit an individual managed monitor's threshold by referencing the 'Custom Monitor' tab.
Define the threshold value that will trigger an alert within Step 3: Define the Alerting.
This section allows you to:
  • Set a specific (custom) threshold if you already know the precise threshold value to use
  • Automatically create a dynamic threshold. You can edit your auto threshold sensitivity by changing the standard deviation number. Lowering the number of standard deviations will increase the sensitivity, and raising it will decrease the sensitivity.

Step 4: Set Notifications

Your Monitor Status provides an indication of your model health. Your monitor will either be:
  • Healthy: Sit back and relax! No action is needed
  • No Data: When the monitor does not have recent data
  • Triggered: When your monitor crosses the threshold value, indicating a model issue
When a monitor is triggered, you can be notified that your model has deviated from your threshold. You can send notifications via e-mail, PagerDuty, OpsGenie, or Slack. Learn more about notifications and integrations here.
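The three statuses can be thought of as a simple decision on the latest metric value. The sketch below is illustrative only: `MonitorStatus` and `monitor_status` are hypothetical names, and it assumes a drift monitor triggers when the metric rises above its threshold and that a missing value means no recent data.

```python
from enum import Enum

class MonitorStatus(Enum):
    HEALTHY = "Healthy"
    NO_DATA = "No Data"
    TRIGGERED = "Triggered"

def monitor_status(metric_value, threshold):
    """Classify one evaluation; metric_value is None when there is no recent data."""
    if metric_value is None:
        return MonitorStatus.NO_DATA
    if metric_value > threshold:
        return MonitorStatus.TRIGGERED
    return MonitorStatus.HEALTHY

for value in (0.05, None, 0.31):
    print(value, monitor_status(value, threshold=0.25).value)
```

Only the Triggered status would send a notification in this model, which is why the threshold calibration in the previous step directly controls how noisy your alerts are.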
All managed monitors will be set with the default configuration of 'No Contacts Selected'. To get the most out of Arize, set notifications so you are automatically notified when your monitor is triggered. You can edit notifications in bulk or edit notifications per monitor for enhanced customizability.

How To Set Managed Monitors Notifications In Bulk

In the 'Config' tab on the Monitors page, configure drift monitor notifications for all managed monitors at once. This is an easy way to fully set up monitors in Arize.

How To Edit Managed Monitor's Notifications Per Monitor

Set notifications per monitor to limit notifications, change alerting providers or add individual emails to the alert. Within each monitor, you can add a note and edit the monitor name to better suit naming conventions you may already have.
Edit an individual managed monitor's notification setting by referencing the 'Custom Monitor' tab.
Define where your alerts are sent within Step 4: Define the Notification.
  • Monitor Name: The monitor name is used to identify the monitor and will be used in the notification.
  • Send Notifications to: Choose your notification contacts. You can select multiple contacts to receive notifications. Learn more here.
  • Notes: Add notes to your monitor to help the alert recipient understand the monitor and quickly debug any issues.