Drift monitoring reference guide
Models and their data change over time; this change is known as drift. Monitor model drift in production to catch underlying data distribution changes and to help identify and root-cause model issues before they impact your model's performance.
Monitoring feature and prediction drift is particularly useful as a proxy for performance monitoring if you receive delayed actuals (ground truth data).
Arize offers various distributional drift metrics to choose from when setting up a monitor. Each metric is tailored to a specific use case; refer to this guide to help choose the appropriate metric for various ML use cases.
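For intuition, one widely used distributional drift metric is the Population Stability Index (PSI). The sketch below is a conceptual illustration of how such a metric compares a current distribution against a baseline; the function name and binning choices are our own, not Arize's implementation:

```python
import numpy as np

def psi(baseline, current, bins=10):
    """Population Stability Index between two samples.

    0 means the distributions match; larger values mean more drift.
    Conceptual sketch only -- real implementations handle categorical
    features and out-of-range values more carefully.
    """
    # Derive bin edges from the baseline distribution; in this simple
    # sketch, current values outside the baseline range fall out of the bins.
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_counts, _ = np.histogram(baseline, bins=edges)
    curr_counts, _ = np.histogram(current, bins=edges)
    base_pct = base_counts / base_counts.sum()
    curr_pct = curr_counts / curr_counts.sum()
    # Clip to a small epsilon to avoid division by zero and log(0).
    eps = 1e-6
    base_pct = np.clip(base_pct, eps, None)
    curr_pct = np.clip(curr_pct, eps, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))
```

An identical sample yields a PSI of 0, while a shifted distribution produces a clearly positive score, which is what a drift monitor thresholds on.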
Monitor how your model drifts using the drift metrics suited to your model's use case.
You can enable managed drift monitors automatically and tailor them to your needs or fully customize your drift monitors.
Managed monitors are configured by Arize with default settings for your threshold and evaluation window. They are meant to be simple to enable and understand, with sensible defaults.
Custom monitors are fully customizable based on various dimensions such as features, tags, evaluation windows, baselines, etc.
Use managed monitors if this is your first time monitoring your model, you want to try a new metric, or you want to simplify your setup workflow!
From the 'Setup Monitors' tab, enable the applicable drift monitors based on prediction or feature drift.
Enable managed drift monitors from the setup monitors tab
Managed drift monitors will create a separate drift monitor based on your desired metric across all applicable features.
Managed drift monitors will indicate how many monitors they will enable
Enabled monitors are represented in the monitors listing page
Enabled monitors in the monitors listing page
Since managed monitors create drift monitors for all applicable features with default settings, use custom monitors if you want to monitor a specific feature, tag, or model dimension that matters the most to you.
From the 'Setup Monitors' or 'Monitor Listing' tab, click 'Create Custom Monitor' to get started.
Enable custom drift monitors from the 'Monitor Listing' tab
From there, select the dimension category and dimension to monitor in Step 1: Define the Metric.
Custom monitor page
An evaluation window defines the period of time your metric is calculated on (i.e. the previous 30 days). Increase this window to smooth out spiky or seasonal data. Decrease this for your monitors to react faster to sudden changes.
A delay window is the gap between the evaluation time and the window of data used for the evaluation; it tells Arize how long to delay an evaluation. Change this if you have delayed actuals or predictions, so you evaluate your model on the most up-to-date data.
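The interaction between the two windows reduces to simple date arithmetic. The sketch below is a conceptual illustration (the function name is hypothetical, not an Arize API):

```python
from datetime import datetime, timedelta

def evaluation_range(now, evaluation_window_hours, delay_window_hours=0):
    """Return the (start, end) of the data window a monitor evaluates.

    The delay window shifts the entire evaluation window back in time,
    so late-arriving actuals or predictions have time to land before
    the metric is computed.
    """
    end = now - timedelta(hours=delay_window_hours)
    start = end - timedelta(hours=evaluation_window_hours)
    return start, end
```

For example, a 72-hour evaluation window with a 24-hour delay evaluated at noon on January 10 covers noon January 6 through noon January 9.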
Managed monitors create monitors for all applicable features for a given metric with the following preset configurations. To change them, edit the monitor details for the metric and feature monitor in question; these settings apply to all managed monitors of the same type.
- Evaluation Window: 72 hours of production data
- Delay Window: 0 hours
From the 'Monitors' tab, edit the monitor configurations in the 'Managed Drift Monitors' card.
A model baseline is the reference dataset to compare your current data to identify model changes, enable analysis, and identify the root cause of performance degradation. A model baseline can be from any environment or time period.
Arize requires a set baseline applied at the model level regardless of monitor type (custom or managed monitors). Arize automatically configures all new models with a default baseline, but you can pick a new model baseline to use across all monitors, or set a custom baseline per monitor.
Edit Model Baseline
Edit Monitor Baseline
By default, all monitors are configured with a model baseline with a moving time range from your model's production data spanning a period of two weeks, delayed by three days.
If the default baseline doesn't suit your needs, Arize provides the flexibility to choose a baseline from either production data or pre-production data (training and validation).
This baseline will be configured on the model level and can be used for both managed and custom monitors.
Production baselines are helpful if your training or validation data is unavailable or unreliable as a reference point.
- Moving Production Baseline: A dynamic baseline, adjustable to the current time, allows detection of abrupt changes in feature and model distributions, avoids overlap with drift evaluation windows, and prevents issues associated with fixed, outdated baselines.
- Fixed Production Baseline: You can choose a fixed time period in production as a baseline. This is useful if you want a fixed reference point but don't have or want to use pre-production data.
From the 'Dataset' or 'Config' tab, click on the 'Configure Baseline' button where you will be prompted to pick your baseline from production data.
When To Configure A Baseline With Pre-Production Data
Pre-production baselines use your training or validation datasets. This can be useful for models where production data is expected to be similar to the pre-production data.
- Automatic Baseline Update: Selects the latest uploaded pre-production dataset as the baseline for training/validation. The newest dataset is determined by the most recent upload, ignoring the date range, version, or batch. Ideal for frequent model training.
- Choose My Own Dataset: For users who train infrequently, or want the manual step of choosing the baseline when a new pre-production dataset is uploaded, manually choosing the dataset for the baseline is the best option. When selecting this, the baseline will not change until a user goes to the model config to manually change it.
From the 'Dataset' or 'Config' tab, click on the 'Configure Baseline' button where you will be prompted to pick your baseline from pre-production datasets.
Customize individual drift monitor baselines to identify changes on a feature level using:
- A different fixed time range: when you expect large changes for a specific feature
- A moving time range: to identify fluctuating changes for a feature over time
- Different versions: to compare distribution changes between your current model and previous versions
This can be configured for both managed and custom monitors by clicking 'Custom Baseline' in Step 2: Define the Data of the edit managed monitor or custom monitor form.
Custom baselines can utilize filters in the case of new, changing, or problematic features to see how different features, tags, or other dimensions affect your model's distribution.
Enable filters by clicking the 'Filter Baseline' button in Step 2: Define the Data.
Arize monitors trigger an alert when your monitor crosses a threshold. You can use our dynamic automatic threshold or create a custom threshold. Thresholds trigger notifications, so you can adjust your threshold to be more or less noisy depending on your needs.
Automatic thresholds set a dynamic value for each data point. Arize generates an auto threshold when there are at least 14 days of production data to determine a trend.
Set the threshold to any value for additional flexibility. The threshold in the monitor preview will update as you change this value, so you can backtest the threshold value against your metric history.
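Backtesting a custom threshold amounts to checking which past metric values would have crossed it. The hypothetical helper below illustrates the idea (it is not part of Arize; the monitor preview does this for you visually):

```python
def backtest_threshold(metric_history, threshold):
    """Return the indices of past evaluations that would have alerted.

    A long result list suggests the threshold is noisy; an empty list
    may mean it is too lenient to catch real drift.
    """
    return [i for i, value in enumerate(metric_history) if value > threshold]
```

For a drift-metric history of `[0.10, 0.30, 0.20]`, a threshold of 0.25 would have fired once, at index 1.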
Managed monitors create monitors for all applicable features for a given metric with an automatic threshold. If you've had issues in the past, we suggest reviewing the threshold to make sure it is relevant to your needs.
To edit all of your managed monitor auto thresholds in bulk, change the tolerance of an existing automatic threshold by adjusting the number of standard deviations used in the calculation. This setting lives in the 'Managed Drift Monitors' card in the 'Config' tab on the Monitors page.
Note: this will override any individual managed drift monitor auto threshold config, but will not change any manual thresholds configured for monitors.
Edit an individual managed monitor's threshold by referencing the 'Custom Monitor' tab.
Define the threshold value that will trigger an alert within Step 3: Define the Alerting.
This section allows you to:
- Set a specific (custom) threshold if you already know the precise threshold value to use
- Automatically create a dynamic threshold. You can edit your auto threshold sensitivity by changing the number of standard deviations: lowering it increases the sensitivity, and raising it decreases the sensitivity.
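A standard-deviation-based threshold can be sketched as follows. This is a simplified illustration of the idea, not Arize's exact calculation (Arize's auto thresholds are dynamic per data point), and the function name is our own:

```python
import statistics

def auto_threshold(metric_history, num_std=2.0):
    """Derive an alerting threshold from recent metric history.

    Alert when the drift metric exceeds mean + num_std * std of its
    recent history. A smaller num_std means a tighter threshold and
    more alerts (higher sensitivity); a larger num_std means fewer.
    """
    mean = statistics.fmean(metric_history)
    std = statistics.pstdev(metric_history)
    return mean + num_std * std
```

With a stable metric history, a 1-standard-deviation threshold sits just above the mean, while a 3-standard-deviation threshold sits well above every observed value.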
Your Monitor Status provides an indication of your model health. Your monitor will either be:
- Healthy: Sit back and relax! No action is needed
- No Data: When the monitor does not have recent data to evaluate
- Triggered: When your monitor crosses the threshold value, indicating a model issue
When a monitor is triggered, Arize notifies you that your model has deviated from your threshold. You can send notifications via e-mail, PagerDuty, OpsGenie, or Slack. Learn more about notifications and integrations here.
All managed monitors will be set with the default configuration of 'No Contacts Selected'. To get the most out of Arize, set notifications so you are automatically notified when your monitor is triggered. You can edit notifications in bulk, or edit notifications per monitor for enhanced customizability.
For an easy way to fully set up monitors in Arize, configure drift monitor notifications for all managed monitors in the 'Config' tab on the Monitors page.
Set notifications per monitor to limit notifications, change alerting providers, or add individual emails to an alert. Within each monitor, you can also add a note and edit the monitor name to better suit any naming conventions you may already have.
Edit an individual managed monitor's notification setting by referencing the 'Custom Monitor' tab.
Define where your alerts are sent within Step 4: Define the Notification.