Arize AI

Drift Monitors


Drift monitors allow you to track feature drift and model prediction drift relative to a baseline sample.
The core metric used for drift monitors is a modified version of the Population Stability Index (PSI). The algorithm and modifications are covered in the whitepaper linked at the bottom of this page. The key modifications allow for graceful handling of out-of-distribution events for both categorical and numeric distributions.
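As a rough illustration of the metric, the sketch below computes the standard (unmodified) PSI between a baseline and a production distribution given as per-bin counts. It is not Arize's modified algorithm, which is described in the whitepaper. The function name `psi`, the `eps` floor used for bins that are empty in one distribution, and the sample counts are all assumptions made for this example.

```python
import math
from collections import Counter


def psi(baseline_counts, production_counts, eps=1e-6):
    """Standard PSI over the union of bins seen in either distribution.

    eps is an assumed floor applied to empty-bin proportions so the
    logarithm stays finite; it is not taken from the Arize whitepaper.
    """
    bins = set(baseline_counts) | set(production_counts)
    baseline_total = sum(baseline_counts.values())
    production_total = sum(production_counts.values())
    total = 0.0
    for b in bins:
        # Proportion of each sample that falls in this bin, floored at eps.
        q = max(baseline_counts.get(b, 0) / baseline_total, eps)
        p = max(production_counts.get(b, 0) / production_total, eps)
        total += (p - q) * math.log(p / q)
    return total


# Hypothetical categorical feature: category "d" appears only in production.
baseline = Counter({"a": 50, "b": 30, "c": 20})
production = Counter({"a": 30, "b": 30, "c": 30, "d": 10})

print(psi(baseline, baseline))    # identical distributions give 0.0
print(psi(baseline, production))  # a positive value that grows with the shift
```

Flooring empty bins is only one simple way to keep an unseen category from producing an infinite value; the whitepaper covers the approach actually used for out-of-distribution events.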
The different areas of monitoring drift in the platform fall into three categories:
  • Feature Drift
  • Prediction Drift
  • Actuals Drift (Concept Drift Tracking)
Feature and Data Tracking

Drift Monitor Creation

There are multiple ways to create drift monitors: from the Monitors tab or from the model's Overview page. To learn more, visit the Set up Model Monitors page.
Drift monitors can be set up against different baselines, which are chosen based on the environment in which the data is tracked. Visit the environments page to learn more about model baselines.

Embedding Drift Monitors

For monitoring unstructured data, users can create embedding drift monitors. Embedding drift is tracked by calculating the Euclidean distance between embedding vectors. To create an embedding drift monitor, follow the flow above and select an embedding feature in the "Drift of" section of the monitor form. After creating an embedding drift monitor, click "Troubleshoot Drift" to navigate to the embedding details page for further troubleshooting (more here).
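A minimal sketch of a Euclidean-distance drift score is shown below. It assumes the distance is taken between the average (centroid) embedding of the baseline sample and that of the production sample; this page does not say how the vectors are aggregated, so treat that choice, along with the helper names `centroid` and `embedding_drift`, as assumptions for illustration.

```python
import math


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def embedding_drift(baseline_vectors, production_vectors):
    """Euclidean distance between the two sample centroids (assumed aggregation)."""
    return math.dist(centroid(baseline_vectors), centroid(production_vectors))


# Hypothetical 2-dimensional embeddings.
baseline_vectors = [[0.0, 0.0], [2.0, 0.0]]    # centroid is [1.0, 0.0]
production_vectors = [[1.0, 3.0], [1.0, 5.0]]  # centroid is [1.0, 4.0]

print(embedding_drift(baseline_vectors, production_vectors))
```

A score of zero means the two centroids coincide; larger values mean the production embeddings have moved further from the baseline on average.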

Automatic Thresholds

Automatic thresholds are set by Arize when there is sufficient production data to determine a trend. The threshold is determined by looking back at a historical time window for a metric and calculating the variance of data in that time period (more here). Automatic thresholding is enabled by default for new monitors. Users can toggle automatic thresholds on or off from the “Edit monitor” configuration.

Setting Custom Thresholds

With auto thresholds turned off, the user is free to set the threshold to any value. We display the mean and standard deviation values used to calculate the auto threshold, and the user can change the number of standard deviations above or below the mean to calculate a suggested threshold.
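The suggested-threshold calculation described above can be sketched as the mean of the metric's history plus a chosen number of standard deviations. The function name `suggested_threshold`, the default of two standard deviations, the use of the sample standard deviation, and the example history are assumptions for illustration, not values taken from the platform.

```python
import statistics


def suggested_threshold(metric_history, num_std=2.0):
    """Mean of the historical metric values plus num_std standard deviations.

    A negative num_std places the threshold below the mean.
    Uses the sample standard deviation (an assumption for this sketch).
    """
    mean = statistics.mean(metric_history)
    std = statistics.stdev(metric_history)
    return mean + num_std * std


# Hypothetical PSI values recorded over a lookback window.
history = [0.1, 0.2, 0.3]

print(suggested_threshold(history))               # two standard deviations above the mean
print(suggested_threshold(history, num_std=-1.0)) # one standard deviation below the mean
```

Raising the number of standard deviations makes the monitor less sensitive, since the metric must move further from its historical mean before crossing the threshold.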

Viewing a Drift Monitor

Once a drift monitor is set up, you can view feature, performance, or prediction monitors from the Monitors page or from any Model Overview page.
The monitor view shows both the historical tracking of the Population Stability Index (PSI) and the current distribution view that generated the PSI value.

Historical View of Drift

The Metric History view plots the values recorded by the monitor over time. This time series view allows teams to visualize the change in PSI over the period the monitor was enabled.
The distribution view allows teams to understand the underlying distributions that produced the PSI measure. It shows the reference (baseline) distribution alongside the production distribution for comparison.

Download our whitepaper to learn more about the general approach to drift analysis:

Statistical Distances for ML Update.pdf
Drift Whitepaper
Questions? Email us at [email protected] or Slack us in the #arize-support channel