Set up Data Quality Monitors

Data quality monitoring reference guide

When To Monitor Data Quality

High-quality data is fundamental to building reliable, accurate machine learning models, and poor data quality can significantly compromise the value of predictions.

Monitor key data quality metrics to identify cardinality shifts, data type mismatches, missing data, and more, so you can find the root cause of model issues quickly.
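As a rough illustration of what these metrics measure, here is a minimal sketch in plain Python. It is not how Arize computes its metrics; the function name `data_quality_summary` and the sample values are hypothetical, and a missing value is assumed to be represented as `None`.

```python
def data_quality_summary(values, expected_type):
    """Summarize one feature column: missing rate, cardinality, type mismatches."""
    total = len(values)
    # Assumption: a missing value is represented as None.
    present = [v for v in values if v is not None]
    mismatched = [v for v in present if not isinstance(v, expected_type)]
    return {
        "percent_missing": 100.0 * (total - len(present)) / total if total else 0.0,
        "cardinality": len(set(present)),       # number of distinct values
        "type_mismatches": len(mismatched),     # values of an unexpected type
    }


# Hypothetical baseline and production samples for a categorical feature.
baseline = data_quality_summary(["CA", "NY", "CA", "TX"], str)
production = data_quality_summary(["CA", None, "ny", 7, "WA", "OR"], str)
```

Comparing the two summaries shows a rise in cardinality, one value of the wrong type, and a nonzero missing rate in the production sample, which are the kinds of shifts a data quality monitor is meant to surface.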

Data Quality Metrics

How To Monitor Data Quality

Step 1: Enable Data Quality Monitors

Monitor your data quality based on various metrics for your model use case.

You can enable managed data quality monitors automatically and tailor them to your needs or fully customize your data quality monitors.

Managed monitors are configured by Arize with default settings.

Using Managed Monitors

Use managed monitors if this is your first time monitoring your model, you want to try a new metric, or you want to simplify your setup workflow!

From the 'Setup Monitors' tab, enable the applicable data quality monitors based on various data quality metrics.

Enabling a managed data quality monitor creates a separate monitor for each applicable feature, based on the metric you select.

Enabled monitors appear on the monitors listing page.

Step 2: Configure Evaluation Window

An evaluation window defines the period of time your metric is calculated on (e.g., the previous 30 days). Increase this window to smooth out spiky or seasonal data. Decrease it so your monitors react faster to sudden changes.

A delay window is the gap between the evaluation time and the window of data used for the evaluation. It tells Arize how long to delay an evaluation. Change this if you have delayed actuals or predictions, so you evaluate your model on the most up-to-date data.
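To make the relationship between the two windows concrete, here is a minimal sketch of how the evaluated time range could be derived. It assumes the delay is subtracted from the evaluation time and the evaluation window then extends backward from that point; `evaluation_range` and its parameters are hypothetical names, not part of any Arize API.

```python
from datetime import datetime, timedelta


def evaluation_range(evaluation_time, window_hours, delay_hours=0):
    """Return the (start, end) of the data used for one evaluation."""
    # The delay window pushes the end of the data range back from "now".
    end = evaluation_time - timedelta(hours=delay_hours)
    # The evaluation window is the span of data that the metric is computed on.
    start = end - timedelta(hours=window_hours)
    return start, end


# Illustrative values: a 72-hour evaluation window with a 24-hour delay.
now = datetime(2023, 6, 10, 12, 0)
start, end = evaluation_range(now, window_hours=72, delay_hours=24)
# start is 2023-06-06 12:00 and end is 2023-06-09 12:00
```

With a delay of 0 hours the range ends at the evaluation time itself, so the most recent data is always included.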

Managed monitors create monitors for all applicable features for a given metric with preset basic configurations. To change these settings, edit the details of the monitor for the metric and feature you want to adjust. These settings apply to all managed monitors of the same type.

Managed Monitors Default Configurations:

  • Evaluation Window: 72 hours of production data

  • Delay Window: 0 hours

From the 'Monitors' tab, edit the monitor configurations in the 'Managed Data Quality Monitors' card.

Step 3: Calibrate Alerting Threshold

Arize monitors trigger an alert when your monitor crosses a threshold. You can use our dynamic automatic threshold or create a custom threshold. Thresholds trigger notifications, so you can adjust your threshold to be more or less noisy depending on your needs.

Learn more here about how an auto threshold value is calculated.
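For intuition only, here is a minimal sketch of one common way to derive a dynamic threshold: the mean of past metric values plus a number of standard deviations. This guide states that the automatic threshold's tolerance is set by a number of standard deviations, but the exact formula here is an assumption, as are the function name `auto_threshold`, the default of 2.0, and the sample history.

```python
from statistics import mean, stdev


def auto_threshold(history, num_std=2.0):
    """Upper threshold: mean of past values plus num_std sample standard deviations."""
    return mean(history) + num_std * stdev(history)


# Hypothetical history of a "percent empty" metric for one feature.
percent_empty_history = [1.0, 2.0, 3.0, 2.0, 2.0]
threshold = auto_threshold(percent_empty_history, num_std=2.0)

# A new value well above the historical range crosses the threshold.
is_triggered = 6.0 > threshold
```

Raising `num_std` widens the tolerance and produces fewer alerts; lowering it makes the monitor more sensitive and noisier.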

Managed monitors create monitors for all applicable features for a given metric with an automatic threshold. If you've had issues in the past, we suggest you review the threshold to make sure it is relevant to your needs.

How To Edit Managed Monitor's Threshold In Bulk

To edit all of your managed monitor auto thresholds in bulk, go to the 'Managed Data Quality Monitors' card in the 'Config' tab on the Monitors page. Change the tolerance of the existing automatic threshold by adjusting the number of standard deviations used in the calculation.

Note: this will override any individual managed data quality monitor auto threshold config, but will not change any manual thresholds configured for monitors.
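The override behavior in the note can be modeled with a small sketch. This is a hypothetical data model, not Arize's: `MonitorThreshold`, its fields, and `bulk_set_auto_std` are invented for illustration, and the sketch assumes that a monitor with a manual threshold is left completely untouched by a bulk edit.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class MonitorThreshold:
    name: str
    auto_num_std: float                     # tolerance of the automatic threshold
    manual_value: Optional[float] = None    # set only when a manual threshold is configured


def bulk_set_auto_std(monitors: List[MonitorThreshold], num_std: float) -> List[MonitorThreshold]:
    """Apply one standard-deviation setting to every auto-threshold monitor."""
    return [
        m if m.manual_value is not None      # manual thresholds are not changed
        else replace(m, auto_num_std=num_std)  # individual auto config is overridden
        for m in monitors
    ]


monitors = [
    MonitorThreshold("age: percent empty", auto_num_std=3.0),
    MonitorThreshold("state: cardinality", auto_num_std=2.0, manual_value=50.0),
]
updated = bulk_set_auto_std(monitors, num_std=4.0)
```

After the bulk edit, the first monitor uses 4.0 standard deviations even though it was individually set to 3.0, while the second monitor keeps its manual threshold of 50.0.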

How To Edit Managed Monitor's Threshold Per Monitor

Edit an individual managed monitor's threshold by referencing the 'Custom Monitor' tab.

Step 4: Set Notifications

Your Monitor Status provides an indication of your model health. Your monitor will be in one of three states:

  • Healthy: Sit back and relax! No action is needed

  • No Data: When the monitor does not have recent data

  • Triggered: When your monitor crosses the threshold value, indicating a model issue
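The three statuses can be sketched as a simple decision function. This is an illustration under stated assumptions, not Arize's implementation: "crossing" is taken to mean exceeding an upper threshold, a missing recent metric value is represented as `None`, and the names `MonitorStatus` and `monitor_status` are hypothetical.

```python
from enum import Enum
from typing import Optional


class MonitorStatus(Enum):
    HEALTHY = "Healthy"
    NO_DATA = "No Data"
    TRIGGERED = "Triggered"


def monitor_status(metric_value: Optional[float], threshold: float) -> MonitorStatus:
    """Map the latest metric value to a monitor status."""
    if metric_value is None:
        # Nothing recent to evaluate.
        return MonitorStatus.NO_DATA
    if metric_value > threshold:
        # Assumption: crossing means exceeding an upper threshold.
        return MonitorStatus.TRIGGERED
    return MonitorStatus.HEALTHY


statuses = [monitor_status(v, threshold=5.0) for v in (1.2, None, 7.5)]
```

Only the `TRIGGERED` status would lead to a notification in this model; `NO_DATA` is kept separate so that a gap in data is not mistaken for a healthy monitor.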

When a monitor is triggered, you are notified that your model has deviated from your threshold. You can send notifications via e-mail, PagerDuty, OpsGenie, or Slack. Learn more about notifications and integrations here.

All managed monitors will be set with the default configuration of 'No Contacts Selected'. To get the most out of Arize, set notifications so you are automatically notified when your monitor is triggered. You can edit notifications in bulk or edit notifications per monitor for enhanced customizability.

How To Set Managed Monitors Notifications In Bulk

To set up notifications for all managed monitors at once, configure data quality monitor notifications in the 'Config' tab on the Monitors page.

How To Edit Managed Monitor's Notifications Per Monitor

Set notifications per monitor to limit notifications, change alerting providers or add individual emails to the alert. Within each monitor, you can add a note and edit the monitor name to better suit naming conventions you may already have.

Edit an individual managed monitor's notification setting by referencing the 'Custom Monitor' tab.

Copyright © 2023 Arize AI, Inc