Setting Up Monitors

Learn the three ways to configure monitors for your models

Why Monitor Your ML Models?

Continuous monitoring ensures the accuracy and reliability of ML predictions over time. This is critical because models can drift or degrade in performance due to changes in the underlying data, shifting environments, or evolving target variables.

Top Ways To Monitor

Monitoring isn't one-size-fits-all: the right setup depends on your ML use case, business needs, and areas of concern.

✌️ Two Types of Drift

Use drift monitors to compare production against different baseline datasets.

  1. Feature drift captures changes to your data pipeline that can lead to anomalous model behavior.

  2. Prediction drift captures changes in the outputs of your model that may require stakeholders to be notified. This is also an excellent way to monitor performance without ground truth values.
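Drift scores like these are typically computed as a distance between the baseline and production distributions, for example with the Population Stability Index (PSI). The sketch below is illustrative (the binning choices and sample data are assumptions, not Arize's internal implementation):

```python
import numpy as np

def psi(baseline, production, bins=10):
    """Population Stability Index between two samples (a common drift score).
    Values near 0 mean the distributions match; larger values mean more drift."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_counts, _ = np.histogram(baseline, bins=edges)
    prod_counts, _ = np.histogram(production, bins=edges)
    # Convert counts to proportions, clipping zeros that would break the log
    base_pct = np.clip(base_counts / base_counts.sum(), 1e-6, None)
    prod_pct = np.clip(prod_counts / prod_counts.sum(), 1e-6, None)
    return float(np.sum((prod_pct - base_pct) * np.log(prod_pct / base_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.5, 0.1, 10_000)  # e.g. training-set prediction scores
shifted = rng.normal(0.6, 0.1, 10_000)   # production scores after drift
print(psi(baseline, shifted) > psi(baseline, baseline))  # True
```

A drift monitor alerts when this kind of score between the baseline dataset and the production window crosses its threshold.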

🚀 Performance

Monitor performance metrics based on ground truth data (actuals) for your model type, such as NDCG (ranking), AUC (propensity to click), MAPE (predicting ETAs), and more!
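As a concrete example, MAPE for an ETA model averages the absolute percentage error between predicted and actual arrival times. A minimal sketch with made-up numbers:

```python
def mape(actuals, predictions):
    """Mean Absolute Percentage Error: average of |actual - predicted| / |actual|, as a percent."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actuals, predictions)) / len(actuals) * 100

# Hypothetical ETAs in minutes: predictions vs. what actually happened
actuals = [10.0, 20.0, 30.0]
predictions = [11.0, 18.0, 30.0]
print(mape(actuals, predictions))  # ≈ 6.67 (percent)
```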

📌 Important Features

Monitor key features important to your model with data quality monitors. This can be a powerful tool for root cause analysis workflows.

🔍 Leading Indicators

If your model receives delayed ground truth, monitor your prediction drift score and feature drift as a proxy for model performance.

One-Click Monitoring

Start monitoring your models with Arize in one click! Pick from a wide array of different data quality, drift, and model performance metrics to monitor across all applicable features.

Enable monitors for 1 performance metric, 1 drift metric, and 1 data quality metric to get started!

Some monitor types, such as feature drift, will set up monitors across all of your features at once. This way, you can achieve 100% coverage of drift monitoring across all of your features in one click.

Automatic Thresholds

All monitors are configured with an automatic threshold by default. Auto thresholds allow you to enable a large number of monitors for wide coverage, without having to tune the threshold for each monitor.

Auto thresholds adjust with your metrics to accurately identify anomalous behavior based on a statistical analysis of your data over a 14-day window. Each day, a data point is collected; after 14 days, the mean and standard deviation of these data points are computed. The upper and lower thresholds are then set by adding and subtracting the standard deviation (scaled by a configurable multiplier) to and from the mean.
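The calculation can be sketched as follows (the daily metric values here are hypothetical; the platform computes this internally):

```python
import statistics

# Hypothetical daily accuracy values collected over the 14-day window
daily_values = [0.91, 0.90, 0.92, 0.89, 0.91, 0.93, 0.90,
                0.92, 0.91, 0.90, 0.89, 0.92, 0.91, 0.90]

mean = statistics.mean(daily_values)
std_dev = statistics.stdev(daily_values)
multiplier = 1.0  # adjustable, analogous to the stdDevMultiplier setting

# Metric values outside these bands are flagged as anomalous
upper_threshold = mean + multiplier * std_dev
lower_threshold = mean - multiplier * std_dev
print(lower_threshold, upper_threshold)
```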

If you have a precise area to monitor or specific configuration needs, edit and customize your monitor in the UI or programmatically.

  • Metrics: choose from a wide range of metrics such as F1, AUC, RMSE, and more

  • Filters: filter your monitor on prediction score, feature, actual class, etc.

  • Evaluation window: change the time window, from 1 hour to 30 days

  • Threshold value: automatic or custom; for automatic thresholds, edit the standard deviation multiplier applied to the calculated value

  • Alerts: change your integration or email alerts

Programmatic Monitors

Use our public-facing GraphQL API to bulk configure custom monitors within your own infrastructure.

Follow the instructions here to query, patch, and create monitors using our programmatic API.

mutation createPerformanceMonitorForState($state: String!) {
  createPerformanceMonitor(
    input: {
      modelId: "model_id"
      operator: lessThan
      performanceMetric: accuracy
      dynamicAutoThreshold: {
        stdDevMultiplier: 1.2
      }
      filters: [{
        dimensionType: featureLabel
        operator: equals
        name: "stateName"
        values: [$state]
      }]
      contacts: [{
        notificationChannelType: email
        emailAddress: "you@me.com"
      }]
    }
  ) {
    monitor {
      id
    }
  }
}
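To bulk-create monitors, a script can send this mutation once per value of the `$state` variable. A minimal sketch using only the Python standard library; the endpoint URL and `x-api-key` header name are assumptions for illustration, so confirm the real values in the API documentation:

```python
import json
import urllib.request

GRAPHQL_URL = "https://app.arize.com/graphql"  # assumed endpoint; confirm in the API docs
API_KEY = "your_api_key"                       # assumed auth scheme; confirm in the API docs

MUTATION = """
mutation createPerformanceMonitorForState($state: String!) {
  createPerformanceMonitor(
    input: {
      modelId: "model_id"
      operator: lessThan
      performanceMetric: accuracy
      dynamicAutoThreshold: { stdDevMultiplier: 1.2 }
      filters: [{ dimensionType: featureLabel, operator: equals, name: "stateName", values: [$state] }]
      contacts: [{ notificationChannelType: email, emailAddress: "you@me.com" }]
    }
  ) {
    monitor { id }
  }
}
"""

def create_monitor(state):
    """Send the mutation for one state and return the new monitor's id."""
    payload = json.dumps({"query": MUTATION, "variables": {"state": state}}).encode()
    req = urllib.request.Request(
        GRAPHQL_URL,
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["data"]["createPerformanceMonitor"]["monitor"]["id"]
```

Looping `create_monitor(state)` over a list of states would then configure one filtered performance monitor per state.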

Copyright © 2023 Arize AI, Inc