Performance Monitors

Performance monitoring reference guide

When To Monitor Performance

Performance metrics quantify how effectively a model's predictions match reality. Monitor performance metrics when deploying a model in production to flag unexpected changes or drops in performance.
These metrics are also used during the model validation phase, offering insights that guide model improvement and fine-tuning toward optimal predictive performance.

🏃 Common Questions:

🌊 How do I monitor performance without ground truth data?
Get a sense of model performance without ground truth data by monitoring feature drift and prediction drift.
🔔 My performance alerts are too noisy/not noisy enough
Adjust your threshold value above or below the default standard deviation value to tune how often your alerts fire.
🪟 How do I monitor with delayed ground truth data?
Delay a performance evaluation via a delay window. Change this if you have delayed actuals, so you evaluate your model on the most up-to-date data.
🏗️ What if my performance metric is specific to my team?
Create any performance metric to suit your monitoring needs via Custom Metrics. Monitor, troubleshoot, and use custom metrics in dashboards.
📈 My monitors are overly sensitive or not sensitive enough
Increase your evaluation window to smooth out spikes or seasonality. Decrease your evaluation window to react faster to potential incidents.
🤖 Can I create performance monitors programmatically?
Use the GraphQL API to programmatically create performance monitors.
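As a minimal sketch, a monitor-creation mutation can be sent with any GraphQL client. The endpoint, auth header, mutation name, and fields below are illustrative assumptions, not the exact schema; consult the GraphQL API documentation for the real signature.

```python
# Illustrative sketch only: the mutation name and fields are assumptions,
# not the exact Arize GraphQL schema -- check the API docs for the real one.
import requests

API_URL = "https://app.arize.com/graphql"  # assumed endpoint
HEADERS = {"x-api-key": "YOUR_API_KEY"}    # assumed auth header

CREATE_MONITOR = """
mutation {
  createPerformanceMonitor(input: {
    modelId: "YOUR_MODEL_ID",
    name: "Accuracy below threshold",
    performanceMetric: accuracy,
    operator: lessThan,
    threshold: 0.9
  }) {
    monitor { id }
  }
}
"""

resp = requests.post(API_URL, json={"query": CREATE_MONITOR}, headers=HEADERS)
print(resp.json())
```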

Performance Metrics By Use Case

Metrics are batched into Metric Groups that align with model types and their variants.
  • Classification: Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity
  • Regression: MAPE, MAE, RMSE, MSE, R-Squared, Mean Error
  • Ranking: NDCG@k, AUC@k
  • Ranking Labels: MAP@k, MRR
  • AUC/LogLoss: AUC, PR-AUC, Log Loss
  • Computer Vision / Object Detection: Accuracy, MAP, IoU
  • Custom Metrics: Not seeing what you're looking for? Create a metric yourself!
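For intuition, the sketch below computes a few of these metrics with scikit-learn's standard implementations on toy data (Arize computes them for you from your logged predictions and actuals):

```python
# Standard metric definitions via scikit-learn, on toy data.
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,  # classification
    mean_absolute_error, mean_squared_error, r2_score,        # regression
    ndcg_score,                                               # ranking
)

# Classification: predicted labels vs. ground-truth actuals
y_true, y_pred = [1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 0, 1]
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))  # a.k.a. sensitivity
print("F1       :", f1_score(y_true, y_pred))

# Regression: numeric predictions vs. actuals
y_true_r, y_pred_r = [3.0, 5.0, 2.5, 7.0], [2.8, 5.4, 2.9, 6.5]
print("MAE :", mean_absolute_error(y_true_r, y_pred_r))
print("RMSE:", mean_squared_error(y_true_r, y_pred_r) ** 0.5)
print("R^2 :", r2_score(y_true_r, y_pred_r))

# Ranking: graded relevance vs. model scores for a single query
relevance, scores = [[3, 2, 3, 0, 1]], [[0.9, 0.8, 0.1, 0.2, 0.7]]
print("NDCG@3:", ndcg_score(relevance, scores, k=3))
```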

How To Monitor Performance

Easily enable, customize, and set alerts on model performance monitors to ensure your production models stay in tip-top shape.

Step 1: Enable Performance Monitors

Monitor how your model performs in production based on metrics applicable to your model use case.
You can enable managed performance monitors automatically and tailor them to your needs or fully customize your monitors.
Managed Monitors: Monitors configured by Arize with default settings for your threshold and evaluation window. These are meant to be simple to enable and understand, with sensible defaults.
Custom Monitors: Fully customizable monitors based on various dimensions such as features, tags, evaluation windows, thresholds, etc.

Using Managed Monitors

Use managed monitors if this is your first time monitoring your model, you want to try a new metric, or you want to simplify your setup workflow!
From the 'Setup Monitors' tab, enable the applicable performance monitors based on relevant metrics for your use case.
Enabled monitors will appear on the monitors listing page.

Using Custom Monitors

Use custom monitors if you want to monitor a specific slice of your data or if you want to customize the evaluation windows without affecting other monitors.
From the 'Setup Monitors' tab, click 'Create Custom Monitor' to get started.
From there, select the performance metric to monitor.

Step 2: Configure Evaluation Window

An evaluation window defines the period of time your metric is calculated on (e.g. the previous 24 hours). Increase this window to smooth out spiky or seasonal data; decrease it so your monitors react faster to sudden changes.
A delay window is the gap between the evaluation time and the window of data used for the evaluation; it tells Arize how long to delay an evaluation. Change this if you have delayed actuals or predictions, so you evaluate your model on data whose actuals have arrived.
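As a concrete illustration of the arithmetic (a sketch, not Arize's internal implementation): with a 72-hour evaluation window and a 24-hour delay window, the metric is computed over data from 96 hours ago to 24 hours ago.

```python
# Sketch of the window arithmetic (illustrative, not Arize internals).
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)
evaluation_window = timedelta(hours=72)  # how much data the metric uses
delay_window = timedelta(hours=24)       # how long to wait for late actuals

window_end = now - delay_window                # 24 hours ago
window_start = window_end - evaluation_window  # 96 hours ago
print(f"Metric computed over data from {window_start} to {window_end}")
```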
Enabling managed monitors creates a monitor for every applicable feature of a given metric, with preset basic configurations. Edit your monitor's details based on the metric and feature monitor you want to change. These settings apply to all managed monitors of the same type.

Managed Monitors Default Configurations:

  • Evaluation Window: Last 72 hours
  • Delay Window: No delay
From the 'Monitors' tab, edit the monitor configurations in the 'Managed Performance Monitors' card.
For custom monitors, define the various settings that go into calculating and monitoring your metric. Within the monitor settings, configure the evaluation window in Step 2: Define the Data.

Custom Monitor Dimensions

  • Evaluation window (default: last 72 hours): Increase this to smooth out spikes or seasonality; decrease it to react faster to potential incidents.
  • Evaluation delay (default: 0 seconds): The gap between the evaluation time and the window of data used for the evaluation. Use this if your predictions or actuals have an ingestion lag.
  • Model version: Filter your metric to only use certain model versions. Defaults to include all model versions.
  • Filters: Filter using a variety of operators on any dimension in your model: predictions, actuals, features, or tags.

Step 3: Calibrate Performance Threshold

Arize monitors trigger an alert when your monitor crosses a threshold. You can use our dynamic automatic threshold or create a custom threshold. Thresholds trigger notifications, so you can adjust your threshold to be more or less noisy depending on your needs.
Automatic Threshold: Sets a dynamic value for each data point. Auto thresholds work best when there are at least 14 days of production data to determine a trend.
Custom Threshold: Set the threshold to any value for additional flexibility. The threshold in the monitor preview will update as you change this value, so you can backtest the threshold value against your metric history.
Learn more here about how an auto threshold value is calculated.
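Conceptually, an automatic threshold tracks a moving band around the metric's recent trend. The sketch below shows one common formulation, a rolling mean minus k standard deviations; it is an illustration only, not Arize's actual calculation (see the link above for that).

```python
# Illustrative dynamic threshold: rolling mean minus k standard deviations.
# This is NOT Arize's exact algorithm -- see the linked docs for that.
import pandas as pd

# Daily accuracy values (toy data), oldest first
accuracy = pd.Series([0.91, 0.92, 0.90, 0.93, 0.91, 0.89, 0.92,
                      0.90, 0.91, 0.93, 0.92, 0.90, 0.88, 0.84])

k = 2  # standard deviations; lower k -> tighter band -> more sensitive alerts
mean = accuracy.rolling(7).mean().shift(1)  # stats from the prior 7 days only
std = accuracy.rolling(7).std().shift(1)
lower_bound = mean - k * std

triggered = accuracy < lower_bound  # alert when the metric dips below the band
print(accuracy[triggered])          # prints the readings that crossed the band
```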
Managed monitors are created for all applicable features of a given metric with an automatic threshold. If you've had performance issues in the past, we suggest you review the threshold to make sure it is relevant to your needs.

How To Edit Managed Monitor's Threshold In Bulk

To edit all of your managed monitors' auto thresholds in bulk, change the tolerance of an existing automatic threshold by adjusting the number of standard deviations used in the calculation in the 'Managed Performance Monitors' card.
Note: this will override any individual managed monitor auto threshold config, but will not change any manual thresholds configured for monitors.

How To Edit Managed Monitor's Threshold Per Monitor

To update an individual monitor, change the tolerance of its automatic threshold by adjusting the number of standard deviations used in the calculation in the 'Monitor Settings' card, or create a new custom threshold.
From the 'Monitors' page, click on the 'Monitors Listing' to select the monitor you want to edit.
Define the threshold value that will trigger an alert within Step 3: Define the Alerting.
This section allows you to:
  • Set a specific (custom) threshold if you already know the precise threshold value to use
  • Automatically create a dynamic threshold. You can edit your auto threshold's sensitivity by changing the number of standard deviations: lowering it increases the sensitivity, and raising it decreases the sensitivity.

Step 4: Set Notifications

Your Monitor Status provides an indication of your model health. Your monitor will be in one of three states:
  • Healthy: Sit back and relax! No action is needed.
  • No Data: The monitor has no recent data in the evaluation window.
  • Triggered: Your monitor crossed the threshold value, indicating a model issue.
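Roughly, the status logic can be summarized as follows (an illustrative sketch, not Arize's implementation):

```python
# Illustrative status logic -- not Arize's actual implementation.
from typing import Optional

def monitor_status(metric_value: Optional[float], threshold: float,
                   lower_is_worse: bool = True) -> str:
    if metric_value is None:  # no data in the evaluation window
        return "No Data"
    crossed = (metric_value < threshold) if lower_is_worse \
        else (metric_value > threshold)
    return "Triggered" if crossed else "Healthy"

print(monitor_status(0.84, threshold=0.90))  # Triggered
print(monitor_status(None, threshold=0.90))  # No Data
```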
When a monitor is triggered, you are notified that your model has deviated from your threshold. Notifications can be sent via email, PagerDuty, OpsGenie, or Slack. Learn more about notifications and integrations here.
All managed monitors will be set with the default configuration of 'No Contacts Selected'. To get the most out of Arize, set notifications so you are automatically notified when your monitor is triggered. You can edit notifications in bulk or per monitor for enhanced customizability.

How To Set Managed Monitors Notifications In Bulk

Configure performance monitor notifications for all managed monitors for an easy way to fully set up monitors in Arize.

How To Edit Managed Monitor's Notifications Per Monitor

Set notifications per monitor to limit notifications, change alerting providers or add individual emails to the alert. Within each monitor, you can add a note and edit the monitor name to better suit naming conventions you may already have.
From the 'Monitors' page, click on the 'Monitors Listing' to select the monitor you want to edit.
Define where your alerts are sent within Step 4: Define the Notification.
  • Monitor Name: Identifies the monitor and will be used in the notification.
  • Send Notifications To: Choose your notification contacts. You can select multiple contacts to receive notifications. Learn more here.
  • Notes: Add notes to your monitor to help the alert recipient understand the monitor and quickly debug any issues.