Configure Monitors
Learn about the configuration options you can customize for your monitors
Managed vs. Custom Monitors
Managed Monitors
Monitors configured by Arize with sensible default settings for your threshold and evaluation window. These are meant to be simple to enable and understand.
From the 'Monitors' tab, select 'Setup Monitors', and enable the applicable monitors based on relevant metrics for your use case.
Custom Monitors
Fully customizable monitors based on dimensions such as features, tags, evaluation windows, and thresholds.
Use custom monitors if you want to monitor a specific slice of your data or if you want to customize the evaluation windows without affecting other monitors.
From the 'Monitors' tab, select 'Setup Monitors', and click 'Create Custom Monitor' to get started.
Evaluation vs. Delay Windows
Evaluation Window
An evaluation window defines the period of time your metric is calculated on (e.g., the previous 24 hours).
Increase this window to smooth out spiky or seasonal data. Decrease this for your monitors to react faster to sudden changes.
Default: Last 72 hours
Delay Window
A delay window is the gap between the evaluation time and the window of data used for the evaluation. It tells Arize how long to delay an evaluation.
Change this if you have delayed actuals or predictions, so you evaluate your model on the most up-to-date data.
Default: no delay
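Taken together, the two windows determine exactly which slice of data a monitor evaluates. Here is a minimal sketch of that arithmetic in Python (the variable names are illustrative, not part of any Arize API):

```python
from datetime import datetime, timedelta, timezone

# Illustrative values: a 72-hour evaluation window (the default)
# and a 24-hour delay to wait for late-arriving actuals.
evaluation_window = timedelta(hours=72)
delay_window = timedelta(hours=24)

now = datetime.now(timezone.utc)

# The delay window shifts the evaluated range back from "now";
# the evaluation window sets how much data the metric is computed on.
window_end = now - delay_window
window_start = window_end - evaluation_window

print(f"Metric computed on data from {window_start} to {window_end}")
```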
For managed monitors: edit the monitor configurations from the 'Monitors' tab in the 'Managed Performance Monitors' card. These settings apply to all managed monitors of the same type.
For custom monitors: within monitor settings, configure the evaluation window within "Step 2: Define the Data."
Automatic vs. Custom Alerting Thresholds
Arize monitors trigger an alert when your metric crosses a threshold. Thresholds drive notifications, so you can adjust your threshold to be more or less noisy depending on your needs.
Automatic Threshold
Automatic thresholds set a dynamic value for each data point. Auto thresholds work best when there are at least 14 days of production data to determine a trend.
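Arize does not publish the exact algorithm here, but a common way to build intuition for a dynamic threshold is a band around the trailing mean of the metric. The sketch below is a simplified stand-in for that idea, not Arize's actual method:

```python
import statistics

def auto_threshold(metric_history: list[float],
                   num_stddevs: float = 2.0) -> tuple[float, float]:
    """Simplified illustration of a dynamic threshold: a band around the
    trailing mean. NOT Arize's actual algorithm; it just shows why roughly
    14 days of history are needed to establish a stable trend."""
    mean = statistics.mean(metric_history)
    std = statistics.stdev(metric_history)
    return mean - num_stddevs * std, mean + num_stddevs * std

# e.g., 14 daily accuracy readings
history = [0.91, 0.92, 0.90, 0.93, 0.91, 0.92, 0.90,
           0.91, 0.93, 0.92, 0.91, 0.90, 0.92, 0.91]
lower, upper = auto_threshold(history)
print(f"alert if metric falls outside [{lower:.3f}, {upper:.3f}]")
```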
Custom Threshold
Set the threshold to any value for additional flexibility. The threshold in the monitor preview will update as you change this value, so you can backtest the threshold value against your metric history.
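To see why backtesting is useful, you can replay a candidate threshold against your metric history and count how often it would have alerted. A hedged sketch with hypothetical data (not an Arize API):

```python
def backtest_threshold(metric_history: list[float], threshold: float,
                       alert_below: bool = True) -> int:
    """Count how many historical points would have triggered an alert
    at the candidate threshold (mirrors the monitor-preview backtest)."""
    if alert_below:
        return sum(1 for value in metric_history if value < threshold)
    return sum(1 for value in metric_history if value > threshold)

history = [0.91, 0.88, 0.92, 0.85, 0.90, 0.93, 0.89]
print(backtest_threshold(history, threshold=0.90))  # -> 3 alerts
```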
To change the tolerance of an existing automatic threshold, go to the 'Monitors' tab, select 'Config', and edit all of your managed monitor auto thresholds in bulk in the 'Managed Performance Monitors' card.
Note: this will override any individual managed monitor auto threshold config, but will not change any manual thresholds configured for monitors.
Production Model Baseline vs. Pre-Production Model Baseline (For Drift Monitors)
A model baseline is the reference dataset your current data is compared against to identify model changes, enable analysis, and find the root cause of performance degradation. A model baseline can come from any environment or time period. Arize automatically configures all new models with a default baseline, but you can pick a new model baseline to use across all monitors, or set a custom baseline per monitor.
Baseline with Production Data
Production baselines are helpful if your training or validation data is unavailable or unreliable as a reference point:
Moving Production Baseline: A dynamic baseline window relative to the current time. It enables detection of abrupt changes in feature and model distributions, avoids overlap with drift evaluation windows, and prevents issues associated with fixed, outdated baselines.
Fixed Production Baseline: A fixed time period in production. This is useful if you want a fixed reference point but don't have or want to use pre-production data.
Baseline with Pre-Production Data
Pre-production baselines use training or validation datasets. This can be useful for models where production data is expected to be similar to the pre-production data.
Automatic Baseline Update: Uses the most recently uploaded training or validation dataset as the baseline. The newest dataset is determined by upload time alone, ignoring the date range, version, or batch (see the sketch after this list). Ideal for teams that retrain frequently.
Choose My Own Dataset: Best for teams that train infrequently, or that want the manual step of choosing the baseline when a new pre-production dataset is uploaded. With this option, the baseline will not change until a user goes to the model config and changes it manually.
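As a quick illustration of the 'Automatic Baseline Update' selection rule described above, the most recent upload wins regardless of the data it covers. The record fields below are hypothetical:

```python
from datetime import datetime

# Hypothetical dataset records; only uploaded_at drives the selection.
datasets = [
    {"name": "training_v2", "uploaded_at": datetime(2024, 3, 1), "version": "2.0"},
    {"name": "validation_v1", "uploaded_at": datetime(2024, 4, 15), "version": "1.0"},
]

# Most recent upload wins, even if it covers an older date range or version.
baseline = max(datasets, key=lambda d: d["uploaded_at"])
print(baseline["name"])  # -> validation_v1
```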
To edit the model baseline: navigate to the 'Dataset' or 'Config' tab and click the 'Configure Baseline' button, where you will be prompted to pick your baseline.
To edit an individual monitor's baseline (for example, to identify changes at the feature level): select 'Custom Baseline' in "Step 2: Define the Data" of the managed or custom monitor edit form.
Custom baselines can apply filters to new, changing, or problematic features to see how different features, tags, and other dimensions affect your model's distribution. For example:
A different fixed time range: when you expect large changes for a specific feature
A moving time range: to identify fluctuating changes for a feature over time
Different versions: to compare distribution changes between your current model and previous versions
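Whichever baseline you choose, a drift monitor ultimately compares the current window's distribution against that baseline using a drift metric such as PSI (Population Stability Index). A minimal sketch, with illustrative binning and smoothing choices:

```python
import numpy as np

def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between baseline and current samples.
    Bin edges come from the baseline; a small epsilon avoids log(0)."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    c_frac = np.histogram(current, bins=edges)[0] / len(current)
    eps = 1e-6
    b_frac, c_frac = b_frac + eps, c_frac + eps
    return float(np.sum((c_frac - b_frac) * np.log(c_frac / b_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)  # e.g., training distribution
current = rng.normal(0.3, 1.0, 10_000)   # shifted production window
print(f"PSI = {psi(baseline, current):.3f}")  # larger values = more drift
```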
Set Notifications
Your monitor status provides an indication of your model health. Your monitor will be in one of three states (see the sketch after this list):
Healthy: Sit back and relax! No action is needed
No Data: When the monitor does not have recent data in the evaluation window
Triggered: When your monitor crosses the threshold value, indicating a model issue
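A minimal sketch of this status logic (the names are illustrative, not an Arize API):

```python
from enum import Enum

class MonitorStatus(Enum):
    HEALTHY = "healthy"
    NO_DATA = "no_data"
    TRIGGERED = "triggered"

def monitor_status(metric_value: float | None, threshold: float,
                   alert_below: bool = True) -> MonitorStatus:
    """metric_value is None when the evaluation window contains no data."""
    if metric_value is None:
        return MonitorStatus.NO_DATA
    crossed = metric_value < threshold if alert_below else metric_value > threshold
    return MonitorStatus.TRIGGERED if crossed else MonitorStatus.HEALTHY

print(monitor_status(0.82, threshold=0.90))  # -> MonitorStatus.TRIGGERED
```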
When a monitor is triggered, you are notified that your model has deviated from your threshold. You can send notifications via e-mail, PagerDuty, OpsGenie, or Slack. Learn more about notifications and integrations here.
All managed monitors are set with the default configuration of 'No Contacts Selected'.
To edit notification settings in bulk, navigate to the 'Monitors' tab, select 'Config', and update in the 'Notifications' section.