Arize AI

3. Set a Model Baseline

A baseline is the reference dataset or benchmark used to compare model performance for monitoring purposes. Baselines can be training, validation, or past production data.
The Arize platform automatically detects drift, data quality issues, and anomalous performance degradations with highly configurable monitors based on both common KPIs and custom metrics. However, to compare changes, perform analysis, and root-cause performance degradations, your model needs a baseline: a reference dataset drawn from training data, validation data, or a prior time period in production.

Model Baseline

Your model's monitors are preset with a baseline defined by a moving 2-week window of your model's production data, delayed by 3 days.
For more information on model environments, refer to Model Environments.
We selected this preset baseline based on user feedback. Feel free to reach out with suggestions in our community Slack!
Learn how to configure a baseline tailored to your specific monitoring needs here.
Preset Model Baseline
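To make the preset concrete, the moving window can be thought of as a pair of dates that shifts forward each day. The sketch below is our own illustration of that arithmetic, not part of the Arize SDK; the function name and defaults are hypothetical.

```python
from datetime import date, timedelta

def preset_baseline_window(today: date, delay_days: int = 3, window_days: int = 14):
    """Illustrative only: the date range covered by a moving baseline
    window (2 weeks of production data, delayed by 3 days)."""
    end = today - timedelta(days=delay_days)
    start = end - timedelta(days=window_days)
    return start, end

start, end = preset_baseline_window(date(2023, 6, 20))
# window covers 2023-06-03 through 2023-06-17
```

Because the window is defined relative to "today", it moves forward automatically as new production data arrives.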

Configure A Model Baseline

To change your model's baseline to other datasets or time periods, navigate to either:
  • The 'Config' tab on the right side of the screen
  • The 'Datasets' tab in the top navigation bar
From either tab, click on the 'Configure Baseline' button where you will be prompted to pick your baseline from production data or pre-production datasets.
Configure Your Model Baseline
Production Baselines: Select parts of your production data using fixed or moving time ranges.
Configure Your Model Baseline Using Production Data
Pre-production Baselines: Choose from different versions of training or validation datasets.

Choosing a Baseline

The right baseline depends on your team's specific use case. For additional questions, email us at [email protected] or Slack us in the #arize-support channel.
Here are some considerations that may go into choosing a different baseline:
  • If you are not expecting changes between training and production: use the training dataset as your baseline
  • If you are expecting large changes between training and production (e.g. upsampling fraud cases in an imbalanced training set):
    1. Use a specific/fixed time range from your production dataset, such as the initial model launch period where you're actively monitoring performance
    2. Use a validation dataset, which allows monitoring to start from day 0
  • If you are monitoring for highly fluctuating changes on a regular time interval (e.g. click-through rate): set a rolling window on your production dataset (e.g. every week, every two weeks)
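Whichever baseline you choose, a drift monitor quantifies how far current data has moved away from it. As a rough illustration (a common drift metric, not Arize's implementation), here is a minimal Population Stability Index (PSI) computation between a baseline and a current sample of a binned feature:

```python
import math
from collections import Counter

def psi(baseline, current, bins):
    """Population Stability Index between two samples of a binned
    feature. Higher PSI means a larger shift from the baseline."""
    eps = 1e-6  # guard against log(0) for empty bins
    b_counts, c_counts = Counter(baseline), Counter(current)
    total = 0.0
    for b in bins:
        p = b_counts[b] / len(baseline) or eps  # baseline proportion
        q = c_counts[b] / len(current) or eps   # current proportion
        total += (q - p) * math.log(q / p)
    return total

baseline = ["low"] * 70 + ["high"] * 30
current = ["low"] * 50 + ["high"] * 50
drift = psi(baseline, current, bins=["low", "high"])  # ≈ 0.169
```

Identical distributions give a PSI of 0; the more the current proportions diverge from the baseline's, the larger the score, which is why the choice of baseline directly shapes what a monitor flags as drift.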

Custom Drift Monitor Baselines

Separate from your model baseline, you can customize individual drift monitor baselines to identify changes on a feature level based on a few different settings:
  • A different fixed time range: you expect large changes for a specific feature
  • A moving time range: identify fluctuating changes for a feature over time
  • Different versions: compare the distribution of your current model version with previous versions
  • Specific filters: in the case of new, changing, or problematic features, see how different features/tags/etc affect your model's distribution
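As a rough illustration of the filtered-baseline idea above, the sketch below restricts reference records to one tag value before computing a feature's distribution; the record fields and helper function are hypothetical, not an Arize API:

```python
def filtered_distribution(records, feature, tag_key, tag_value):
    """Restrict reference records to one slice (e.g. a single tag value),
    then compute the feature's relative frequency distribution."""
    slice_ = [r for r in records if r.get(tag_key) == tag_value]
    counts = {}
    for r in slice_:
        counts[r[feature]] = counts.get(r[feature], 0) + 1
    total = len(slice_)
    return {k: v / total for k, v in counts.items()}

records = [
    {"region": "us", "score_bucket": "low"},
    {"region": "us", "score_bucket": "high"},
    {"region": "eu", "score_bucket": "low"},
]
dist = filtered_distribution(records, "score_bucket", "region", "us")
# {"low": 0.5, "high": 0.5}
```

Comparing current data against this sliced distribution, rather than the full baseline, is what lets a filtered monitor isolate drift in one segment of traffic.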

Edit Monitor Baseline

In your drift monitor of interest, click 'Edit Monitor' from the top right, scroll down to the 'Baseline Distribution' tab, and click 'Use Custom Baseline' to edit advanced settings or change your baseline time window.
Drift Monitor With Custom Moving Baseline
Drift Monitor With A Custom Filtered Baseline