Model Schema Mapping
What performance metrics are available for common model types, including how to map them into Arize's data model.

Model Types

Arize supports several model types. The model type determines both the data ingestion format and the performance metrics that can be visualized in the platform.

Performance Metrics

Arize supports the following performance metrics:

| Metric Family | Metrics |
| --- | --- |
| Categorical Metrics | Accuracy, Recall, Precision, False Positive Rate, False Negative Rate, F1 Score, Sensitivity, Specificity |
| Numeric Metrics | MAPE, MAE, RMSE, MSE, R^2, Mean Error |
| Ranking/Recommendation Metrics | NDCG, Precision @ K, MAP @ K, MRR |
For more information on how these metrics are defined and calculated, refer to Model Metric Definitions.
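For a concrete sense of the Categorical Family, the core classification metrics can be computed directly from paired prediction and actual labels. The sketch below is illustrative only (Arize computes these in-platform from ingested data); the function name and the choice of positive class are assumptions:

```python
# Minimal sketch: computing a few categorical metrics from paired
# prediction/actual labels for a binary model (positive class = "fraud").
# Illustration only; not part of the Arize SDK.

def categorical_metrics(predictions, actuals, positive="fraud"):
    pairs = list(zip(predictions, actuals))
    tp = sum(p == positive and a == positive for p, a in pairs)  # true positives
    fp = sum(p == positive and a != positive for p, a in pairs)  # false positives
    fn = sum(p != positive and a == positive for p, a in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f1 = categorical_metrics(
    ["fraud", "fraud", "not_fraud", "fraud"],
    ["fraud", "not_fraud", "not_fraud", "fraud"],
)
```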

Classification (classes only, no probability score)

For this type of model, a category (or multiple) is predicted, and no probability is associated with the label. For example, predictions can be: is fraud, will buy/click, will churn.
Since these models only output a category as the prediction, the available performance metrics fall in the Categorical Family.
For examples of how to map into Arize, see Classification and Multi-label Classification.
| Input Field | Binary | Multiclass Classification | Multi-label Classification |
| --- | --- | --- | --- |
| Prediction Label (sent as Arize variable: prediction_label) | String/Boolean | String | List of Strings (explode the prediction into one prediction per label) |
| Prediction Score | - | - | - |
| Actual Label (sent as Arize variable: actual_label) | String/Boolean | String | List of Strings |
| Actual Score | - | - | - |
| Additional Columns | - | - | Tags: each label in a multi-label model is an independent prediction, and a Tag can be passed in with each, e.g. tag_label = "My Label" |
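As the multi-label column above describes, a list-valued prediction is "exploded" into one record per label before ingestion. A minimal sketch of that reshaping in plain Python (the record field names are illustrative, not the Arize SDK; the negative-label convention is an assumption):

```python
# Minimal sketch: exploding a multi-label prediction into one record per
# label, so each label becomes an independent prediction that can carry
# its own Tag. Field names are illustrative, not part of the Arize SDK.

def explode_multilabel(prediction_id, predicted_labels, actual_labels):
    records = []
    for label in predicted_labels:
        records.append({
            "prediction_id": f"{prediction_id}_{label}",
            "prediction_label": label,
            # assumed convention: mark absent ground-truth labels as "not_<label>"
            "actual_label": label if label in actual_labels else "not_" + label,
            "tag_label": label,  # each exploded row carries its label as a Tag
        })
    return records

rows = explode_multilabel("pred_1", ["sports", "news"], ["sports"])
```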

Classification with a Score

For this type of model, a category (or multiple) is predicted, and a probability is associated with each label. For example: Fraud: 0.0876, or Click: 0.23.
Since these models output both a category and a score, both Categorical Family and Numeric Family performance metrics are available.
For examples of how to map into Arize, see Classification and Multi-label Classification.
| Input Field | Binary | Multiclass Classification | Multi-label Classification |
| --- | --- | --- | --- |
| Prediction Label (Arize variable: prediction_label) | - (sometimes a String/Boolean is present) | String | List of Strings (explode the prediction into N predictions, where N is the number of labels) |
| Prediction Score (Arize variable: prediction_score) | Float | Float | List of Floats (explode the scores to match the exploded predictions) |
| Actual Label (Arize variable: actual_label) | String/Boolean | String | List of Strings |
| Actual Score (Arize variable: actual_score) | 0 or 1 | 0 or 1 | 0 or 1 for each label |
| Additional Columns | - | - | Tags: each label in a multi-label model is an independent prediction, and a Tag can be passed in with each, e.g. tag_label = "My Label" |
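For the scored variant, the table above pairs each exploded label with its float prediction_score, while actual_score is the 0/1 relevance of that label. A hedged sketch of that pairing (field names illustrative, not the Arize SDK):

```python
# Minimal sketch: per-label records for a multi-label model with scores.
# prediction_score is exploded to match the exploded labels; actual_score
# is 1 when the label appears in the ground truth, else 0.
# Field names are illustrative, not part of the Arize SDK.

def explode_with_scores(predicted_labels, scores, actual_labels):
    records = []
    for label, score in zip(predicted_labels, scores):
        records.append({
            "prediction_label": label,
            "prediction_score": score,
            "actual_score": 1 if label in actual_labels else 0,
        })
    return records

rows = explode_with_scores(["fraud", "spam"], [0.0876, 0.23], {"fraud"})
```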

Regression

This type of model predicts a float. Examples include click-through rates, sales forecasting, customer lifetime value, ETA models, etc.
Since these models only output a number as the prediction, the available performance metrics fall in the Numeric Family.
| Input Field | Regression | Timeseries |
| --- | --- | --- |
| Prediction Label | - | - |
| Prediction Score (Arize variable: prediction_score) | Float | Float |
| Actual Label | - | - |
| Actual Score (Arize variable: actual_score) | Float | Float |
| Additional Columns | - | Tags are leveraged to capture: forecast lag (the unit of time between the prediction time and the time the prediction was for) and run date (the date the prediction was run on) |

For the timestamp, use the forecast date (the date the prediction is for).
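A minimal sketch of deriving the timeseries metadata described above, with the forecast lag measured in days. The record field names (and the tag_ prefix) are illustrative assumptions, not the Arize SDK:

```python
# Minimal sketch: building one timeseries record with the Tags described
# above. forecast_lag = days between the run date and the forecast date;
# the record timestamp uses the forecast date (when the prediction is for).
# Field names are illustrative, not part of the Arize SDK.
from datetime import date

def timeseries_record(run_date: date, forecast_date: date, prediction: float):
    return {
        "prediction_timestamp": forecast_date.isoformat(),  # forecast date, not run date
        "prediction_score": prediction,
        "tag_forecast_lag": (forecast_date - run_date).days,
        "tag_run_date": run_date.isoformat(),
    }

rec = timeseries_record(date(2024, 1, 1), date(2024, 1, 8), 42.5)
```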

Ranking/Recommendation

This common type of model predicts the order (i.e., rank) of a set of "candidates" based on how relevant each candidate is for a search query. Examples include music playlists, which products to advertise, which items to include in a subscription box, which hotels to promote, etc.
Since these models output a score for each candidate in a bag of candidates, in addition to the supported Ranking/Recommendation Metric Family, both Categorical Family and Numeric Family performance metrics are also available.
For examples of how to map into Arize, see Ranking.
| Input Field | Ranking/Recommendation |
| --- | --- |
| Prediction Label (Arize variable: prediction_label) | - |
| Prediction Score (Arize variable: prediction_score) | List of Ranks (numeric) |
| Actual Label (Arize variable: actual_label) | List of Actions/Categories (e.g. buy, click, favorite, completed song); optional field that unlocks categorical performance metrics |
| Actual Score (Arize variable: actual_score) | List of Relevancies (numeric) |
| Group or Session ID (Arize variable: prediction_group_id) | The group that all candidates in an evaluation belong to, often referred to as a session ID or query ID |
| Candidate Bag | The list of candidates being ranked; can be sent as either features or tags |
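To illustrate the Ranking/Recommendation Metric Family, NDCG for one group can be computed from the "List of Relevancies" above, ordered by predicted rank. This is an illustrative standard NDCG implementation, not Arize's in-platform computation:

```python
# Minimal sketch: NDCG for a single ranking group. `relevancies` holds the
# numeric relevance of each candidate, ordered by predicted rank.
# Illustration only; Arize computes ranking metrics in-platform.
import math

def ndcg(relevancies, k=None):
    rels = relevancies[:k] if k else relevancies
    # discounted cumulative gain of the predicted ordering
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    # ideal DCG: the same relevancies in best-possible order
    ideal = sorted(relevancies, reverse=True)[:len(rels)]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg else 0.0

score = ndcg([3, 2, 0, 1], k=4)
```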