Binary Classification
How to log your model schema for binary classification models
Binary Classification Cases | Expected Fields | Performance Metrics |
---|---|---|
Case 1 | prediction label, actual label | Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity |
Case 2 | prediction score, prediction label, actual label | AUC, PR-AUC, Log Loss, Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity |
Case 3 | prediction label, actual label, prediction score, actual score | Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity, MAPE, MAE, RMSE, MSE, R-Squared, Mean Error, AUC, PR-AUC, Log Loss |
Case 4 | prediction score, actual label | AUC, PR-AUC, Log Loss |
Case 5 | prediction score, actual score | MAPE, MAE, RMSE, MSE, R-Squared, Mean Error |
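The examples below assume the Arize Python SDK is installed and an arize_client has been created. A minimal setup sketch, assuming the arize package and placeholder credentials; exact import paths and constructor arguments can differ across SDK versions:

```python
# pip install arize
from arize.pandas.logger import Client
from arize.utils.types import Environments, Metrics, ModelTypes, Schema

# Placeholder credentials; parameter names can differ across SDK versions
# (e.g., space_key vs. space_id), so check your installed version.
arize_client = Client(space_key="YOUR_SPACE_KEY", api_key="YOUR_API_KEY")
```

The single-record examples below reuse the same arize_client name; depending on SDK version, that client may instead be created from arize.api with the same credentials.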
Case 1: prediction label, actual label
Example Row
state | pos_approved | zip_code | age | prediction_label | actual_label | prediction_ts |
---|---|---|---|---|---|---|
ca | True | 12345 | 25 | not_fraud | fraud | 1618590882 |
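Note that the schema below also references a prediction_id column that is not shown in the example row. A minimal sketch of a dataframe matching this row (the prediction_id value and the use of time.time() are illustrative, not required values):

```python
import time

import pandas as pd

# One-row dataframe matching the example row above, plus the prediction_id
# column referenced by the schema. prediction_ts is a Unix timestamp in seconds.
example_dataframe = pd.DataFrame({
    "prediction_id": ["abc-123"],          # unique ID per prediction (illustrative)
    "prediction_ts": [int(time.time())],   # e.g. 1618590882
    "state": ["ca"],
    "pos_approved": [True],
    "zip_code": ["12345"],
    "age": [25],
    "prediction_label": ["not_fraud"],
    "actual_label": ["fraud"],
})
```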
schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_label_column_name="prediction_label",
actual_label_column_name="actual_label",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code", "age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.CLASSIFICATION],
environment=Environments.PRODUCTION,
dataframe=example_dataframe,
schema=schema
)
For more details, see the Python Batch API Reference.
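The batch log call returns an HTTP response. A quick sketch of checking it, assuming the requests-style status_code and text attributes exposed by the Pandas logger's response:

```python
# Continuing from the batch example above.
if response.status_code == 200:
    print("Successfully logged batch to Arize")
else:
    print(f"Logging failed with status {response.status_code}: {response.text}")
```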
features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label="not fraud",
actual_label="fraud"
)
For more information, see the Python Single Record Logging API Reference.
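Single-record logging is asynchronous. A sketch of resolving the call, assuming it returns a concurrent-futures style future wrapping an HTTP response (behavior may vary by SDK version):

```python
# Continuing from the single-record example above.
result = response.result()  # block until the request completes
if result.status_code == 200:
    print("Successfully logged record to Arize")
else:
    print(f"Logging failed with status {result.status_code}")
```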
Files can also be uploaded via the various Data Connectors; see the Data Connectors documentation.
Case 2: prediction score, prediction label, actual label
Example Row
state | pos_approved | zip_code | age | prediction_label | actual_label | prediction_score | prediction_ts |
---|---|---|---|---|---|---|---|
ca | True | 12345 | 25 | not_fraud | fraud | 0.3 | 1618590882 |
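In this case the prediction_label is typically derived from the prediction_score. A minimal sketch, assuming fraud is the positive class and a 0.5 decision threshold (both inferred from the example row, not stated requirements):

```python
def score_to_label(prediction_score: float, threshold: float = 0.5) -> str:
    """Map a fraud probability to a binary label (assumed positive class: fraud)."""
    return "fraud" if prediction_score >= threshold else "not_fraud"

print(score_to_label(0.3))  # "not_fraud", matching the example row above
```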
Code Example
schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_label_column_name="prediction_label",
prediction_score_column_name="prediction_score",
actual_label_column_name="actual_label",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code", "age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.CLASSIFICATION, Metrics.AUC_LOG_LOSS],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Batch API Reference.
Code Example
features = {
'state': 'ca',
'pos_approved': True,
'item_count': 10
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label=("not fraud", 0.3),
actual_label="fraud"
)
For more information, see the Python Single Record Logging API Reference.
Files can also be uploaded via the various Data Connectors; see the Data Connectors documentation.
Case 3: prediction label, actual label, prediction score, actual score
Example Row
state | pos_approved | zip_code | age | prediction_label | actual_label | prediction_score | actual_score | prediction_ts |
---|---|---|---|---|---|---|---|---|
ca | True | 12345 | 25 | not_fraud | fraud | 0.3 | 1 | 1618590882 |
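Here actual_score is the ground truth encoded numerically. A short sketch of deriving it from the actual label (again assuming fraud is the positive class, per the example row):

```python
def label_to_actual_score(actual_label: str) -> int:
    """Encode the ground-truth label as 0 or 1 (assumed positive class: fraud)."""
    return 1 if actual_label == "fraud" else 0

print(label_to_actual_score("fraud"))  # 1, matching the example row above
```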
schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_label_column_name="prediction_label",
prediction_score_column_name="prediction_score",
actual_label_column_name="actual_label",
actual_score_column_name="actual_score",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code","age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.CLASSIFICATION, Metrics.REGRESSION, Metrics.AUC_LOG_LOSS],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Pandas API Reference.
features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
prediction_label=("not fraud", 0.3),
features=features,
tags=tags,
actual_label=("fraud", 1)
)
For more information, see the Python Single Record Logging API Reference.
Files can also be uploaded via the various Data Connectors; see the Data Connectors documentation.
Case 4: prediction score, actual label
state | pos_approved | zip_code | age | actual_label | prediction_score | prediction_ts |
---|---|---|---|---|---|---|
ca | True | 12345 | 25 | fraud | 0.3 | 1618590882 |
schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_score_column_name="prediction_score",
actual_label_column_name="actual_label",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code","age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.AUC_LOG_LOSS],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Pandas API Reference.
To declare prediction_score only, pass a tuple of an empty string and your prediction_score through the prediction_label argument.
features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label=("", 0.3),
actual_label="fraud",
)
For more information, see the Python Single Record Logging API Reference.
Case 5: prediction score, actual score
state | pos_approved | zip_code | age | prediction_score | actual_score | prediction_ts |
---|---|---|---|---|---|---|
ca | True | 12345 | 25 | 0.3 | 1 | 1618590882 |
schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_score_column_name="prediction_score",
actual_score_column_name="actual_score",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code","age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.REGRESSION],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Pandas API Reference.
To declare prediction_score only, pass a tuple of an empty string and your prediction_score through the prediction_label argument.
To declare actual_score only, pass a tuple of an empty string and your actual_score through the actual_label argument.
features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label=("", 0.3),
actual_label=("", 1.0)
)
For more information, see the Python Single Record Logging API Reference.
For some use cases, it may be important to treat a prediction for which no corresponding actual label has been logged yet as having a default negative class actual label.
For example, consider tracking advertisement conversion rates for an ad clickthrough rate model, where the positive class is click and the negative class is no_click. For ad conversion purposes, a prediction without a corresponding actual label for an ad placement is equivalent to logging an explicit no_click actual label for that prediction. In both cases, the result is the same: the user has not converted by clicking on the ad.
For AUC-ROC, PR-AUC, and Log Loss performance metrics, Arize supports treating predictions without an explicit actual label as having the negative class actual label by default. In the example above, a click prediction without an actual would be treated as a false positive, because the missing actual would, by default, be assigned to the no_click negative class. This feature can be enabled for monitors and dashboards via the model performance config section of your model's config page.
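As a local illustration of what this default means (not how Arize implements it, and the dataframe below is hypothetical), the behavior is equivalent to filling missing actual labels with the negative class before computing classification counts:

```python
import pandas as pd

# Hypothetical predictions from a clickthrough rate model; two rows have no actual yet.
df = pd.DataFrame({
    "prediction_label": ["click", "click", "no_click"],
    "actual_label": ["click", None, None],
})

# Treat missing actuals as the negative class, mirroring the default described above.
df["actual_label"] = df["actual_label"].fillna("no_click")

# The click prediction with a filled-in no_click actual now counts as a false positive.
false_positives = ((df["prediction_label"] == "click") & (df["actual_label"] == "no_click")).sum()
print(false_positives)  # 1
```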
Prediction Label: The classification label of this event (Cardinality = 2)
Actual Label: The ground truth label (Cardinality = 2)
Prediction Score: The likelihood of the event (Probability between 0 and 1)
Actual Score: The ground truth score (0 or 1)
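A small sketch of validating these constraints before logging; the helper and the fraud/not_fraud label set are illustrative, not part of the Arize SDK:

```python
def validate_binary_record(prediction_label, actual_label, prediction_score, actual_score):
    """Illustrative checks mirroring the field definitions above."""
    assert prediction_label in {"fraud", "not_fraud"}, "prediction label must have cardinality 2"
    assert actual_label in {"fraud", "not_fraud"}, "actual label must have cardinality 2"
    assert 0.0 <= prediction_score <= 1.0, "prediction score must be a probability in [0, 1]"
    assert actual_score in (0, 1), "actual score must be 0 or 1"

validate_binary_record("not_fraud", "fraud", 0.3, 1)
```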