
Binary Classification

How to log your model schema for binary classification models

Binary Classification Cases

Case | Expected Fields | Performance Metrics
Case #1 | prediction label, actual label | Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity
Case #2 | prediction score, prediction label, actual label | AUC, PR-AUC, Log Loss, Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity
Case #3 | prediction label, actual label, prediction score, actual score | Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity, MAPE, MAE, RMSE, MSE, R-Squared, Mean Error, AUC, PR-AUC, Log Loss
Case #4 | prediction score, actual label | AUC, PR-AUC, Log Loss
Case #5 | prediction score, actual score | MAPE, MAE, RMSE, MSE, R-Squared, Mean Error
For all valid model type and metric combinations, see the model types and metrics reference.
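The code examples below assume the Arize Python SDK is installed and an arize_client has already been created. A minimal setup sketch is shown here; the key values are placeholders, and the exact import paths and client arguments can vary slightly between SDK versions:

from arize.utils.types import Environments, Metrics, ModelTypes, Schema

# Client used by the Python Pandas batch examples
from arize.pandas.logger import Client
arize_client = Client(space_key="YOUR_SPACE_KEY", api_key="YOUR_API_KEY")

# The single-record examples use the same arize_client name; in that workflow the
# client is typically created from arize.api instead:
# from arize.api import Client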

Case #1 - Supports Only Classification Metrics

Examples are shown for the Python Pandas (batch) SDK and the Python single-record SDK; this case can also be logged via a data connector.
Example Row

state | pos_approved | zip_code | age | prediction_label | actual_label | prediction_ts
ca | True | 12345 | 25 | not_fraud | fraud | 1618590882
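For reference, the example row above can be assembled into the example_dataframe used in the batch snippet below. This is a minimal sketch; the prediction_id value is an arbitrary placeholder:

import pandas as pd

# One-row DataFrame matching the example row and the schema column names below
example_dataframe = pd.DataFrame({
    "prediction_id": ["abc-123"],
    "prediction_ts": [1618590882],
    "state": ["ca"],
    "pos_approved": [True],
    "zip_code": ["12345"],
    "age": ["25"],
    "prediction_label": ["not_fraud"],
    "actual_label": ["fraud"],
})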

Code Example - Python Pandas Batch

schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_label_column_name="prediction_label",
actual_label_column_name="actual_label",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code", "age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.CLASSIFICATION],
environment=Environments.PRODUCTION,
dataframe=example_dataframe,
schema=schema
)
For more details, see the Python Batch API Reference.

Code Example - Python Single Record

features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label="not_fraud",
actual_label="fraud"
)
For more information, see the Python Single Record Logging API Reference.

Case #2 - Supports Classification & AUC/Log Loss Metrics

Examples are shown for the Python Pandas (batch) SDK and the Python single-record SDK; this case can also be logged via a data connector.
Example Row

state | pos_approved | zip_code | age | prediction_label | actual_label | prediction_score | prediction_ts
ca | True | 12345 | 25 | not_fraud | fraud | 0.3 | 1618590882
Code Example - Python Pandas Batch

schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_label_column_name="prediction_label",
prediction_score_column_name="prediction_score",
actual_label_column_name="actual_label",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code", "age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.CLASSIFICATION, Metrics.AUC_LOG_LOSS],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Batch API Reference.

Code Example - Python Single Record

features = {
'state': 'ca',
'pos_approved': True,
'item_count': 10
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label=("not_fraud", 0.3),
actual_label="fraud"
)
For more information, see the Python Single Record Logging API Reference.

Case #3 - Supports Classification, AUC/Log Loss, & Regression Metrics

Examples are shown for the Python Pandas (batch) SDK and the Python single-record SDK; this case can also be logged via a data connector.
Example Row

state | pos_approved | zip_code | age | prediction_label | actual_label | prediction_score | actual_score | prediction_ts
ca | True | 12345 | 25 | not_fraud | fraud | 0.3 | 1 | 1618590882

Code Example - Python Pandas Batch

schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_label_column_name="prediction_label",
prediction_score_column_name="prediction_score",
actual_label_column_name="actual_label",
actual_score_column_name="actual_score",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code","age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.CLASSIFICATION, Metrics.REGRESSION, Metrics.AUC_LOG_LOSS],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Pandas API Reference.

Code Example - Python Single Record

features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
prediction_label=("not_fraud", 0.3),
features=features,
tags=tags,
actual_label=("fraud", 1)
)
For more information, see the Python Single Record Logging API Reference.

Case #4 - Supports AUC & Log Loss Metrics

Examples are shown for the Python Pandas (batch) SDK and the Python single-record SDK.

Example Row

state | pos_approved | zip_code | age | actual_label | prediction_score | prediction_ts
ca | True | 12345 | 25 | fraud | 0.3 | 1618590882

Code Example - Python Pandas Batch

schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_score_column_name="prediction_score",
actual_label_column_name="actual_label",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code","age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.AUC_LOG_LOSS],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Pandas API Reference.
To declare prediction_score ONLY, pass a tuple of an empty string and your prediction_score through the prediction_label argument.

Code Example - Python Single Record

features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label=("", 0.3),
actual_label="fraud",
)
For more information, see the Python Single Record Logging API Reference.

Case #5 - Supports Only Regression Metrics

Examples are shown for the Python Pandas (batch) SDK and the Python single-record SDK.

Example Row

state | pos_approved | zip_code | age | prediction_score | actual_score | prediction_ts
ca | True | 12345 | 25 | 0.3 | 1 | 1618590882

Code Example - Python Pandas Batch

schema = Schema(
prediction_id_column_name="prediction_id",
timestamp_column_name="prediction_ts",
prediction_score_column_name="prediction_score",
actual_score_column_name="actual_score",
feature_column_names=["state", "pos_approved"],
tag_column_names=["zip_code","age"]
)
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
metrics_validation=[Metrics.REGRESSION],
environment=Environments.PRODUCTION,
dataframe=test_dataframe,
schema=schema,
)
For more details, see the Python Pandas API Reference.
To declare prediction_score ONLY, pass a tuple of an empty string and your prediction_score through the prediction_label argument. To declare actual_score ONLY, pass a tuple of an empty string and your actual_score through the actual_label argument.

Code Example - Python Single Record

features = {
'state': 'ca',
'pos_approved': True
}
tags = {
'zip_code': '12345',
'age': '25'
}
response = arize_client.log(
model_id='sample-model-1',
model_version='v1',
model_type=ModelTypes.BINARY_CLASSIFICATION,
environment=Environments.PRODUCTION,
features=features,
tags=tags,
prediction_label=("", 0.3),
actual_label=("", 1.0)
)
For more information, see the Python Single Record Logging API Reference.

Default Actuals

For some use cases, it may be important to treat a prediction for which no corresponding actual label has been logged yet as having a default negative class actual label.
For example, consider tracking ad conversions for an ad clickthrough rate model where the positive class is click and the negative class is no_click. For conversion purposes, a prediction without a corresponding actual label for an ad placement is equivalent to logging an explicit no_click actual label: in both cases the user has not converted by clicking on the ad.

For the AUC-ROC, PR-AUC, and Log Loss performance metrics, Arize supports treating predictions without an explicit actual label as having the negative class actual label by default. In the example above, a click prediction without an actual is counted as a false positive, because the missing actual is assigned by default to the no_click negative class.
This feature can be enabled for monitors and dashboards via the model performance config section of your model's config page.
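Conceptually, enabling default actuals is equivalent to filling in the negative class for any prediction whose actual has not yet arrived before a metric is computed. The sketch below illustrates that idea only; Arize applies it inside the platform, and the column names, negative-class value, and scikit-learn usage here are illustrative assumptions:

import pandas as pd
from sklearn.metrics import roc_auc_score

df = pd.DataFrame({
    "prediction_score": [0.9, 0.2, 0.7],
    "actual_label": ["click", "no_click", None],  # third prediction has no actual yet
})

# With default actuals enabled, a missing actual is treated as the negative class.
filled = df["actual_label"].fillna("no_click")
y_true = (filled == "click").astype(int)  # 1 = positive class (click), 0 = negative class

print(roc_auc_score(y_true, df["prediction_score"]))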

Quick Definitions

Prediction Label: The classification label of this event (Cardinality = 2)
Actual Label: The ground truth label (Cardinality = 2)
Prediction Score: The likelihood of the event (probability between 0 and 1)
Actual Score: The ground truth score (0 or 1)
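In practice, the prediction label is usually the thresholded version of the prediction score. A minimal illustration, where the 0.5 threshold and the class names are assumptions for this example:

prediction_score = 0.3  # model's estimated probability of the positive class ("fraud")
prediction_label = "fraud" if prediction_score >= 0.5 else "not_fraud"
# prediction_label == "not_fraud", matching the example rows above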