SHAP

SHAP (SHapley Additive exPlanations) is a method for breaking down an individual prediction of a complex model into the contribution of each input feature.
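
To make the idea of "breaking down" a prediction concrete, here is a minimal sketch that computes exact Shapley values by brute force for a tiny model. The toy `model`, the input `x`, and the all-zeros `baseline` are hypothetical and chosen only for illustration; real SHAP explainers use much faster algorithms and a background dataset rather than a single baseline row.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical toy model with one interaction term (x0 * x2)
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


def shapley_values(f, x, baseline):
    """Exact Shapley values: features outside a coalition take the baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)

print([round(p, 6) for p in phi])  # [3.5, 2.0, 1.5]
print(round(sum(phi), 6), model(x) - model(baseline))  # 7.0 7.0
```

The attributions add up to the difference between the prediction and the baseline prediction, which is the "additive" part of the name. The interaction term is split evenly between the two features involved.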

Visit the Shapley Values documentation to learn more.

TreeSHAP

TreeSHAP is a fast explainer in the shap Python library for tree-based machine learning models, such as decision trees, random forests, and gradient boosted trees. It is offered as a rapid, model-specific alternative to KernelSHAP; however, it can sometimes produce unintuitive feature attributions.

Neural Network Explainer

Deep Explainer (Deep SHAP) is an explainability technique for models with a neural-network-based architecture. It is the fastest neural network explainability approach and is based on running a SHAP-based version of the original DeepLIFT algorithm.

Kernel Explainer

KernelSHAP is a slow, perturbation-based Shapley approach that in theory works for any type of model but is rarely used by teams in practice, at least in production. It tends to be too slow to use extensively on anything but small datasets, and it is a common source of confusion: when teams complain about SHAP being slow, it is usually because they tested KernelSHAP.

Code Example

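import pandas as pd
import shap

# Import paths as used with the Arize pandas SDK; adjust for your SDK version
from arize.pandas.logger import Schema
from arize.utils.types import Environments, ModelTypes

# tree_model, X_data, data, test_dataframe, and arize_client are assumed
# to be defined earlier (a fitted tree model, its inputs, and an Arize client).

# 1. Generate the SHAP values and save them as a DataFrame
explainer = shap.TreeExplainer(tree_model)
shap_values = explainer.shap_values(X_data)
shap_dataframe = pd.DataFrame(
    shap_values, columns=[f"{fn}_shap" for fn in data["feature_names"]]
)

# 2. Define the schema. Link each feature column to its corresponding SHAP column.
# The order of feature_cols must match the column order of shap_dataframe.
feature_cols = ["MERCHANT_TYPE", "ENTRY_MODE", "STATE", "MEAN_AMOUNT", "STD_AMOUNT", "TX_AMOUNT"]
shap_cols = shap_dataframe.columns

schema = Schema(
    prediction_id_column_name="prediction_id",
    # ... (other schema fields elided in the original)
    feature_column_names=feature_cols,
    shap_values_column_names=dict(zip(feature_cols, shap_cols)),
)

# 3. Log the dataframe with the schema mapping.
# test_dataframe is assumed to contain both the feature and the SHAP columns.
response = arize_client.log(
    model_id="sample-model-1",
    model_version="v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    dataframe=test_dataframe,
    schema=schema,
)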
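
The example above logs `test_dataframe` but does not show how the SHAP columns get into it. One possible way to do that with pandas is sketched below. The two-feature data, the `features` and `logged_dataframe` names, and the row alignment by position are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical feature data and matching SHAP values, aligned row by row
feature_cols = ["MEAN_AMOUNT", "TX_AMOUNT"]
features = pd.DataFrame(
    {
        "prediction_id": ["a", "b"],
        "MEAN_AMOUNT": [10.0, 20.0],
        "TX_AMOUNT": [12.0, 5.0],
    }
)
shap_dataframe = pd.DataFrame(
    [[0.1, -0.3], [0.4, 0.2]],
    columns=[f"{fn}_shap" for fn in feature_cols],
)

# Build the feature -> SHAP column mapping by name rather than by position
shap_mapping = {fn: f"{fn}_shap" for fn in feature_cols}

# Put features and SHAP values side by side in a single dataframe
logged_dataframe = pd.concat(
    [features.reset_index(drop=True), shap_dataframe.reset_index(drop=True)],
    axis=1,
)

# Sanity check: every mapped SHAP column should be present before logging
missing = [c for c in shap_mapping.values() if c not in logged_dataframe.columns]
print(list(logged_dataframe.columns), missing)
```

Building the mapping by name avoids relying on `feature_cols` and the SHAP columns being in the same order, which is what a positional `zip` assumes.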

Questions? Email us at support@arize.com or Slack us in the #arize-support channel

Copyright © 2023 Arize AI, Inc