Arize AI
8. Feature Importance
Assign a score to each feature to weigh how much or how little it impacted the outcome

Overview

Feature importance refers to a class of techniques that take all the features used to make a model prediction and assign each feature a score indicating how much or how little it impacted the outcome.
Important: Arize does NOT require a model artifact upload to use feature importance. Instead, feature importance values should be calculated by the user, or with a surrogate model, and then logged to Arize.
SHAP values (one feature importance technique) can be logged together with the inferences or afterwards.
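To make the idea of a per-feature score concrete, here is a minimal sketch that computes exact Shapley values for a single prediction by enumerating every feature ordering. It is a toy illustration of the concept only: it is not how the shap library or Arize computes values, and `shapley_values`, `toy_model`, the feature values, and the all-zero baseline are assumptions made up for this example.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance, by enumerating feature orderings.

    Features that have not been "switched on" yet keep their baseline value.
    The cost grows as n!, so this only works for a handful of features.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start from the reference input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch this feature on
            score = predict(current)
            totals[name] += score - previous  # marginal contribution
            previous = score
    # Average each feature's marginal contribution over all orderings.
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical toy model: one additive term plus one interaction term.
def toy_model(row):
    return 2.0 * row["tx_amount"] + row["mean_amount"] * row["std_amount"]


instance = {"tx_amount": 3.0, "mean_amount": 2.0, "std_amount": 4.0}
baseline = {"tx_amount": 0.0, "mean_amount": 0.0, "std_amount": 0.0}

scores = shapley_values(toy_model, instance, baseline)
print(scores)
```

The scores sum to the difference between the prediction for `instance` and the prediction for `baseline`, which is the property that makes them readable as each feature's share of the outcome. In this toy model the interaction term is split evenly between the two features that produce it.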

How to Send Feature Importance to Arize

Arize supports 2 methods for ingesting and visualizing feature importance, with tradeoffs:
Tradeoffs of feature importance generation methods

Code Example

import pandas as pd
import shap
from arize.pandas.logger import Schema
from arize.utils.types import Environments, ModelTypes

# tree_model, X_data, data, test_dataframe and arize_client are defined elsewhere.

# 1. Generate the SHAP values and save them as a DataFrame
explainer = shap.TreeExplainer(tree_model)
shap_values = explainer.shap_values(X_data)
shap_dataframe = pd.DataFrame(
    shap_values, columns=[f"{fn}_shap" for fn in data["feature_names"]]
)

# 2. Define the schema. Link each feature column with its corresponding SHAP column.
# zip pairs by position, so feature_cols must be in the same order as shap_cols.
feature_cols = ["MERCHANT_TYPE", "ENTRY_MODE", "STATE", "MEAN_AMOUNT", "STD_AMOUNT", "TX_AMOUNT"]
shap_cols = shap_dataframe.columns

schema = Schema(
    prediction_id_column_name="prediction_id",
    ...
    feature_column_names=feature_cols,
    shap_values_column_names=dict(zip(feature_cols, shap_cols)),
)

# 3. Log the dataframe with the schema mapping.
# The schema refers to columns by name, so test_dataframe should contain
# both the feature columns and the SHAP columns.
response = arize_client.log(
    model_id="sample-model-1",
    model_version="v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    dataframe=test_dataframe,
    schema=schema,
)
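The example pairs feature columns with SHAP columns using `dict(zip(...))`, which matches by position and silently produces a wrong mapping if the two lists are ordered differently. One way to guard against that is to match by name instead. The sketch below assumes the `_shap` suffix convention used in the example; `link_shap_columns` is a hypothetical helper and not part of the Arize SDK.

```python
def link_shap_columns(feature_cols, shap_cols, suffix="_shap"):
    """Map each feature column to its SHAP column by name, not by position."""
    available = set(shap_cols)
    mapping = {}
    missing = []
    for feature in feature_cols:
        shap_col = f"{feature}{suffix}"
        if shap_col in available:
            mapping[feature] = shap_col
        else:
            missing.append(feature)
    if missing:
        # Fail loudly rather than log importances against the wrong feature.
        raise ValueError(f"No SHAP column found for: {missing}")
    return mapping


feature_cols = ["MERCHANT_TYPE", "ENTRY_MODE", "TX_AMOUNT"]
# Deliberately in a different order than feature_cols.
shap_cols = ["TX_AMOUNT_shap", "MERCHANT_TYPE_shap", "ENTRY_MODE_shap"]

mapping = link_shap_columns(feature_cols, shap_cols)   # matched by name
positional = dict(zip(feature_cols, shap_cols))        # matched by position

print(mapping["MERCHANT_TYPE"])
print(positional["MERCHANT_TYPE"])
```

The resulting `mapping` could then be passed as `shap_values_column_names` in place of the positional `dict(zip(...))`.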
Questions? Email us at [email protected] or Slack us in the #arize-support channel