What model types are supported by the surrogate explainer?
Classification and regression models are supported by the surrogate explainer.
How does the surrogate explainer work?
The surrogate model, built with LightGBM, acts as a stand-in for the complex model you're trying to understand. It's designed to make the same kind of predictions as the original model, but in a way that's easier to explain. Once the surrogate is ready, we use it to calculate SHAP values. These values show how much each input feature contributes to a given prediction. In effect, every feature gets a score that reflects its influence on the outcome, which brings clarity to the complex model's decision-making.
How does the volume of data I send impact the SHAP values calculated by the surrogate explainer?
With a small dataset, SHAP values become less reliable and less stable, because they are computed by averaging each feature's contribution over many feature permutations and the data has to cover those combinations well. Sparse data or a small sample size can prevent feature interactions from being captured accurately, which makes the resulting SHAP values less trustworthy. The surrogate model may also overfit the limited data, making SHAP interpretations less robust. We recommend sending at least 1,000 rows when using the surrogate explainer.
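The effect of sample size can be illustrated with a small, standard-library-only sketch. It is a toy illustration, not the explainer's algorithm: `model` is a hypothetical function with one feature interaction, and Shapley values are computed exactly by averaging marginal contributions over all feature permutations, with absent features filled in from a randomly drawn dataset. Repeating the calculation with fresh datasets of 20 rows and of 1,000 rows shows how much the score for one feature moves around.

```python
import itertools
import random
import statistics

def model(x):
    # Hypothetical toy model with an interaction between features 0 and 1.
    return x[0] * x[1] + 2.0 * x[2]

def shapley_values(f, x, background):
    """Exact Shapley values for instance x, averaging over all permutations."""
    n = len(x)
    cache = {}

    def value(subset):
        # Average prediction when features in `subset` come from x
        # and the remaining features come from each background row.
        key = frozenset(subset)
        if key not in cache:
            total = 0.0
            for b in background:
                total += f([x[i] if i in key else b[i] for i in range(n)])
            cache[key] = total / len(background)
        return cache[key]

    phi = [0.0] * n
    orders = list(itertools.permutations(range(n)))
    for order in orders:
        present = set()
        for i in order:
            before = value(present)
            present.add(i)
            phi[i] += value(present) - before  # marginal contribution of i
    return [p / len(orders) for p in phi]

def attribution_spread(n_rows, repeats=30, seed=0):
    """Std. dev. of feature 0's score across freshly sampled datasets."""
    rng = random.Random(seed)
    x = [1.0, 1.0, 1.0]
    scores = []
    for _ in range(repeats):
        background = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_rows)]
        scores.append(shapley_values(model, x, background)[0])
    return statistics.stdev(scores)

spread_small = attribution_spread(20)
spread_large = attribution_spread(1000)
print(f"spread with 20 rows:   {spread_small:.3f}")
print(f"spread with 1000 rows: {spread_large:.3f}")
```

The spread for the 1,000-row datasets should be noticeably smaller than for the 20-row datasets, which is the instability described above. A real surrogate adds a second source of variation on top of this, since the model itself is refit on the data.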