Build a Machine Learning Explainer with SHAP and an LLM
Let's bring model explanations to non-technical people
Let’s be real: business people and stakeholders don’t care about the technical implementation.
In fact, they probably don’t care about the probability percentage from your prediction as long as it brings value.
However, the business does care about why the model predicts the way it does. For instance, given our features, why is a person flagged as fraudulent or not?
This is what I aim to establish when developing a fraud detection model. Because fraud is a sensitive issue, people must be reassured about why someone is flagged as fraudulent.
That’s why I am building an explainable pipeline using SHAP. However, I don’t present the raw output as is, since few business people understand SHAP values and how to interpret them. Instead, I transform the SHAP results into passages that non-technical people can understand.
In this article, we will develop a SHAP-based explainability system that business professionals can understand. 🚀
How does this work? Let’s get into it.
What is SHAP?
SHAP (SHapley Additive exPlanations) is a framework for interpreting machine learning models by quantifying the contribution of each feature to a model's predictions.
The concept is based on cooperative game theory, where SHAP assigns an importance value—Shapley value—to each feature. The Shapley value then becomes an indicator of each feature's impact on the model's output.
The framework seeks to quantify each feature's contribution to a model's predictions by analyzing all possible feature combinations. This ensures a fair and consistent allocation of importance for the features.
Imagine a collaborative game in which each feature of your dataset is a player contributing to the game's outcome—the model's prediction. The objective is to determine how much each player (feature) contributes to the total payout (prediction).
SHAP values achieve this by calculating the average marginal contribution of each feature across all potential coalitions (subsets of features). This method guarantees that the contribution attributed to each feature is equitable and considers the interactions between features.
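To make the mechanics concrete, here is a minimal sketch of computing SHAP values in Python. The synthetic dataset, feature names, and gradient-boosted model are assumptions standing in for a real fraud model; the shap calls are the part that matters.

```python
# Minimal sketch: computing SHAP values for a classifier.
# The synthetic data, feature names, and model choice are illustrative assumptions.
import pandas as pd
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for transaction-level fraud data
X, y = make_classification(n_samples=1_000, n_features=5, random_state=42)
X = pd.DataFrame(X, columns=["amount", "tx_per_day", "account_age_days",
                             "merchant_risk", "hour_of_day"])

model = GradientBoostingClassifier(random_state=42).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
explanation = explainer(X.iloc[:5])      # Explanation object for five rows

print(explanation.base_values[0])        # the base (expected) value
print(explanation.values[0])             # one contribution per feature for row 0
```

Each row of `explanation.values` holds exactly one number per feature, which is what makes the attribution easy to narrate later.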
You can see a visualization of SHAP contributions in the image below.
Each feature pushes the prediction up or down from the base value, indicating how much that feature contributes to the prediction probability. A few points to remember about how the visualization works (see the code check after this list):
Base (Expected) Value: Represents the average prediction if no additional information about the features was available. It's like the "starting point."
SHAP Values: Show how each feature moves the prediction away from the base value, either upward (positive influence) or downward (negative influence).
Final Prediction: The sum of the base value and the SHAP values of all features.
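These three pieces add up exactly, which you can verify in code. The snippet below continues the hypothetical setup from the earlier sketch (model, X, explanation). Note that for this tree model the SHAP values live in log-odds space, so the reconstruction matches decision_function and needs a sigmoid to become the probability stakeholders see.

```python
import numpy as np

row = 0  # explain the first of the five rows from the earlier sketch

# Final prediction = base value + sum of SHAP values (local accuracy),
# up to floating-point error
reconstructed = explanation.base_values[row] + explanation.values[row].sum()
print(reconstructed, model.decision_function(X.iloc[[row]])[0])

# The raw output is a log-odds score; a sigmoid recovers the predicted probability
print(1 / (1 + np.exp(-reconstructed)),
      model.predict_proba(X.iloc[[row]])[0, 1])

# The force-plot visualization described above draws exactly this decomposition
shap.force_plot(explanation.base_values[row], explanation.values[row],
                X.iloc[row], matplotlib=True)
```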
We will not discuss SHAP in much more detail, as this is not the point of the article.
Let’s move on to the practical application of this article.
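As a taste of where this goes, here is a rough, hypothetical sketch of the idea: format one row's SHAP contributions into a prompt and ask an LLM to narrate them in plain language. The OpenAI client, the gpt-4o-mini model name, and the prompt wording are all assumptions for illustration, not a definitive implementation.

```python
# Hypothetical sketch: turning SHAP output into a plain-language explanation.
# The OpenAI client, model name, and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def narrate_prediction(feature_names, shap_row, base_value, probability):
    """Build a prompt from one row's SHAP values and ask an LLM to explain it."""
    # List features from most to least influential, with their push direction
    contributions = "\n".join(
        f"- {name}: {value:+.3f}"
        for name, value in sorted(
            zip(feature_names, shap_row), key=lambda pair: abs(pair[1]), reverse=True
        )
    )
    prompt = (
        "You are explaining a fraud-detection model to a non-technical audience.\n"
        f"The model's average (base) score is {base_value:.3f}, and this case was "
        f"scored as {probability:.1%} likely to be fraud.\n"
        "Each feature below pushed the score up (+) or down (-):\n"
        f"{contributions}\n"
        "In two short paragraphs and without statistics jargon, explain why the "
        "model flagged or cleared this case."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Continuing the earlier hypothetical setup:
# print(narrate_prediction(X.columns, explanation.values[0],
#                          explanation.base_values[0],
#                          model.predict_proba(X.iloc[[0]])[0, 1]))
```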