Marc DANIAU

SHAP Interaction values with Automated Predictive (APL)

We covered SHAP-explained models for classification and regression scenarios in a previous APL blog post, where we briefly discussed the main effect of a predictor and its interaction effect with the other predictors of the model. With HANA ML 2.17, you can now visualize the interactions between variables in a heatmap. This visualization complements the Variable Importance bar chart in providing a global explanation of the classification or regression model. The feature requires APL 2311 or a later version.

This blog will walk you through an example using the Census dataset that comes with APL.

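from hana_ml import dataframe as hd
conn = hd.ConnectionContext(userkey='MLMDA_KEY')
# Assumption: the Census sample table is installed under APL_SAMPLES; adjust to your system
sql_cmd = 'SELECT * FROM "APL_SAMPLES"."CENSUS" ORDER BY 1'
hdf_train = hd.DataFrame(conn, sql_cmd)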

First, we train a gradient boosting classification model with the interactions parameter set to True:

from hana_ml.algorithms.apl.gradient_boosting_classification import GradientBoostingBinaryClassifier
apl_model = GradientBoostingBinaryClassifier(variable_auto_selection=True,
                                             interactions=True)
apl_model.fit(hdf_train, label='class', key='id')

When the model training is completed, we ask for the report:

from hana_ml.visualizers.unified_report import UnifiedReport
UnifiedReport(apl_model).build().display()

You may want to generate the report as an HTML file:

apl_model.generate_html_report('census_report')  # example file name

The usual “Variable Importance” tab provides a global explanation of the predictive model.

But because we explicitly requested the interactions when setting the model parameters, a new tab “Interaction Matrix” appears at the end:

The diagonal shows the main effect of each variable; each off-diagonal cell shows the interaction between two variables. The interaction matrix presents only the variables with the highest interactions. By default, it is limited to a size of 6×6. For a larger matrix, 8×8 for example, we must specify a maximum number as follows:

apl_model = GradientBoostingBinaryClassifier(variable_auto_selection=True,
                                             interactions=True,
                                             interactions_max_kept=8)
apl_model.fit(hdf_train, label='class', key='id')

The larger the matrix, the longer it takes to fit the model.
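
To make the layout of such a matrix concrete, here is a minimal sketch in plain Python. It is not how APL builds its report: the function name `top_k_interaction_matrix`, the ranking rule (keep the variables with the strongest absolute pairwise interaction) and the toy values are all assumptions made for illustration.

```python
def top_k_interaction_matrix(main, pair, k):
    """Build a k x k symmetric matrix: main effects on the diagonal,
    pairwise interaction values everywhere else."""
    # Assumed ranking rule: score each variable by its strongest
    # absolute pairwise interaction, then keep the k best.
    strength = {v: 0.0 for v in main}
    for (a, b), val in pair.items():
        strength[a] = max(strength[a], abs(val))
        strength[b] = max(strength[b], abs(val))
    kept = sorted(main, key=lambda v: strength[v], reverse=True)[:k]

    matrix = []
    for a in kept:
        row = []
        for b in kept:
            if a == b:
                row.append(main[a])  # diagonal: main effect
            else:
                # off-diagonal: interaction, stored once per unordered pair
                row.append(pair.get((a, b), pair.get((b, a), 0.0)))
        matrix.append(row)
    return kept, matrix


# Toy values, invented for this sketch (not taken from the Census model).
main = {"a": 0.30, "b": 0.25, "c": 0.10, "d": 0.05}
pair = {("a", "b"): 0.08, ("a", "c"): 0.02, ("b", "d"): 0.01}

kept, matrix = top_k_interaction_matrix(main, pair, k=3)
for name, row in zip(kept, matrix):
    print(name, row)
```

Raising `k` in this sketch adds rows and columns in the same way that a larger maximum adds variables to the heatmap.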

If needed, one can obtain the interaction values in a pandas dataframe:

df = apl_model.get_debrief_report('ClassificationRegression_InteractionMatrix').deselect('Oid').collect()

These figures are computed using the Shapley-Taylor interaction index.
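
For readers unfamiliar with that index, the sketch below computes the order-2 Shapley-Taylor interaction index by brute force for a tiny toy function. It follows the textbook definition (main terms are the discrete derivative at the empty set; pair terms average the second-order discrete derivative over all subsets of the remaining players) and is not APL's internal implementation, which works on a trained gradient boosting model. The names `shapley_taylor_order2` and `toy_value` are hypothetical.

```python
from itertools import combinations
from math import comb


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def shapley_taylor_order2(value, n):
    """Return (main, pair) dicts of order-2 Shapley-Taylor indices
    for players 0..n-1 and a set function `value` (requires n >= 2)."""
    players = range(n)
    empty = frozenset()
    # Order-1 terms: discrete derivative taken at the empty set.
    main = {i: value(frozenset({i})) - value(empty) for i in players}
    pair = {}
    for i, j in combinations(players, 2):
        others = [p for p in players if p not in (i, j)]
        total = 0.0
        for t in subsets(others):
            # Second-order discrete derivative of `value` at subset t.
            delta = value(t | {i, j}) - value(t | {i}) - value(t | {j}) + value(t)
            total += delta / comb(n - 1, len(t))
        pair[(i, j)] = (2 / n) * total
    return main, pair


def toy_value(s):
    """Toy set function: additive effects plus one interaction between 0 and 1."""
    return 1.0 * (0 in s) + 2.0 * (1 in s) + 3.0 * (0 in s and 1 in s) + 1.0 * (2 in s)


main, pair = shapley_taylor_order2(toy_value, 3)
print(main)
print(pair)
```

In this toy function the only interaction is between players 0 and 1, so that pair is the only one with a non-zero index. The main and pair terms together sum to the difference between the value of the full set and the value of the empty set.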



