kernelshap 0.7.0

This release is intended to be the last before stable version 1.0.0.

Major change

Passing a background dataset bg_X is now optional.

If the explanation data X is sufficiently large (>= 50 rows), bg_X is derived as a random sample of bg_n = 200 rows from X. If X has fewer than bg_n rows, then bg_X = X. If X has too few rows (< 50), you must pass an explicit bg_X.
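The selection rule above can be sketched as follows. This is a plain-Python illustration, not the package's R code: the helper name `choose_background`, its `min_rows` and `seed` arguments, and the list-of-rows data layout are assumptions made for the example. Only the thresholds (50 rows, bg_n = 200) come from the text.

```python
import random

def choose_background(X, bg_X=None, bg_n=200, min_rows=50, seed=None):
    """Hypothetical helper mirroring the rule described in the changelog."""
    if bg_X is not None:
        return bg_X                          # an explicit background is used as-is
    n = len(X)
    if n < min_rows:
        raise ValueError("X has too few rows; pass an explicit bg_X")
    if n <= bg_n:
        return list(X)                       # not more than bg_n rows: use all of X
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(n), bg_n)) # bg_n distinct row indices
    return [X[i] for i in idx]

X = [[float(i)] for i in range(1000)]        # toy data: 1000 rows, one column
bg = choose_background(X, seed=1)
print(len(bg))
```

In this sketch, the case of exactly bg_n rows is treated like the small-data case, since sampling bg_n rows from bg_n rows returns all of X anyway.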

Minor changes

kernelshap 0.6.0

Major changes

Maintenance

kernelshap 0.5.0

New features

New additive explainer additive_shap() that works for models fitted via

The explainer uses predict(..., type = "terms"), a beautiful trick used in fastshap::explain.lm(). The results will be identical to those returned by kernelshap() and permshap(), but are obtained exponentially faster. Thanks to David Watson for the great idea discussed in #130.
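To illustrate why an additive model needs no sampling, here is a minimal sketch for the simplest case, a linear model without interactions. It assumes that the explanation data also serves as background data, so that each SHAP value is the centered term of its feature. The function name `additive_shap_linear` and the toy numbers are made up for this example; the package itself relies on R's predict(..., type = "terms").

```python
def additive_shap_linear(coefs, X):
    """SHAP values of f(x) = b0 + sum_j coefs[j] * x[j], with X as background."""
    n, p = len(X), len(coefs)
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Each feature's contribution is its term minus the average of that term
    return [[coefs[j] * (row[j] - means[j]) for j in range(p)] for row in X]

coefs = [2.0, -1.0]                          # toy coefficients (intercept not needed)
X = [[1.0, 0.0], [3.0, 4.0], [5.0, 2.0]]
S = additive_shap_linear(coefs, X)

# Check: SHAP values of each row sum to its prediction minus the mean prediction
preds = [sum(c * x for c, x in zip(coefs, row)) for row in X]
mean_pred = sum(preds) / len(preds)
print([sum(s) for s in S], [pr - mean_pred for pr in preds])
```

The cost is a single pass over the data per feature, instead of model evaluations on exponentially many on-off vectors.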

User visible changes

kernelshap 0.4.2

API

Documentation

kernelshap 0.4.1

Performance improvements

Documentation

kernelshap 0.4.0

Major changes

Other changes

kernelshap 0.3.8

API improvements

Bug fixes

Maintenance

kernelshap 0.3.7

Maintenance

kernelshap 0.3.6

Maintenance

kernelshap 0.3.5

Maintenance

Small visible changes

kernelshap 0.3.4

Documentation

kernelshap 0.3.3

Fewer dependencies

kernelshap 0.3.2

Documentation

Bug fixes

kernelshap 0.3.1

Changes

kernelshap 0.3.0

Major improvements

Exact calculations

Thanks to David Watson, exact calculations are now also possible for \(p>5\) features. By default, the algorithm uses exact calculations for \(p \le 8\) and a hybrid strategy otherwise, see the next section. At the same time, the exact algorithm became much more efficient.

A word of caution: Exact calculations require creating \(2^p-2\) on-off vectors \(z\) (cheap step) and evaluating the model on a whopping \((2^p-2)N\) rows, where \(N\) is the number of rows of the background data (expensive step). As this explodes with large \(p\), we do not recommend the exact strategy for \(p > 10\).
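The growth in cost can be made concrete with a few lines of plain Python. The helper names `on_off_vectors` and `exact_cost` are made up for this illustration and are not part of the package.

```python
from itertools import product

def on_off_vectors(p):
    """All binary vectors of length p except all-zeros and all-ones."""
    return [z for z in product((0, 1), repeat=p) if 0 < sum(z) < p]

def exact_cost(p, n_bg):
    """Number of on-off vectors and number of rows the model is evaluated on."""
    n_z = 2 ** p - 2
    return n_z, n_z * n_bg

print(len(on_off_vectors(3)))   # 6 vectors for p = 3
print(exact_cost(10, 200))      # (1022, 204400)
print(exact_cost(20, 200))      # (1048574, 209714800)
```

With a background dataset of 200 rows, going from \(p = 10\) to \(p = 20\) raises the number of rows to predict on from about 0.2 million to about 210 million.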

Hybrid strategy

The iterative Kernel SHAP sampling algorithm of Covert and Lee (2021) [1] works by randomly sampling \(m\) on-off vectors \(z\) so that their sum follows the SHAP Kernel weight distribution (renormalized to the range from \(1\) to \(p-1\)). Based on these vectors, many predictions are formed. Then, Kernel SHAP values are derived as the solution of a constrained linear regression, see [1] for details. This is done multiple times until convergence.
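The sampling step can be sketched as below. The sketch assumes the usual form of the Kernel SHAP size distribution, \(P(\sum z = s) \propto (p-1)/(s(p-s))\) for \(s = 1, \dots, p-1\), and places the \(s\) "on" positions uniformly at random. The function names `size_probs` and `sample_z` are hypothetical, and the regression and convergence steps of [1] are not shown.

```python
import random

def size_probs(p):
    """P(sum(z) = s) for s = 1..p-1, proportional to (p - 1) / (s * (p - s))."""
    w = [(p - 1) / (s * (p - s)) for s in range(1, p)]
    total = sum(w)
    return [wi / total for wi in w]

def sample_z(p, m, seed=None):
    """Draw m on-off vectors: first the size, then which positions are on."""
    rng = random.Random(seed)
    sizes = rng.choices(range(1, p), weights=size_probs(p), k=m)
    out = []
    for s in sizes:
        on = set(rng.sample(range(p), s))
        out.append([1 if j in on else 0 for j in range(p)])
    return out

probs = size_probs(5)            # sizes 1..4 for p = 5
Z = sample_z(p=5, m=8, seed=42)
print(probs)
print(Z)
```

Because the distribution puts most of its mass on the extreme sizes, repeated draws frequently return the same vectors, which is the inefficiency discussed next.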

A drawback of this strategy is that many (at least 75%) of the \(z\) vectors will have \(\sum z \in \{1, p-1\}\), producing many duplicates. Similarly, at least 92% of the mass will be used for the \(p(p+1)\) possible vectors with \(\sum z \in \{1, 2, p-1, p-2\}\) etc. This inefficiency can be fixed by a hybrid strategy, combining exact calculations with sampling. The hybrid algorithm has two steps:

  1. Step 1 (exact part): There are \(2p\) different on-off vectors \(z\) with \(\sum z \in \{1, p-1\}\), covering a large proportion of the Kernel SHAP distribution. The degree 1 hybrid will list those vectors and use them according to their weights in the upcoming calculations. Depending on \(p\), we can also go a step further to a degree 2 hybrid by additionally including all \(p(p-1)\) vectors with \(\sum z \in \{2, p-2\}\), and so on. The necessary predictions are obtained along with other calculations similar to those in [1].
  2. Step 2 (sampling part): The remaining weight is filled by sampling vectors \(z\) according to Kernel SHAP weights renormalized to the values not yet covered by Step 1. Together with the results from Step 1 - correctly weighted - this now forms a complete iteration as in Covert and Lee (2021). The difference is that most mass is covered by exact calculations. Afterwards, the algorithm iterates until convergence. The output of Step 1 is reused in every iteration, leading to an extremely efficient strategy.
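The exact part of Step 1 can be enumerated directly. The following plain-Python sketch lists the on-off vectors of a hybrid of a given degree and confirms the counts \(2p\) and \(p(p-1)\) stated above. The helper `exact_part` is hypothetical, and the weighting and the sampling of Step 2 are not shown.

```python
from itertools import combinations

def exact_part(p, degree):
    """On-off vectors with sum(z) in {1..degree} or {p-degree..p-1}."""
    sizes = set(range(1, degree + 1)) | set(range(p - degree, p))
    sizes = sorted(s for s in sizes if 0 < s < p)   # a set avoids double counting
    vectors = []
    for s in sizes:
        for on in combinations(range(p), s):        # positions switched on
            vectors.append(tuple(1 if j in on else 0 for j in range(p)))
    return vectors

p = 6
deg1 = exact_part(p, 1)
deg2 = exact_part(p, 2)
print(len(deg1))               # 2p = 12
print(len(deg2) - len(deg1))   # p(p-1) = 30 additional vectors
print(len(deg2))               # p(p+1) = 42 in total
```

Since these vectors do not change between iterations, their predictions only need to be computed once, which is what makes reusing the output of Step 1 cheap.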

The default behaviour of kernelshap() is as follows:

It is also possible to use a pure sampling strategy, see Section “User visible changes” below. While this is usually not advisable compared to a hybrid approach, the options of kernelshap() make it possible to study different properties of Kernel SHAP and to do empirical research on the topic.

Kernel SHAP in the Python implementation “shap” uses a quite similar hybrid strategy, but without iterating. The new logic in the R package thus combines the efficiency of the Python implementation with the convergence monitoring of [1].

[1] Ian Covert and Su-In Lee. Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3457-3465, 2021.

User visible changes

Other changes

Bug fixes

kernelshap 0.2.0

Breaking change

The interface of kernelshap() has been revised. Instead of specifying a prediction function, it now suffices to pass the fitted model object. The default pred_fun is now stats::predict, which works in most cases. Some other cases are caught via model class (“ranger” and mlr3 “Learner”). The pred_fun can be overridden by a function of the form function(object, X, ...). Additional arguments to the prediction function are passed via ... of kernelshap().

Some examples:

Major improvements

User visible changes

Bug fixes

New contributor

kernelshap 0.1.0

This is the initial release.