Are there connections between post-hoc explanation methods and interpretable models?

Sebastian Bordt and Ulrike Luxburg show in their #aistats23 paper that there is a strong correspondence between Shapley Values and Generalized Additive Models (GAMs).

Read more: https://arxiv.org/abs/2209.04012

From Shapley Values to Generalized Additive Models and back

In explainable machine learning, local post-hoc explanation algorithms and inherently interpretable models are often seen as competing approaches. This work offers a partial reconciliation between the two by establishing a correspondence between Shapley Values and Generalized Additive Models (GAMs). We introduce $n$-Shapley Values, a parametric family of local post-hoc explanation algorithms that explain individual predictions with interaction terms up to order $n$. By varying the parameter $n$, we obtain a sequence of explanations that covers the entire range from Shapley Values up to a uniquely determined decomposition of the function we want to explain. The relationship between $n$-Shapley Values and this decomposition offers a functionally-grounded characterization of Shapley Values, which highlights their limitations. We then show that $n$-Shapley Values, as well as the Shapley-Taylor and Faith-Shap interaction indices, recover GAMs with interaction terms up to order $n$. This implies that the original Shapley Values recover GAMs without variable interactions. Taken together, our results provide a precise characterization of Shapley Values as they are being used in explainable machine learning. They also offer a principled interpretation of partial dependence plots of Shapley Values in terms of the underlying functional decomposition. A package for the estimation of different interaction indices is available at \url{https://github.com/tml-tuebingen/nshap}.
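One claim from the abstract — that plain Shapley Values recover a GAM's additive components when there are no variable interactions — can be checked numerically with a brute-force Shapley computation. The sketch below is my own illustration (exhaustive coalition enumeration with a single-baseline value function), not the paper's `nshap` package:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all coalitions.
    A coalition S keeps features in S at x and resets the rest to the baseline."""
    d = len(x)

    def v(S):
        z = [x[i] if i in S else baseline[i] for i in range(d)]
        return f(z)

    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            for S in combinations(others, size):
                # standard Shapley weight |S|! (d - |S| - 1)! / d!
                w = factorial(size) * factorial(d - size - 1) / factorial(d)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

# A GAM without interactions: f(x) = x1^2 + 3*x2 + 2*x3
f = lambda z: z[0] ** 2 + 3 * z[1] + 2 * z[2]
x = [2.0, 1.0, -1.0]
baseline = [0.0, 0.0, 0.0]
print(shapley_values(f, x, baseline))
# each value matches its additive component at x (up to float round-off):
# phi ~ [4.0, 3.0, -2.0]
```

Because the function is purely additive, every marginal contribution of feature $i$ is the same regardless of the coalition, so the Shapley value collapses to the feature's own component evaluated at $x$ minus its value at the baseline.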


The same score can be derived for probabilistic circuits (PCs), or more precisely, for the sub-class of deterministic PCs! We use the score in a simple cutset learning algorithm and achieve excellent trade-offs between learning speed and test log-likelihoods.

Stay tuned for the paper, or catch up with us at #aistats23 in April! 😀

I'm happy to share that our paper "Bayesian Structure Scores for Probabilistic Circuits" has been accepted at #aistats23! This is joint work with Yang Yang (MSc @mastodon.world, incoming PhD @leuvenai) and Gennaro Gala.

The main contribution of the paper is to propose a new (actually old) way to learn the structure of #ProbabilisticCircuits. We take lessons from classical structure learning in Bayesian networks, which optimizes a structure score; a principled choice is a Bayesian score.
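To make "Bayesian score" concrete: in classical Bayesian-network structure learning, the score of a child-given-parents family is the log marginal likelihood of the data under a Dirichlet prior (the BDeu score). The sketch below is my own minimal illustration of that classical score, not the paper's circuit score; the function name and data layout are hypothetical:

```python
from collections import Counter
from math import lgamma

def bdeu_family_score(data, child, parents, r, alpha=1.0):
    """BDeu log marginal likelihood of one child-given-parents family.

    data:  list of dicts mapping variable name -> state index
    r:     dict mapping variable name -> number of states
    alpha: equivalent sample size of the Dirichlet prior
    """
    q = 1                            # number of parent configurations
    for p in parents:
        q *= r[p]
    a_j = alpha / q                  # prior mass per parent configuration
    a_jk = alpha / (q * r[child])    # prior mass per (configuration, child state)

    # counts N_j and N_jk from the data; unobserved cells contribute
    # lgamma(a) - lgamma(a) = 0, so only observed counts need to be summed
    nj = Counter(tuple(row[p] for p in parents) for row in data)
    njk = Counter((tuple(row[p] for p in parents), row[child]) for row in data)

    score = 0.0
    for j, n in nj.items():
        score += lgamma(a_j) - lgamma(a_j + n)
    for (j, k), n in njk.items():
        score += lgamma(a_jk + n) - lgamma(a_jk)
    return score

# On strongly dependent data, adding the parent raises the score:
data = [{"X": 0, "Y": 0}] * 10 + [{"X": 1, "Y": 1}] * 10
r = {"X": 2, "Y": 2}
print(bdeu_family_score(data, "Y", ["X"], r) > bdeu_family_score(data, "Y", [], r))
```

The appeal of such a score is that it trades data fit against model complexity automatically: spurious parents dilute the prior mass per configuration and are penalized, which is exactly the property one wants to carry over to circuit structure learning.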