Poster
AIM: Amending Inherent Interpretability via Self-Supervised Masking
Eyad Alshami · Shashank Agnihotri · Bernt Schiele · Margret Keuper
Abstract:
It has been observed that deep neural networks (DNNs) often rely on both genuine and spurious features. In this work, we propose ''Amending Inherent Interpretability via Self-Supervised Masking'' (AIM), a simple yet surprisingly effective method that promotes the network's use of genuine features over spurious alternatives without requiring additional annotations. In particular, AIM uses features from multiple encoding stages to guide a self-supervised, sample-specific feature-masking process. As a result, AIM enables training well-performing, inherently interpretable models that faithfully summarize their decision process. When tested on challenging datasets designed to assess reliance on spurious features and out-of-domain generalization, AIM networks show a dual benefit: interpretability, as measured by the Energy Pointing Game (EPG) score, improves by $\sim$6$-$37\%, while accuracy simultaneously improves by $\sim$10$-$40\%. These gains are further validated on the standard in-domain CUB-200 dataset for fine-grained classification. The results support our hypothesis that AIM finds genuine, meaningful features that directly contribute to its improved human interpretability.
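To make the idea of sample-specific feature masking concrete, here is a minimal, hypothetical sketch in PyTorch. The abstract does not specify AIM's actual architecture or training objective, so everything below (the `MaskedEncoderStage` wrapper, the 1x1-conv mask head, and the sparsity regularizer) is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch of sample-specific feature masking in the spirit of
# AIM. Module names, the mask head, and the self-supervised regularizer
# are assumptions for illustration; the paper's formulation may differ.
import torch
import torch.nn as nn


class MaskedEncoderStage(nn.Module):
    """Wraps one encoding stage and predicts a per-sample spatial mask
    from that stage's own features (assumed design)."""

    def __init__(self, stage: nn.Module, channels: int):
        super().__init__()
        self.stage = stage
        # 1x1 conv head producing a single-channel mask logit map.
        self.mask_head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor):
        feats = self.stage(x)
        # Soft mask in (0, 1), computed per sample from the features.
        mask = torch.sigmoid(self.mask_head(feats))
        # Suppress features the mask deems uninformative (spurious).
        return feats * mask, mask


def sparsity_loss(masks, target_ratio: float = 0.5) -> torch.Tensor:
    """Assumed self-supervised regularizer: push masks toward keeping
    only a fraction of the features, so the network must rely on the
    most informative (genuine) ones."""
    return sum((m.mean() - target_ratio).abs() for m in masks) / len(masks)
```

Under these assumptions, training would combine the usual classification loss on the masked forward pass with the mask regularizer, e.g. `loss = ce(logits, labels) + lam * sparsity_loss(masks)`, so that masking is learned without any extra annotations.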