SAFARI: Versatile and Efficient Evaluations for Robustness of Interpretability

Huang, Wei; Zhao, Xingyu; Jin, Gaojie; Huang, Xiaowei

arXiv:2208.09418v1 (cs)

[Submitted on 19 Aug 2022 (this version), latest version 31 Jul 2023 (v4)]

Abstract: Interpretability of Deep Learning (DL) models is arguably a key barrier to trustworthy AI. Despite great efforts by the Explainable AI (XAI) community, explanations lack robustness: indistinguishable input perturbations may lead to different XAI results. It is therefore vital to assess how robust DL interpretability is, given an XAI technique. To this end, we identify the following challenges that the state of the art cannot address collectively: i) XAI techniques are highly heterogeneous; ii) misinterpretations are normally rare events; iii) both worst-case and overall robustness are of practical interest. In this paper, we propose two evaluation methods to tackle them: i) both are black-box, based on a Genetic Algorithm (GA) and Subset Simulation (SS); ii) the GA uses bespoke fitness functions to solve a constrained optimisation problem efficiently, while SS is dedicated to estimating rare-event probabilities; iii) two diverse metrics are introduced, concerning the worst-case interpretation discrepancy and a probabilistic notion of how robust the interpretation is in general, respectively. We conduct experiments to study the accuracy, sensitivity and efficiency of our methods, which outperform state-of-the-art approaches. Finally, we show two applications of our methods: ranking robust XAI methods, and selecting training schemes to improve both classification and interpretation robustness.
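To make the rare-event idea concrete, the following is a minimal, generic sketch of Subset Simulation on a toy problem. It is not the paper's implementation. Everything named here is an assumption for illustration: the function `subset_simulation`, the score function `g` (a hypothetical stand-in for an interpretation-discrepancy measure), the standard normal input distribution, and the parameters `n`, `p0` and `max_levels`. The sketch estimates P(g(x) >= threshold) as a product of larger conditional probabilities, each estimated from Markov chains started at the best samples of the previous level.

```python
import math
import random


def subset_simulation(g, dim, threshold, n=2000, p0=0.1, max_levels=20, seed=0):
    """Estimate P(g(x) >= threshold) for x ~ N(0, I) via Subset Simulation.

    Assumption: inputs are standard normal; g is any black-box score function.
    """
    rng = random.Random(seed)
    n_seeds = max(1, int(n * p0))      # samples kept as chain seeds per level
    chain_len = max(1, n // n_seeds)   # states generated per seed

    # Level 0: plain Monte Carlo sampling from the input distribution.
    samples = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    prob = 1.0

    for _ in range(max_levels):
        scored = sorted(((g(x), x) for x in samples),
                        key=lambda t: t[0], reverse=True)
        # Intermediate threshold: the score of the n_seeds-th best sample.
        level = scored[n_seeds - 1][0]

        if level >= threshold:
            # Target event is no longer rare at this level: count it directly.
            n_hit = sum(1 for score, _ in scored if score >= threshold)
            return prob * n_hit / len(scored)

        prob *= p0
        seeds = [x for _, x in scored[:n_seeds]]

        # Modified Metropolis: component-wise proposals, then reject any
        # candidate that leaves the current intermediate region.
        samples = []
        for start in seeds:
            cur = start
            for _ in range(chain_len):
                cand = list(cur)
                for i in range(dim):
                    prop = cur[i] + rng.uniform(-1.0, 1.0)
                    # Ratio of standard normal densities at prop and cur[i].
                    if rng.random() < math.exp(0.5 * (cur[i] ** 2 - prop ** 2)):
                        cand[i] = prop
                if g(cand) >= level:
                    cur = cand
                samples.append(cur)

    # Threshold not reached within max_levels: prob is only an upper estimate.
    return prob


def first_coordinate(x):
    # Toy score: the event "first coordinate exceeds 3.5" is rare under N(0, I).
    return x[0]


estimate = subset_simulation(first_coordinate, dim=2, threshold=3.5)
exact = 0.5 * math.erfc(3.5 / math.sqrt(2.0))  # closed form for this toy case
print(f"subset simulation: {estimate:.2e}, exact: {exact:.2e}")
```

For this toy case the exact probability has a closed form, so the estimate can be checked directly; plain Monte Carlo with the same budget would typically observe only a handful of hits, or none. The estimate is stochastic and depends on `seed`, `n` and `p0`.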

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2208.09418 [cs.LG]
	(or arXiv:2208.09418v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2208.09418

Submission history

From: Xingyu Zhao
[v1] Fri, 19 Aug 2022 16:07:22 UTC (6,723 KB)
[v2] Sun, 16 Jul 2023 20:59:16 UTC (6,269 KB)
[v3] Wed, 19 Jul 2023 20:37:48 UTC (6,271 KB)
[v4] Mon, 31 Jul 2023 16:28:13 UTC (6,271 KB)
