Significance tests of feature relevance for a black-box learner

Dai, Ben; Shen, Xiaotong; Pan, Wei

doi:10.1109/TNNLS.2022.3185742


arXiv:2103.04985v3 (stat)

[Submitted on 2 Mar 2021 (v1), last revised 21 Jun 2022 (this version, v3)]



Abstract: An exciting recent development is the uptake of deep neural networks in many scientific fields, where the main objective is outcome prediction with a black-box model. Significance testing is a promising way to address the black-box issue and to explore novel scientific insights and interpretations of the decision-making process based on a deep learning model. However, testing for a neural network poses a challenge because of its black-box nature and the unknown limiting distributions of parameter estimates, while existing methods require strong assumptions or excessive computation. In this article, we derive one-split and two-split tests that relax the assumptions and reduce the computational complexity of existing black-box tests, and that extend to examining the significance of a collection of features of interest in a dataset of a possibly complex type, such as an image. The one-split test estimates and evaluates a black-box model based on estimation and inference subsets through sample splitting and data perturbation. The two-split test further splits the inference subset into two but requires no perturbation. Also, we develop their combined versions by aggregating the p-values based on repeated sample splitting. By deflating the bias-sd-ratio, we establish asymptotic null distributions of the test statistics and the consistency in terms of Type II error. Numerically, we demonstrate the utility of the proposed tests on seven simulated examples and six real datasets. Accompanying this paper is our Python library dnn-inference (this https URL) that implements the proposed tests.
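As a rough illustration of the splitting idea in the abstract, the sketch below computes one-sided p-values from per-sample losses on the inference subset. It is a minimal reading of the abstract under stated assumptions, not the authors' implementation and not the dnn-inference API: the function names, the comparison against a model refit with the tested features masked, the Gaussian perturbation with scale `rho`, and the Bonferroni-style combination rule are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _pvalue_from_diffs(diffs):
    """One-sided p-value from a studentized mean of loss differences."""
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        raise ValueError("loss differences have zero variance")
    stat = math.sqrt(len(diffs)) * statistics.mean(diffs) / sd
    # A small p-value means the full model's loss is significantly lower.
    return normal_cdf(stat)


def one_split_pvalue(loss_full, loss_masked, rho=1.0, seed=0):
    """Both models are scored on the SAME inference samples.

    Assumption: independent Gaussian noise of scale rho is added to each
    difference so that the variance does not degenerate under the null.
    """
    rng = random.Random(seed)
    diffs = [lf - lm + rho * rng.gauss(0.0, 1.0)
             for lf, lm in zip(loss_full, loss_masked)]
    return _pvalue_from_diffs(diffs)


def two_split_pvalue(loss_full_half_a, loss_masked_half_b):
    """Full model scored on one half of the inference subset, masked model
    on the other half; no perturbation is added."""
    diffs = [lf - lm for lf, lm in zip(loss_full_half_a, loss_masked_half_b)]
    return _pvalue_from_diffs(diffs)


def bonferroni_combine(pvalues):
    """Generic stand-in for aggregating p-values from repeated splits
    (valid under arbitrary dependence; not necessarily the paper's rule)."""
    return min(1.0, len(pvalues) * min(pvalues))
```

Here `loss_full` would hold the losses of a model fitted on the estimation subset with all features, and `loss_masked` the losses of a model fitted with the features under test masked out; both are evaluated only on held-out inference samples.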

Comments:	Accepted for publication in IEEE Transactions on Neural Networks and Learning Systems
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:2103.04985 [stat.ML]
	(or arXiv:2103.04985v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2103.04985
Journal reference:	IEEE Transactions on Neural Networks and Learning Systems, 2022
Related DOI:	https://doi.org/10.1109/TNNLS.2022.3185742

Submission history

From: Ben Dai
[v1] Tue, 2 Mar 2021 00:59:19 UTC (5,397 KB)
[v2] Wed, 9 Jun 2021 03:28:58 UTC (1,416 KB)
[v3] Tue, 21 Jun 2022 15:45:34 UTC (1,871 KB)
