Active Statistical Inference

Zrnic, Tijana; Candès, Emmanuel J.

arXiv:2403.03208v2 (stat)

[Submitted on 5 Mar 2024 (v1), last revised 29 May 2024 (this version, v2)]

Abstract: Inspired by the concept of active learning, we propose active inference: a methodology for statistical inference with machine-learning-assisted data collection. Assuming a budget on the number of labels that can be collected, the methodology uses a machine learning model to identify which data points would be most beneficial to label, thus effectively utilizing the budget. It operates on a simple yet powerful intuition: prioritize the collection of labels for data points where the model exhibits uncertainty, and rely on the model's predictions where it is confident. Active inference constructs provably valid confidence intervals and hypothesis tests while leveraging any black-box machine learning model and handling any data distribution. The key point is that it achieves the same level of accuracy with far fewer samples than existing baselines relying on non-adaptively-collected data. This means that for the same number of collected samples, active inference enables smaller confidence intervals and more powerful p-values. We evaluate active inference on datasets from public opinion research, census analysis, and proteomics.
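
The intuition in the abstract (label where the model is uncertain, trust predictions where it is confident) can be illustrated with a minimal sketch for estimating a population mean. This is not taken from the abstract and may differ from the paper's procedures. It assumes i.i.d. data, a model fixed before sampling, and uncertainty scores that depend only on the features. The function name `active_mean_ci`, the labeling oracle `label_fn`, the proportional-to-uncertainty sampling rule, and the clipping floor `1e-3` are all hypothetical choices made for illustration.

```python
import numpy as np

def active_mean_ci(preds, uncertainty, label_fn, budget, z=1.96, seed=0):
    """Sketch of an actively sampled mean estimate with a normal-approximation CI.

    preds       : model predictions f(X_i) for all n points
    uncertainty : nonnegative model-uncertainty scores u(X_i)
    label_fn    : callable idx -> true labels Y[idx] (hypothetical labeling oracle)
    budget      : approximate expected number of labels to collect
    """
    rng = np.random.default_rng(seed)
    preds = np.asarray(preds, dtype=float)
    u = np.asarray(uncertainty, dtype=float)
    n = len(preds)

    # Sampling probabilities proportional to uncertainty, scaled so that the
    # expected number of labels is about `budget`; clipped to stay in (0, 1].
    pi = np.clip(budget * u / u.sum(), 1e-3, 1.0)

    # Decide independently for each point whether to collect its label.
    xi = rng.random(n) < pi
    idx = np.flatnonzero(xi)

    # Use the prediction everywhere, plus an inverse-probability-weighted
    # correction on the labeled points. Each term has expectation E[Y].
    terms = preds.copy()
    terms[idx] += (label_fn(idx) - preds[idx]) / pi[idx]

    est = terms.mean()
    half = z * terms.std(ddof=1) / np.sqrt(n)
    return est, (est - half, est + half), int(xi.sum())
```

Because each label is requested with a known probability and the correction is divided by that probability, the estimate stays unbiased even when the model's predictions are poor. Concentrating labels where the prediction errors are expected to be large is what reduces the variance, and hence the interval width, for a given budget.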

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:2403.03208 [stat.ML]
	(or arXiv:2403.03208v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2403.03208

Submission history

From: Tijana Zrnic
[v1] Tue, 5 Mar 2024 18:46:50 UTC (96 KB)
[v2] Wed, 29 May 2024 05:20:54 UTC (110 KB)
