Are all models wrong? Fundamental limits in distribution-free empirical model falsification

Müller, Manuel M.; Luo, Yuetian; Barber, Rina Foygel

arXiv:2502.06765 (math)

[Submitted on 10 Feb 2025]

Abstract: In statistics and machine learning, when we fit a model to available data, we typically want to ensure that we are searching within a model class that contains at least one accurate model -- that is, we would like to ensure an upper bound on the model class risk (the lowest possible risk that can be attained by any model in the class). However, it is also of interest to establish lower bounds on the model class risk, for instance so that we can determine whether our fitted model is at least approximately optimal within the class, or so that we can decide whether the model class is unsuitable for the particular task at hand. Particularly in the setting of interpolation learning, where machine learning models are trained to reach zero error on the training data, we might ask if, at the very least, a positive lower bound on the model class risk is possible -- or are we unable to detect that "all models are wrong"? In this work, we answer these questions in a distribution-free setting by establishing a model-agnostic, fundamental hardness result for the problem of constructing a lower bound on the best test error achievable over a model class, and we examine its implications for specific model classes such as tree-based methods and linear regression.
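
The interpolation question raised in the abstract can be made concrete with a small sketch. It is an illustration only, not the paper's construction: the data-generating process, the threshold class, the lookup-table "interpolator", and all function names below are assumptions chosen for the example. Labels are fair coin flips independent of the feature, so under 0-1 loss every model has true risk 0.5, yet a sufficiently rich class still reaches zero training error, and the training error alone therefore gives no positive lower bound on the model class risk.

```python
import random


def empirical_risk(model, data):
    """Average 0-1 loss of a single model on a list of (x, y) pairs."""
    return sum(model(x) != y for x, y in data) / len(data)


def empirical_class_risk(model_class, data):
    """Smallest empirical risk attained by any model in a finite class."""
    return min(empirical_risk(f, data) for f in model_class)


def make_threshold(t):
    """Hypothetical simple model: predict 1 when x >= t, else 0."""
    return lambda x: int(x >= t)


def make_lookup(train):
    """Hypothetical interpolator: memorise the training labels exactly."""
    table = dict(train)
    return lambda x: table.get(x, 0)


rng = random.Random(0)
# Assumed toy distribution: x uniform on [0, 1), y a fair coin flip
# independent of x, so no model can beat true risk 0.5.
train = [(rng.random(), rng.randint(0, 1)) for _ in range(50)]

# A small class of 11 threshold rules, including "always 1" (t = 0.0)
# and "always 0" (t = 1.0).
thresholds = [make_threshold(k / 10) for k in range(11)]
small_class_risk = empirical_class_risk(thresholds, train)

# The memorising model fits the training data perfectly.
interpolator_risk = empirical_risk(make_lookup(train), train)

print("best threshold rule, training error:", small_class_risk)
print("interpolator, training error:", interpolator_risk)
```

In this toy setting the interpolator's training error is zero even though its true risk is 0.5, which is the gap between training error and model class risk that the abstract asks about.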

Subjects:	Statistics Theory (math.ST); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2502.06765 [math.ST]
	(or arXiv:2502.06765v1 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.2502.06765

Submission history

From: Manuel Müller
[v1] Mon, 10 Feb 2025 18:44:30 UTC (35 KB)
