Computer Science > Information Theory
[Submitted on 17 Nov 2014 (v1), last revised 2 May 2022 (this version, v4)]
Title: Data driven weak universal consistency
Abstract: Many current applications in data science need rich model classes to adequately represent the statistics that may be driving the observations. But rich model classes may be too complex to admit estimators that converge to the truth with convergence rates that can be uniformly bounded over the entire collection of probability distributions comprising the model class, i.e. it may be impossible to guarantee uniform consistency of such estimators as the sample size increases. In such cases, it is conventional to settle for estimators with guarantees on convergence rate where the performance can be bounded in a model-dependent way, i.e. pointwise consistent estimators. But this viewpoint has the serious drawback that estimator performance is a function of the unknown model within the model class that is being estimated, and is therefore unknown. Even if an estimator is consistent, how well it is doing at any given time may not be clear, no matter what the sample size of the observations.
Departing from the classical uniform/pointwise consistency dichotomy that leads to this impasse, a new analysis framework is explored by studying rich model classes that may only admit pointwise consistency guarantees, yet all the information about the unknown model driving the observations that is needed to gauge estimator accuracy can be inferred from the sample at hand. We expect that this data-derived estimation framework will be broadly applicable to a wide range of estimation problems by providing a methodology to deal with much richer model classes. In this paper we analyze the lossless compression problem in detail in this novel data-derived framework.
Submission history
From: Narayana Santhanam
[v1] Mon, 17 Nov 2014 10:02:55 UTC (103 KB)
[v2] Fri, 19 Mar 2021 22:47:44 UTC (90 KB)
[v3] Tue, 25 May 2021 06:46:17 UTC (90 KB)
[v4] Mon, 2 May 2022 22:48:48 UTC (112 KB)