Statistics Theory
Showing new listings for Thursday, 3 April 2025
- [1] arXiv:2504.01113 [pdf, html, other]
Title: Confidence Bands for Multiparameter Persistence Landscapes
Comments: 11 pages, 1 figure
Subjects: Statistics Theory (math.ST); Computational Geometry (cs.CG); Algebraic Topology (math.AT)
Multiparameter persistent homology is a generalization of classical persistent homology, a central and widely-used methodology from topological data analysis, which takes into account density estimation and is an effective tool for data analysis in the presence of noise. Similar to its classical single-parameter counterpart, however, it is challenging to compute and use in practice due to its complex algebraic construction. In this paper, we study a popular and tractable invariant for multiparameter persistent homology in a statistical setting: the multiparameter persistence landscape. We derive a functional central limit theorem for multiparameter persistence landscapes, from which we compute confidence bands, giving rise to one of the first statistical inference methodologies for multiparameter persistence landscapes. We provide an implementation of confidence bands and demonstrate their application in a machine learning task on synthetic data.
- [2] arXiv:2504.01247 [pdf, html, other]
Title: On spectral gap decomposition for Markov chains
Subjects: Statistics Theory (math.ST); Probability (math.PR)
Multiple works regarding convergence analysis of Markov chains have led to spectral gap decomposition formulas of the form \[ \mathrm{Gap}(S) \geq c_0 \left[\inf_z \mathrm{Gap}(Q_z)\right] \mathrm{Gap}(\bar{S}), \] where $c_0$ is a constant, $\mathrm{Gap}$ denotes the right spectral gap of a reversible Markov operator, $S$ is the Markov transition kernel (Mtk) of interest, $\bar{S}$ is an idealized or simplified version of $S$, and $\{Q_z\}$ is a collection of Mtks characterizing the differences between $S$ and $\bar{S}$.
This type of relationship has been established in various contexts, including: 1. decomposition of Markov chains based on a finite cover of the state space, 2. hybrid Gibbs samplers, and 3. spectral independence and localization schemes.
We show that multiple key decomposition results across these domains can be connected within a unified framework, rooted in a simple sandwich structure of $S$. Within the general framework, we establish new instances of spectral gap decomposition for hybrid hit-and-run samplers and hybrid data augmentation algorithms with two intractable conditional distributions. Additionally, we explore several other properties of the sandwich structure, and derive extensions of the spectral gap decomposition formula.
- [3] arXiv:2504.01318 [pdf, other]
Title: Tail Bounds for Canonical $U$-Statistics and $U$-Processes with Unbounded Kernels
Comments: This is a slightly edited version of the 2018 draft available at this https URL. The current version includes an improved result (Theorem 1). More revisions to follow in the coming months
Subjects: Statistics Theory (math.ST); Probability (math.PR)
In this paper, we prove exponential tail bounds for canonical (or degenerate) $U$-statistics and $U$-processes under exponential-type tail assumptions on the kernels. Most existing results in the relevant literature assume bounded kernels or obtain sub-optimal tail behavior under unbounded kernels. We obtain sharp rates and optimal tail behavior under sub-Weibull kernel functions. Some examples from the nonparametric and semiparametric statistics literature are considered.
- [4] arXiv:2504.01535 [pdf, html, other]
Title: On Robust Empirical Likelihood for Nonparametric Regression with Application to Regression Discontinuity Designs
Comments: 35 pages, 2 figures, 5 tables
Subjects: Statistics Theory (math.ST); Econometrics (econ.EM); Methodology (stat.ME)
Empirical likelihood serves as a powerful tool for constructing confidence intervals in nonparametric regression and regression discontinuity designs (RDD). The original empirical likelihood framework can be naturally extended to these settings using local linear smoothers, with Wilks' theorem holding only when an undersmoothed bandwidth is selected. However, the generalization of bias-corrected versions of empirical likelihood under more realistic conditions is non-trivial and has remained an open challenge in the literature. This paper provides a satisfactory solution by proposing a novel approach, referred to as robust empirical likelihood, designed for nonparametric regression and RDD. The core idea is to construct robust weights which simultaneously achieve bias correction and account for the additional variability introduced by the estimated bias, thereby enabling valid confidence interval construction without extra estimation steps. We demonstrate that Wilks' phenomenon still holds under weaker conditions in nonparametric regression and in sharp and fuzzy RDD settings. Extensive simulation studies confirm the effectiveness of our proposed approach, showing superior performance over existing methods in terms of coverage probabilities and interval lengths. Moreover, the proposed procedure exhibits robustness to bandwidth selection, making it a flexible and reliable tool for empirical analyses. The practical usefulness is further illustrated through applications to two real datasets.
- [5] arXiv:2504.01562 [pdf, html, other]
Title: Asymptotic analysis of the finite predictor for the fractional Gaussian noise
Subjects: Statistics Theory (math.ST); Probability (math.PR)
The goal of this paper is to propose a new approach to asymptotic analysis of the finite predictor for stationary sequences. It produces the exact asymptotics of the relative prediction error and the partial correlation coefficients. The assumptions are analytic in nature and applicable to processes with long-range dependence. The ARIMA-type process driven by the fractional Gaussian noise (fGn), which previously remained elusive, serves as our study case.
- [6] arXiv:2504.01781 [pdf, html, other]
Title: Proper scoring rules for estimation and forecast evaluation
Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)
Proper scoring rules have been a subject of growing interest in recent years, not only as tools for evaluation of probabilistic forecasts but also as methods for estimating probability distributions. In this article, we review the mathematical foundations of proper scoring rules including general characterization results and important families of scoring rules. We discuss their role in statistics and machine learning for estimation and forecast evaluation. Furthermore, we comment on interesting developments of their usage in applications.
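As a rough illustration of what "proper" means here (this sketch is not taken from the article): a negatively oriented scoring rule is proper when its expected value under the true distribution is minimized by forecasting that distribution. The snippet below checks this numerically for a binary outcome, using the Brier and logarithmic scores, and contrasts them with absolute error, which is not proper. The function names, the grid search, and the value `p_true = 0.3` are all assumptions made for the example.

```python
import math

def brier(q, y):
    """Brier (quadratic) score for forecast probability q and outcome y in {0, 1}."""
    return (q - y) ** 2

def log_score(q, y):
    """Logarithmic score (negative log-likelihood of the observed outcome)."""
    return -math.log(q if y == 1 else 1.0 - q)

def absolute_error(q, y):
    """Absolute error; included as an example of a rule that is NOT proper."""
    return abs(q - y)

def expected_score(score, q, p):
    """Expected score of forecast q when the outcome is Bernoulli(p)."""
    return p * score(q, 1) + (1.0 - p) * score(q, 0)

def best_forecast(score, p, grid_size=999):
    """Grid search over q in (0, 1) for the forecast minimizing the expected score."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda q: expected_score(score, q, p))

p_true = 0.3  # hypothetical true event probability
for name, rule in [("Brier", brier), ("log", log_score), ("absolute", absolute_error)]:
    print(name, best_forecast(rule, p_true))
```

Under the Brier and logarithmic scores the minimizing forecast coincides with `p_true`, whereas absolute error pushes the forecast toward the boundary of the grid, so an honest forecaster is not rewarded.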
- [7] arXiv:2504.01836 [pdf, html, other]
Title: Estimating hazard rates from $\delta$-records in discrete distributions
Subjects: Statistics Theory (math.ST)
This paper focuses on nonparametric statistical inference of the hazard rate function of discrete distributions based on $\delta$-record data. We derive the explicit expression of the maximum likelihood estimator and determine its exact distribution, as well as some important characteristics such as its bias and mean squared error. We then discuss the construction of confidence intervals and goodness-of-fit tests. The performance of our proposals is evaluated using simulation methods. Applications to real data are also given. The estimation of the hazard rate function based on usual records has been studied in the literature, although many procedures require several samples of records. In contrast, our approach relies on a single sequence of $\delta$-records, simplifying the experimental design and increasing the applicability of the methods.
New submissions (showing 7 of 7 entries)
- [8] arXiv:2504.01837 (cross-list from cs.IT) [pdf, html, other]
Title: Cramér--Rao Inequalities for Several Generalized Fisher Information
Comments: 27 pages
Subjects: Information Theory (cs.IT); Probability (math.PR); Statistics Theory (math.ST)
The de Bruijn identity states that Fisher information is half of the derivative of Shannon differential entropy along heat flow. In the same spirit, in this paper we introduce a generalized version of Fisher information, named the Rényi--Fisher information, which is half of the derivative of Rényi information along heat flow. Based on this Rényi--Fisher information, we establish sharp Rényi-entropic isoperimetric inequalities, which generalize the classic entropic isoperimetric inequality to the Rényi setting. Utilizing these isoperimetric inequalities, we extend the classical Cramér--Rao inequality from Fisher information to Rényi--Fisher information. Lastly, we use these generalized Cramér--Rao inequalities to determine the signs of derivatives of entropy along heat flow, strengthening existing results on the complete monotonicity of entropy.
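For reference, the classical de Bruijn identity mentioned at the start of the abstract can be written as follows. The notation is an assumption chosen for this note and is not necessarily the paper's: $X$ is a random vector on $\mathbb{R}^n$ with finite variance, $Z$ is a standard Gaussian vector independent of $X$, $h$ denotes Shannon differential entropy, and $J$ denotes Fisher information. For $t > 0$, \[ \frac{d}{dt}\, h\big(X + \sqrt{t}\, Z\big) = \frac{1}{2}\, J\big(X + \sqrt{t}\, Z\big), \qquad J(Y) = \int_{\mathbb{R}^n} \frac{|\nabla f_Y(y)|^2}{f_Y(y)}\, dy, \] where $f_Y$ is the density of $Y$. The density of $X + \sqrt{t}\, Z$ solves the heat equation $\partial_t f = \tfrac{1}{2} \Delta f$, which is the sense in which the derivative is taken "along heat flow".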
Cross submissions (showing 1 of 1 entries)
- [9] arXiv:2408.06103 (replaced) [pdf, other]
Title: Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics
Comments: 21 figures, 8 tables
Subjects: Statistics Theory (math.ST); Econometrics (econ.EM); Methodology (stat.ME); Machine Learning (stat.ML)
In this paper, we consider the estimation of regression coefficients and the signal-to-noise ratio (SNR) in high-dimensional Generalized Linear Models (GLMs), and explore their implications in inferring popular estimands such as average treatment effects in high-dimensional observational studies. Under the "proportional asymptotic" regime and Gaussian covariates with known (population) covariance $\Sigma$, we derive Consistent and Asymptotically Normal (CAN) estimators of our targets of inference through Method-of-Moments type estimators that bypass estimation of high-dimensional nuisance functions and hyperparameter tuning altogether. Additionally, under non-Gaussian covariates, we demonstrate universality of our results under certain additional assumptions on the regression coefficients and $\Sigma$. We also demonstrate that knowing $\Sigma$ is not essential to our proposed methodology when the sample covariance matrix estimator is invertible. Finally, we complement our theoretical results with numerical experiments and comparisons with existing literature.
- [10] arXiv:2502.15752 (replaced) [pdf, other]
Title: Universality of High-Dimensional Logistic Regression and a Novel CGMT under Dependence with Applications to Data Augmentation
Comments: Added extensions to m-dependence and mixing
Subjects: Statistics Theory (math.ST); Machine Learning (stat.ML)
Over the last decade, a wave of research has characterized the exact asymptotic risk of many high-dimensional models in the proportional regime. Two foundational results have driven this progress: Gaussian universality, which shows that the asymptotic risk of estimators trained on non-Gaussian and Gaussian data is equivalent, and the convex Gaussian min-max theorem (CGMT), which characterizes the risk under Gaussian settings. However, these results rely on the assumption that the data consists of independent random vectors, an assumption that significantly limits their applicability to many practical setups. In this paper, we address this limitation by generalizing both results to the dependent setting. More precisely, we prove that Gaussian universality still holds for high-dimensional logistic regression under block dependence, $m$-dependence and special cases of mixing, and establish a novel CGMT framework that accommodates correlation across both the covariates and observations. Using these results, we establish the impact of data augmentation, a widespread practice in deep learning, on the asymptotic risk.
- [11] arXiv:2408.07379 (replaced) [pdf, html, other]
Title: Posterior Covariance Structures in Gaussian Processes
Comments: 28 pages
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Numerical Analysis (math.NA); Statistics Theory (math.ST)
In this paper, we present a comprehensive analysis of the posterior covariance field in Gaussian processes, with applications to the posterior covariance matrix. The analysis is based on the Gaussian prior covariance but the approach also applies to other covariance kernels. Our geometric analysis reveals how the Gaussian kernel's bandwidth parameter and the spatial distribution of the observations influence the posterior covariance as well as the corresponding covariance matrix, enabling straightforward identification of areas with high or low covariance in magnitude. Drawing inspiration from the a posteriori error estimation techniques in adaptive finite element methods, we also propose several estimators to efficiently measure the absolute posterior covariance field, which can be used for efficient covariance matrix approximation and preconditioning. We conduct a wide range of experiments to illustrate our theoretical findings and their practical applications.
- [12] arXiv:2411.09516 (replaced) [pdf, html, other]
Title: Sharp Matrix Empirical Bernstein Inequalities
Subjects: Probability (math.PR); Functional Analysis (math.FA); Statistics Theory (math.ST); Machine Learning (stat.ML)
We present two sharp, closed-form empirical Bernstein inequalities for symmetric random matrices with bounded eigenvalues. By sharp, we mean that both inequalities adapt to the unknown variance in a tight manner: the deviation captured by the first-order $1/\sqrt{n}$ term asymptotically matches the matrix Bernstein inequality exactly, including constants, the latter requiring knowledge of the variance. Our first inequality holds for the sample mean of independent matrices, and our second inequality holds for a mean estimator under martingale dependence at stopping times.