Statistical Inference for Privatized Data with Unknown Sample Size

Awan, Jordan; Barrientos, Andres Felipe; Ju, Nianqiao

arXiv:2406.06231 (math)

[Submitted on 10 Jun 2024]

Abstract: We develop both theory and algorithms to analyze privatized data in the unbounded differential privacy (DP) setting, where even the sample size is considered a sensitive quantity that requires privacy protection. We show that the distance between the sampling distributions under unbounded DP and bounded DP goes to zero as the sample size $n$ goes to infinity, provided that the noise used to privatize $n$ scales at an appropriate rate; we also establish that ABC-type posterior distributions converge under similar assumptions. We further give asymptotic results in the regime where the privacy budget for $n$ goes to zero, establishing similarity of the sampling distributions and showing that the MLE in the unbounded setting converges to the bounded-DP MLE. To facilitate valid, finite-sample Bayesian inference on privatized data in the unbounded DP setting, we propose a reversible jump MCMC algorithm that extends the data augmentation MCMC of Ju et al. (2022). We also propose a Monte Carlo EM algorithm to compute the MLE from privatized data under both bounded and unbounded DP. We apply our methodology to a linear regression model and to the 2019 American Time Use Survey Microdata File, which we model using a Dirichlet distribution.
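To make the unbounded-DP setting concrete, the following is a minimal sketch of a release in which the sample size itself is privatized. It is an illustration only and not the authors' method: the Laplace mechanism, the clamping of values to [0, 1], and the names `laplace_noise`, `privatize_count_and_sum`, `eps_n`, and `eps_s` are all assumptions made for this example. The abstract does not specify a privacy mechanism.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws is Laplace(0, 1);
    # multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def privatize_count_and_sum(x, eps_n, eps_s, rng):
    """Release a noisy sample size and a noisy sum (hypothetical helper).

    Assumption: neighboring datasets differ by adding or removing one
    record (unbounded DP), so the count has sensitivity 1 and the sum of
    values clamped to [0, 1] also has sensitivity 1.
    """
    clamped = [min(max(float(v), 0.0), 1.0) for v in x]
    n_dp = len(clamped) + laplace_noise(1.0 / eps_n, rng)  # privatized n
    s_dp = sum(clamped) + laplace_noise(1.0 / eps_s, rng)  # privatized sum
    return n_dp, s_dp

rng = random.Random(0)
data = [rng.random() for _ in range(1000)]
n_dp, s_dp = privatize_count_and_sum(data, eps_n=0.1, eps_s=1.0, rng=rng)

# Naive plug-in estimate of the mean; it ignores the uncertainty in n_dp.
naive_mean = s_dp / max(n_dp, 1.0)
print(n_dp, s_dp, naive_mean)
```

By basic composition this release spends a total budget of `eps_n + eps_s`. The plug-in ratio at the end treats the noisy `n_dp` as if it were exact; accounting for that extra uncertainty is the inference problem that the reversible jump MCMC and Monte Carlo EM algorithms described in the abstract are meant to address.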

Comments:	20 pages before references, 40 pages in total, 4 figures, 3 tables
Subjects:	Statistics Theory (math.ST); Cryptography and Security (cs.CR); Computation (stat.CO)
Cite as:	arXiv:2406.06231 [math.ST]
	(or arXiv:2406.06231v1 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.2406.06231

Submission history

From: Jordan Awan
[v1] Mon, 10 Jun 2024 13:03:20 UTC (297 KB)
