Data enriched linear regression

Chen, Aiyou; Owen, Art B.; Shi, Minghui

arXiv:1304.1837 (stat)

[Submitted on 6 Apr 2013 (v1), last revised 17 Dec 2014 (this version, v3)]

Abstract: We present a linear regression method for predictions on a small data set that makes use of a second, possibly biased data set that may be much larger. Our method fits linear regressions to the two data sets while penalizing the difference between the predictions made by those two models. The resulting algorithm is a shrinkage method similar to those used in small area estimation. We obtain a Stein-type result for Gaussian responses: when the model has 5 or more coefficients and 10 or more error degrees of freedom, it becomes inadmissible to use only the small data set, no matter how large the bias is. We also present both plug-in and AICc-based methods to tune our penalty parameter. Most of our results use an $L_2$ penalty, but we obtain formulas for $L_1$ penalized estimates when the model is specialized to the location setting. Ordinary Stein shrinkage provides an inadmissibility result for only 3 or more coefficients, but we find that our shrinkage method typically produces much lower squared errors in as few as 5 or 10 dimensions when the bias is small and essentially equivalent squared errors when the bias is large.
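The idea of fitting two regressions while penalizing the difference between their predictions can be illustrated with a minimal sketch. It is not taken from the paper: the abstract does not spell out the objective, so the code assumes one plausible form, in which the small data set follows $y_S \approx X_S\beta$, the large one follows $y_B \approx X_B(\beta+\gamma)$, and the $L_2$ penalty is $\lambda\|X_S\gamma\|^2$, the squared difference between the two models' predictions at the small data set's design points. The function name `data_enriched_fit` and all variable names are hypothetical.

```python
import numpy as np


def data_enriched_fit(X_s, y_s, X_b, y_b, lam):
    """Sketch of an L2-penalized joint fit (assumed objective, see lead-in).

    Minimizes over (beta, gamma):
        ||y_s - X_s beta||^2
      + ||y_b - X_b (beta + gamma)||^2
      + lam * ||X_s gamma||^2
    by stacking the three terms into one ordinary least-squares problem.
    """
    n_s, d = X_s.shape
    # Block design matrix acting on the stacked coefficient vector [beta; gamma].
    A = np.vstack([
        np.hstack([X_s, np.zeros((n_s, d))]),               # small-data residuals
        np.hstack([X_b, X_b]),                              # large-data residuals
        np.hstack([np.zeros((n_s, d)), np.sqrt(lam) * X_s]),  # penalty rows
    ])
    rhs = np.concatenate([y_s, y_b, np.zeros(n_s)])
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    beta, gamma = coef[:d], coef[d:]
    return beta, gamma


if __name__ == "__main__":
    # Synthetic illustration only: a small sample and a larger, biased one.
    rng = np.random.default_rng(0)
    d, n_s, n_b = 5, 30, 500
    beta_true = rng.normal(size=d)
    bias = 0.3 * rng.normal(size=d)
    X_s = rng.normal(size=(n_s, d))
    X_b = rng.normal(size=(n_b, d))
    y_s = X_s @ beta_true + rng.normal(size=n_s)
    y_b = X_b @ (beta_true + bias) + rng.normal(size=n_b)
    for lam in (0.0, 1.0, 100.0):
        beta_hat, _ = data_enriched_fit(X_s, y_s, X_b, y_b, lam)
        print(lam, np.sum((beta_hat - beta_true) ** 2))
```

Under this assumed objective, $\lambda = 0$ leaves $\gamma$ unconstrained, so the estimate of $\beta$ reduces to ordinary least squares on the small data set alone, while a very large $\lambda$ forces $\gamma$ toward zero and approaches least squares on the pooled data. Intermediate values shrink between these two extremes; choosing $\lambda$ is the tuning problem that the plug-in and AICc-based methods in the abstract address, and is not attempted here.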

Subjects:	Methodology (stat.ME)
Cite as:	arXiv:1304.1837 [stat.ME]
	(or arXiv:1304.1837v3 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1304.1837

Submission history

From: Art Owen
[v1] Sat, 6 Apr 2013 00:20:53 UTC (136 KB)
[v2] Tue, 9 Dec 2014 22:12:22 UTC (67 KB)
[v3] Wed, 17 Dec 2014 21:58:53 UTC (67 KB)
