A weakly informative default prior distribution for logistic and other regression models

Gelman, Andrew; Jakulin, Aleks; Pittau, Maria Grazia; Su, Yu-Sung

doi:10.1214/08-AOAS191

arXiv:0901.4011 (stat)

[Submitted on 26 Jan 2009]

Abstract: We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-$t$ prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors. We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small), and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. We implement a procedure to fit generalized linear models in R with the Student-$t$ prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several applications, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.
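The recipe in the abstract can be illustrated with a minimal sketch. It rescales one nonbinary input to mean 0 and standard deviation 0.5, places a Cauchy prior with center 0 and scale 2.5 on its coefficient, and finds the posterior mode on a toy data set with complete separation. Several things here are assumptions of the sketch rather than details taken from the abstract: the function names `rescale` and `fit_cauchy_logistic`, the toy data, the wider Cauchy scale of 10 on the intercept, and the use of plain gradient ascent in place of the approximate EM step inside iteratively weighted least squares that the authors describe.

```python
import math
import statistics


def rescale(values):
    """Shift and scale a nonbinary input to mean 0 and standard deviation 0.5."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / (2.0 * sd) for v in values]


def fit_cauchy_logistic(z, y, coef_scale=2.5, intercept_scale=10.0,
                        step=0.1, iterations=5000):
    """Posterior mode of (intercept, slope) under independent Cauchy priors.

    Uses plain gradient ascent on the log posterior (an assumption of this
    sketch, not the authors' EM-within-IWLS procedure).
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of the log Cauchy(0, s) density at t is -2t / (s^2 + t^2).
        grad_a = -2.0 * a / (intercept_scale ** 2 + a ** 2)
        grad_b = -2.0 * b / (coef_scale ** 2 + b ** 2)
        # Gradient of the Bernoulli log likelihood with a logit link.
        for zi, yi in zip(z, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * zi)))
            grad_a += yi - p
            grad_b += (yi - p) * zi
        a += step * grad_a
        b += step * grad_b
    return a, b


# Hypothetical toy data with complete separation: y = 1 exactly when x > 0.
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [0, 0, 0, 1, 1, 1]

intercept, slope = fit_cauchy_logistic(rescale(x), y)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

On separated data like this the maximum likelihood slope diverges, whereas the prior keeps the posterior mode finite, which is the "always giving answers" property the abstract refers to. Because the Cauchy prior is not log-concave, gradient ascent from a single starting point is only a reasonable choice for a small, well-behaved example such as this one.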

Comments:	Published in the Annals of Applied Statistics by the Institute of Mathematical Statistics
Subjects:	Applications (stat.AP)
Report number:	IMS-AOAS-AOAS191
Cite as:	arXiv:0901.4011 [stat.AP]
	(or arXiv:0901.4011v1 [stat.AP] for this version)
	https://doi.org/10.48550/arXiv.0901.4011
Journal reference:	Annals of Applied Statistics 2008, Vol. 2, No. 4, 1360-1383
Related DOI:	https://doi.org/10.1214/08-AOAS191

Submission history

From: Andrew Gelman [view email] [via VTEX proxy]
[v1] Mon, 26 Jan 2009 14:20:43 UTC (303 KB)
