Statistics > Applications
[Submitted on 1 Nov 2017 (v1), last revised 20 May 2018 (this version, v3)]
Title:Bayesian Variable Selection for Multivariate Zero-Inflated Models: Application to Microbiome Count Data
View PDFAbstract:Microorganisms play critical roles in human health and disease. It is well known that microbes live in diverse communities in which they interact synergistically or antagonistically. Thus for estimating microbial associations with clinical covariates, multivariate statistical models are preferred. Multivariate models allow one to estimate and exploit complex interdependencies among multiple taxa, yielding more powerful tests of exposure or treatment effects than application of taxon-specific univariate analyses. In addition, the analysis of microbial count data requires special attention because data commonly exhibit zero inflation. To meet these needs, we developed a Bayesian variable selection model for multivariate count data with excess zeros that incorporates information on the covariance structure of the outcomes (counts for multiple taxa), while estimating associations with the mean levels of these outcomes. Although there has been a great deal of effort in zero-inflated models for longitudinal data, little attention has been given to high-dimensional multivariate zero-inflated data modeled via a general correlation structure. Through simulation, we compared performance of the proposed method to that of existing univariate approaches, for both the binary and count parts of the model. When outcomes were correlated the proposed variable selection method maintained type I error while boosting the ability to identify true associations in the binary component of the model. For the count part of the model, in some scenarios the the univariate method had higher power than the multivariate approach. This higher power was at a cost of a highly inflated false discovery rate not observed with the proposed multivariate method. We applied the approach to oral microbiome data from the Pediatric HIV/AIDS Cohort Oral Health Study and identified five species (of 44) associated with HIV infection.
Submission history
From: Kyu Ha Lee [view email][v1] Wed, 1 Nov 2017 01:34:02 UTC (3,924 KB)
[v2] Fri, 3 Nov 2017 16:01:08 UTC (3,924 KB)
[v3] Sun, 20 May 2018 16:20:10 UTC (3,923 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.