Open Access
September 2020
A Bayesian model of microbiome data for simultaneous identification of covariate associations and prediction of phenotypic outcomes
Matthew D. Koslovsky, Kristi L. Hoffman, Carrie R. Daniel, Marina Vannucci
Ann. Appl. Stat. 14(3): 1471-1492 (September 2020). DOI: 10.1214/20-AOAS1354

Abstract

One of the major research questions regarding human microbiome studies is the feasibility of designing interventions that modulate the composition of the microbiome to promote health and to cure disease. This requires extensive understanding of the modulating factors of the microbiome, such as dietary intake, as well as the relation between microbial composition and phenotypic outcomes, such as body mass index (BMI). Previous efforts have modeled these data separately, employing two-step approaches that can produce biased interpretations of the results. Here, we propose a Bayesian joint model that simultaneously identifies clinical covariates associated with microbial composition data and predicts a phenotypic response using information contained in the compositional data. Using spike-and-slab priors, our approach can handle high-dimensional compositional as well as clinical data. Additionally, we accommodate the compositional structure of the data via balances and overdispersion typically found in microbial samples. We apply our model to understand the relations between dietary intake, microbial samples and BMI. In this analysis we find numerous associations between microbial taxa and dietary factors that may lead to a microbiome that is generally more hospitable to the development of chronic diseases, such as obesity. Additionally, we demonstrate on simulated data how our method outperforms two-step approaches and also present a sensitivity analysis.
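
As an illustration of the "balances" mentioned above, the following minimal sketch computes a single log-ratio balance between two disjoint groups of taxa, using the standard isometric log-ratio definition (a scaled log-ratio of the groups' geometric means). The function name `balance`, the example counts, and the grouping of taxa are hypothetical, and strictly positive parts are assumed; this is not necessarily the exact construction used in the paper.

```python
import math


def balance(composition, numerator_idx, denominator_idx):
    """Log-ratio balance between two disjoint groups of parts.

    Hypothetical helper for illustration; assumes every part is > 0.
    """
    r, s = len(numerator_idx), len(denominator_idx)
    if r == 0 or s == 0:
        raise ValueError("both groups must be non-empty")
    if set(numerator_idx) & set(denominator_idx):
        raise ValueError("groups must be disjoint")
    # The mean of the logs equals the log of the geometric mean.
    log_gm_num = sum(math.log(composition[i]) for i in numerator_idx) / r
    log_gm_den = sum(math.log(composition[i]) for i in denominator_idx) / s
    # Scaling factor from the isometric log-ratio definition.
    return math.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)


# Made-up counts for four taxa in one sample (illustrative only).
counts = [30, 10, 40, 20]
total = sum(counts)
proportions = [c / total for c in counts]

# Balance of taxa {0, 1} against taxa {2, 3}.
b_counts = balance(counts, [0, 1], [2, 3])
b_props = balance(proportions, [0, 1], [2, 3])
print(b_counts, b_props)
```

Because the balance depends only on ratios between parts, raw counts and their normalized proportions give the same value, which is why such coordinates suit compositional data.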

Citation

Matthew D. Koslovsky, Kristi L. Hoffman, Carrie R. Daniel, Marina Vannucci. "A Bayesian model of microbiome data for simultaneous identification of covariate associations and prediction of phenotypic outcomes." Ann. Appl. Stat. 14 (3) 1471-1492, September 2020. https://doi.org/10.1214/20-AOAS1354

Information

Received: 1 April 2020; Published: September 2020
First available in Project Euclid: 18 September 2020

MathSciNet: MR4152142
Digital Object Identifier: 10.1214/20-AOAS1354

Keywords: Bayesian statistics, joint modeling, multivariate count data, prediction, variable selection

Rights: Copyright © 2020 Institute of Mathematical Statistics

Vol.14 • No. 3 • September 2020