Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 14, Number 1 (2020), 94-115.
Modeling microbial abundances and dysbiosis with beta-binomial regression
Using a sample from a population to estimate the proportion of the population with a certain category label is a broadly important problem. In the context of microbiome studies, this problem arises when researchers wish to use a sample from a population of microbes to estimate the population proportion of a particular taxon, known as the taxon’s relative abundance. In this paper, we propose a beta-binomial model for this task. Like existing models, our model allows for a taxon’s relative abundance to be associated with covariates of interest. However, unlike existing models, our proposal also allows for the overdispersion in the taxon’s counts to be associated with covariates of interest. We exploit this model in order to propose tests not only for differential relative abundance, but also for differential variability. The latter is particularly valuable in light of speculation that dysbiosis, the perturbation from a normal microbiome that can occur in certain disease conditions, may manifest as a loss of stability, or increase in variability, of the counts associated with each taxon. We demonstrate the performance of our proposed model using a simulation study and an application to soil microbial data.
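As a rough illustration of the kind of model the abstract describes, the following Python sketch evaluates a beta-binomial log-likelihood in which both the relative abundance and the overdispersion depend on covariates. The abstract does not give the parameterization, so several details here are assumptions: the mean mu = a1 / (a1 + a2), the overdispersion phi = 1 / (a1 + a2 + 1), and logit links for both. The function and argument names (beta_binomial_logpmf, log_likelihood, beta, beta_star, X, X_star) are hypothetical and are not taken from the corncob package.

```python
import math


def inv_logit(x):
    """Map a real-valued linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(w, m, mu, phi):
    """Log P(W = w) for a taxon count w out of m total reads.

    Assumed parameterization (not stated in the abstract):
      mu  = a1 / (a1 + a2)       relative abundance, in (0, 1)
      phi = 1 / (a1 + a2 + 1)    overdispersion, in (0, 1)
    """
    s = (1.0 - phi) / phi          # a1 + a2
    a1 = mu * s
    a2 = (1.0 - mu) * s
    log_choose = (math.lgamma(m + 1) - math.lgamma(w + 1)
                  - math.lgamma(m - w + 1))
    return log_choose + log_beta(w + a1, m - w + a2) - log_beta(a1, a2)


def log_likelihood(beta, beta_star, X, X_star, W, M):
    """Sum of per-sample log-probabilities.

    beta      coefficients linking covariates X to relative abundance
    beta_star coefficients linking covariates X_star to overdispersion
    Both links are assumed to be logit links.
    """
    total = 0.0
    for x, xs, w, m in zip(X, X_star, W, M):
        mu = inv_logit(sum(b * v for b, v in zip(beta, x)))
        phi = inv_logit(sum(b * v for b, v in zip(beta_star, xs)))
        total += beta_binomial_logpmf(w, m, mu, phi)
    return total


# Made-up toy data: intercept plus one binary covariate for both parameters.
X = [[1, 0], [1, 0], [1, 1], [1, 1]]
X_star = X
W = [12, 15, 40, 3]          # counts of the taxon
M = [100, 100, 100, 100]     # total reads per sample
print(log_likelihood([-2.0, 0.5], [-3.0, 1.0], X, X_star, W, M))
```

Under these assumptions, a test for differential relative abundance would compare fits with and without the covariate in beta, while a test for differential variability would do the same for beta_star. Maximizing this log-likelihood would require a numerical optimizer, which is omitted here.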
Received: January 2019
Revised: June 2019
First available in Project Euclid: 16 April 2020
Martin, Bryan D.; Witten, Daniela; Willis, Amy D. Modeling microbial abundances and dysbiosis with beta-binomial regression. Ann. Appl. Stat. 14 (2020), no. 1, 94-115. doi:10.1214/19-AOAS1283. https://projecteuclid.org/euclid.aoas/1587002666
- Supplement A: corncob R package. We provide an R package implementing all methods proposed in this paper.
- Supplement B: Figure code. We provide code to reproduce all simulations and data analyses in this paper.