Testing for differential abundance in compositional counts data, with application to microbiome studies
Barak Brill, Amnon Amir, Ruth Heller
Ann. Appl. Stat. 16(4): 2648-2671 (December 2022). DOI: 10.1214/22-AOAS1607

Abstract

Identifying which taxa in our microbiota are associated with traits of interest is important for advancing science and health. However, the identification is challenging because the measured vector of taxa counts (by amplicon sequencing) is compositional, so a change in the abundance of one taxon in the microbiota induces a change in the number of sequenced counts across all taxa. The data are typically sparse, with many zero counts present either due to biological variance or limited sequencing depth. We examine the case of Crohn’s disease, where the microbial load changes substantially with the disease. For this representative example of a highly compositional setting, we show existing methods designed to identify differentially abundant taxa may have an inflated number of false positives. We introduce a novel nonparametric approach that provides valid inference, even when the fraction of zero counts is substantial. Our approach uses a set of reference taxa that are nondifferentially abundant which can be estimated from the data or from outside information. Our approach also allows for a novel type of testing: multivariate tests of differential abundance over a focused subset of the taxa. Genera-level multivariate testing discovers additional genera as differentially abundant by avoiding agglomeration of taxa.
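The abstract does not spell out the test itself, so the following is only a minimal sketch of one way a reference-based nonparametric test could look, not the authors' implementation. It assumes that, for each sample, the reads of the tested taxon are pooled with the reads of the reference taxa, the pool is rarefied (subsampled without replacement) to a common depth, and the rarefied taxon counts are compared between two groups with a rank-based permutation test. All function names and the toy counts are hypothetical.

```python
import random


def rarefy_taxon(taxon_count, ref_count, depth, rng):
    """Draw `depth` reads without replacement from the pooled taxon and
    reference reads of one sample; return how many belong to the taxon."""
    total = taxon_count + ref_count
    picks = rng.sample(range(total), depth)
    # Read indices below `taxon_count` stand for reads of the tested taxon.
    return sum(1 for p in picks if p < taxon_count)


def midranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def reference_based_pvalue(taxon_counts, ref_counts, labels, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for one taxon.

    Assumption (not taken from the source): labels are 0/1 group indicators
    and the common rarefaction depth is the smallest pooled read count.
    """
    rng = random.Random(seed)
    depth = min(t + r for t, r in zip(taxon_counts, ref_counts))
    rarefied = [rarefy_taxon(t, r, depth, rng)
                for t, r in zip(taxon_counts, ref_counts)]
    ranks = midranks(rarefied)
    n, n1 = len(labels), sum(labels)
    expected = n1 * (n + 1) / 2  # null mean of the rank sum in group 1

    def deviation(groups):
        rank_sum = sum(r for r, g in zip(ranks, groups) if g == 1)
        return abs(rank_sum - expected)

    observed = deviation(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if deviation(shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the Monte Carlo p-value valid.
    return (1 + hits) / (1 + n_perm)


# Toy data (made up): the taxon is far more abundant in group 1.
labels = [0] * 6 + [1] * 6
taxon = [5, 6, 7, 8, 9, 10, 200, 220, 240, 260, 280, 300]
reference = [1000] * 12
p_value = reference_based_pvalue(taxon, reference, labels)
```

Because only the tested taxon and the reference taxa enter the rarefaction step, a change in some other taxon does not alter the rarefied counts in this sketch, which is the compositional problem the abstract describes. Zero counts need no special handling here, since the comparison is based on ranks.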

Funding Statement

The work of Barak Brill and Ruth Heller was funded by ISF Grant 1049/16.

Acknowledgments

The authors would like to thank the anonymous referees, an Associate Editor, and the Editor for their constructive comments that improved the quality of this paper.

Citation


Barak Brill, Amnon Amir, Ruth Heller. "Testing for differential abundance in compositional counts data, with application to microbiome studies." Ann. Appl. Stat. 16 (4) 2648-2671, December 2022. https://doi.org/10.1214/22-AOAS1607

Information

Received: 1 August 2020; Revised: 1 January 2022; Published: December 2022
First available in Project Euclid: 26 September 2022

MathSciNet: MR4489227
zbMATH: 1498.62194
Digital Object Identifier: 10.1214/22-AOAS1607

Keywords: analysis of composition, compositional bias, nonparametric tests, normalization, rarefaction

Rights: Copyright © 2022 Institute of Mathematical Statistics

JOURNAL ARTICLE
24 PAGES
