June 2023 Truncated rank-based tests for two-part models with excessive zeros and applications to microbiome data
Wanjie Wang, Eric Chen, Hongzhe Li
Ann. Appl. Stat. 17(2): 1663-1680 (June 2023). DOI: 10.1214/22-AOAS1688

Abstract

High-throughput sequencing technology allows us to test the compositional difference of bacteria in different populations. One important feature of human microbiome data is that it often includes a large number of zeros. Such data can be treated as being generated from a two-part model that includes a zero-point mass. Motivated by analysis of such nonnegative data with excessive zeros, we introduce several truncated rank-based two-group and multigroup tests, including a truncated rank-based Wilcoxon rank-sum test for two-group comparison and two truncated Kruskal–Wallis tests for multigroup comparisons. We show, both analytically through asymptotic relative efficiency analysis and by simulations, that the proposed tests have higher power than the standard rank-based tests in typical microbiome data settings, especially when the proportion of zeros in the data is high. The tests can also be applied to repeated measurements of compositional data via simple within-subject permutations. In a simple before-and-after treatment experiment, the within-subject permutation is similar to the paired rank test. However, the proposed tests handle the excessive zeros, which leads to better power. We apply the tests to compare the microbiome compositions of healthy children and pediatric Crohn’s disease patients and to assess the treatment effects on microbiome compositions. We identify several bacterial genera that are missed by the standard rank-based tests.
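
The truncation idea for the two-group case can be illustrated with a minimal sketch. This is not the authors' statistic: the abstract does not state the truncation rule, so the rule used here (keep the largest observations in each group, with the kept fraction set to the larger of the two nonzero proportions, so that roughly the same proportion of zeros is discarded from both groups) is an assumption, and a label-permutation null stands in for the asymptotic theory. The function names `midranks`, `truncated_rank_sum`, and `permutation_pvalue` are hypothetical.

```python
import random


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def truncated_rank_sum(x, y):
    """Centered rank sum of group x after truncating zeros (assumed rule)."""
    # Assumed rule: kept fraction = larger nonzero proportion of the two groups.
    p = max(sum(v > 0 for v in x) / len(x), sum(v > 0 for v in y) / len(y))
    kx, ky = round(p * len(x)), round(p * len(y))
    tx = sorted(x, reverse=True)[:kx]  # largest kx values of x
    ty = sorted(y, reverse=True)[:ky]  # largest ky values of y
    ranks = midranks(tx + ty)
    # Subtract the expected rank sum so the statistic is centered at zero.
    return sum(ranks[:kx]) - kx * (kx + ky + 1) / 2


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided p-value obtained by permuting group labels."""
    rng = random.Random(seed)
    observed = abs(truncated_rank_sum(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(truncated_rank_sum(pooled[:len(x)], pooled[len(x):]))
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the truncation is recomputed inside every permutation, the number of retained observations can change from one relabeling to the next, which is why the sketch centers the rank sum before comparing statistics.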

Funding Statement

This research was supported by NIH Grants GM129781 and GM123056.

Acknowledgments

We thank Dr. Kafadar, the Associate Editor, and two reviewers for many helpful comments and suggestions.

Citation

Wanjie Wang, Eric Chen, Hongzhe Li. "Truncated rank-based tests for two-part models with excessive zeros and applications to microbiome data." Ann. Appl. Stat. 17(2): 1663-1680, June 2023. https://doi.org/10.1214/22-AOAS1688

Information

Received: 1 October 2021; Revised: 1 August 2022; Published: June 2023
First available in Project Euclid: 1 May 2023

MathSciNet: MR4582729
zbMATH: 07692399
Digital Object Identifier: 10.1214/22-AOAS1688

Keywords: Asymptotic relative efficiency, differential abundance analysis, truncation, two-part model

Rights: Copyright © 2023 Institute of Mathematical Statistics

JOURNAL ARTICLE
18 PAGES

