Zero-inflated quantile rank-score based test (ZIQRank) with application to scRNA-seq differential gene expression analysis
Wodan Ling, Wenfei Zhang, Bin Cheng, Ying Wei
Ann. Appl. Stat. 15(4): 1673-1696 (December 2021). DOI: 10.1214/21-AOAS1442

Abstract

Differential gene expression analysis based on scRNA-seq data is challenging because of two characteristics of such data. First, multimodality and other heterogeneity of gene expression across cell conditions lead to divergences in the tails, or to crossings, of the expression distributions. Second, scRNA-seq data generally contain a considerable fraction of dropout events, which cause zero inflation in the expression. With respect to the first characteristic, existing parametric approaches that target the mean difference in gene expression are limited, whereas quantile regression, which examines various locations in the distribution, can improve power. However, the second characteristic, zero inflation, makes traditional quantile regression invalid and underpowered. We propose a quantile-based test that handles both characteristics, multimodality and zero inflation, simultaneously. The proposed quantile rank-score based test for differential distribution detection (ZIQRank) is derived under a two-part quantile regression model for zero-inflated outcomes. It comprises a test in a logistic model for the zero counts and a collection of rank-score tests, adjusted for zero inflation, at multiple prespecified quantiles of the positive part. The testing decision is based on an aggregate result obtained by combining the marginal p-values with the MinP or Cauchy procedure. The proposed test is asymptotically justified and is evaluated in simulation studies, where it shows a higher precision-recall AUC in detecting true differentially expressed genes (DEGs) than existing methods. We apply the ZIQRank test to a TPM scRNA-seq dataset on human glioblastoma tumors and exclusively identify a group of DEGs between neoplastic and nonneoplastic cells; these genes are heterogeneous and have been shown to be associated with glioma. An application to a UMI count scRNA-seq dataset on cells from mouse intestinal organoids further demonstrates the capability of ZIQRank to improve on and complement existing approaches.
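
To make the aggregation step concrete, the following is a minimal sketch of a Cauchy combination of marginal p-values, assuming the standard form of that procedure: each p-value is mapped to a Cauchy variate, the variates are averaged with weights, and the result is mapped back to a p-value. It is not the authors' implementation. The function name `cauchy_combine`, the equal default weights, and the clipping constant are illustrative assumptions, and the marginal p-values (one from the logistic part, the rest from the rank-score tests at the chosen quantiles) are assumed to have been computed elsewhere.

```python
import math


def cauchy_combine(pvalues, weights=None, eps=1e-15):
    """Combine marginal p-values with the Cauchy combination rule.

    Hypothetical helper for illustration only; equal weights are an
    assumption, not something specified in the abstract.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0 / k] * k
    if len(weights) != k:
        raise ValueError("weights and pvalues must have the same length")

    # Clip away from 0 and 1 so the tangent transform stays finite.
    clipped = [min(max(p, eps), 1.0 - eps) for p in pvalues]

    # Weighted sum of Cauchy-transformed p-values.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, clipped))

    # Upper tail probability of a standard Cauchy at the statistic.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Made-up marginal p-values, purely for demonstration.
    marginal = [0.01, 0.2, 0.6]
    print(cauchy_combine(marginal))
```

With a single p-value the function returns that p-value unchanged (up to floating-point error), which is a quick sanity check on the transform. The MinP alternative would instead take the smallest marginal p-value and calibrate it against the joint distribution of the marginal tests, which requires information not given in the abstract.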

Acknowledgments

Wenfei Zhang is the corresponding author.

Citation


Wodan Ling, Wenfei Zhang, Bin Cheng, Ying Wei. "Zero-inflated quantile rank-score based test (ZIQRank) with application to scRNA-seq differential gene expression analysis." Ann. Appl. Stat. 15 (4) 1673-1696, December 2021. https://doi.org/10.1214/21-AOAS1442

Information

Received: 1 June 2020; Revised: 1 December 2020; Published: December 2021
First available in Project Euclid: 21 December 2021

MathSciNet: MR4355071
zbMATH: 1498.62233
Digital Object Identifier: 10.1214/21-AOAS1442

Keywords: heterogeneity, multimodality, quantile rank-score based test, two-part model

Rights: Copyright © 2021 Institute of Mathematical Statistics

JOURNAL ARTICLE
24 PAGES
