Open Access
Data Denoising and Post-Denoising Corrections in Single Cell RNA Sequencing
Divyansh Agarwal, Jingshu Wang, Nancy R. Zhang
Statist. Sci. 35(1): 112-128 (February 2020). DOI: 10.1214/19-STS7560


Single cell sequencing technologies are transforming biomedical research. However, due to the inherent nature of the data, single cell RNA sequencing analysis poses new computational and statistical challenges. We begin with a survey of a selection of topics in this field, with a gentle introduction to the biology and a more detailed exploration of the technical noise. We consider in detail the problem of single cell data denoising, sometimes referred to as “imputation” in the relevant literature. We discuss why this is not a typical statistical imputation problem, and review current approaches to this problem. We then explore why the use of denoised values in downstream analyses invites novel statistical insights, and how denoising uncertainty should be accounted for to yield valid statistical inference. The utilization of denoised or imputed matrices in statistical inference is not unique to single cell genomics, and arises in many other fields. We describe the challenges in this type of analysis, discuss some preliminary solutions, and highlight unresolved issues.



Divyansh Agarwal, Jingshu Wang, Nancy R. Zhang. "Data Denoising and Post-Denoising Corrections in Single Cell RNA Sequencing." Statist. Sci. 35(1): 112-128, February 2020.


Published: February 2020
First available in Project Euclid: 3 March 2020

MathSciNet: MR4071361
Digital Object Identifier: 10.1214/19-STS7560

Keywords: deep learning, empirical Bayes, imputation, post-denoising inference, RNA sequencing, single cell biology

Rights: Copyright © 2020 Institute of Mathematical Statistics
