- Statist. Sci.
- Volume 35, Number 1 (2020), 112-128.
Data Denoising and Post-Denoising Corrections in Single Cell RNA Sequencing
Single cell sequencing technologies are transforming biomedical research. However, due to the inherent nature of the data, single cell RNA sequencing analysis poses new computational and statistical challenges. We begin with a survey of a selection of topics in this field, with a gentle introduction to the biology and a more detailed exploration of the technical noise. We consider in detail the problem of single cell data denoising, sometimes referred to as “imputation” in the relevant literature. We discuss why this is not a typical statistical imputation problem, and review current approaches to this problem. We then explore why the use of denoised values in downstream analyses invites novel statistical insights, and how denoising uncertainty should be accounted for to yield valid statistical inference. The utilization of denoised or imputed matrices in statistical inference is not unique to single cell genomics, and arises in many other fields. We describe the challenges in this type of analysis, discuss some preliminary solutions, and highlight unresolved issues.
First available in Project Euclid: 3 March 2020
Agarwal, Divyansh; Wang, Jingshu; Zhang, Nancy R. Data Denoising and Post-Denoising Corrections in Single Cell RNA Sequencing. Statist. Sci. 35 (2020), no. 1, 112-128. doi:10.1214/19-STS7560. https://projecteuclid.org/euclid.ss/1583226032
- Supplement to “Data Denoising and Post-Denoising Corrections in Single Cell RNA Sequencing”. Supplementary information.