## The Annals of Statistics

- Ann. Statist.
- Volume 45, Number 5 (2017), 1863-1894.

### Confounder adjustment in multiple hypothesis testing

Jingshu Wang, Qingyuan Zhao, Trevor Hastie, and Art B. Owen

#### Abstract

We consider large-scale studies in which thousands of significance tests are performed simultaneously. In some of these studies, the multiple testing procedure can be severely biased by latent confounding factors such as batch effects and unmeasured covariates that correlate with both primary variable(s) of interest (e.g., treatment variable, phenotype) and the outcome. Over the past decade, many statistical methods have been proposed to adjust for the confounders in hypothesis testing. We unify these methods in the same framework, generalize them to include multiple primary variables and multiple nuisance variables, and analyze their statistical properties. In particular, we provide theoretical guarantees for RUV-4 [Gagnon-Bartsch, Jacob and Speed (2013)] and LEAPP [*Ann. Appl. Stat.* **6** (2012) 1664–1688], which correspond to two different identification conditions in the framework: the first requires a set of “negative controls” that are known a priori to follow the null distribution; the second requires the true nonnulls to be sparse. Two different estimators which are based on RUV-4 and LEAPP are then applied to these two scenarios. We show that if the confounding factors are strong, the resulting estimators can be asymptotically as powerful as the oracle estimator which observes the latent confounding factors. For hypothesis testing, we show the asymptotic $z$-tests based on the estimators can control the type I error. Numerical experiments show that the false discovery rate is also controlled by the Benjamini–Hochberg procedure when the sample size is reasonably large.
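The final testing step mentioned in the abstract, two-sided $z$-tests followed by the Benjamini–Hochberg procedure, can be illustrated with a minimal sketch. It does not implement the confounder adjustment itself (RUV-4 or LEAPP). It assumes the $z$-statistics have already been computed from some adjusted estimator. The function names `two_sided_p_value` and `benjamini_hochberg` and the values in `z_stats` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value of a z-statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis whose rank is at most the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Made-up z-statistics standing in for confounder-adjusted test statistics.
z_stats = [4.2, -3.9, 0.3, -1.1, 2.0, 0.7, -0.2, 3.5]
p_vals = [two_sided_p_value(z) for z in z_stats]
rejections = benjamini_hochberg(p_vals, alpha=0.05)
print(rejections)
```

With these illustrative inputs, only the three largest statistics in absolute value are rejected at level 0.05; the statistic 2.0 would pass an unadjusted 5% test but fails the step-up threshold.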

#### Article information

**Source**

Ann. Statist., Volume 45, Number 5 (2017), 1863-1894.

**Dates**

Received: August 2015

Revised: January 2016

First available in Project Euclid: 31 October 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1509436821

**Digital Object Identifier**

doi:10.1214/16-AOS1511

**Mathematical Reviews number (MathSciNet)**

MR3718155

**Zentralblatt MATH identifier**

06821112

**Subjects**

Primary: 62J15: Paired and multiple comparisons

Secondary: 62H25: Factor analysis and principal components; correspondence analysis

**Keywords**

Empirical null; surrogate variable analysis; unwanted variation; batch effect; robust regression

#### Citation

Wang, Jingshu; Zhao, Qingyuan; Hastie, Trevor; Owen, Art B. Confounder adjustment in multiple hypothesis testing. Ann. Statist. 45 (2017), no. 5, 1863–1894. doi:10.1214/16-AOS1511. https://projecteuclid.org/euclid.aos/1509436821

#### Supplemental materials

- Supplement to “Confounder adjustment in multiple hypothesis testing”. We provide detailed proofs of the theoretical results in this paper and some additional numerical results. Digital Object Identifier: doi:10.1214/16-AOS1511SUPP