Open Access
September 2016
Multiple testing under dependence via graphical models
Jie Liu, Chunming Zhang, David Page
Ann. Appl. Stat. 10(3): 1699-1724 (September 2016). DOI: 10.1214/16-AOAS956

Abstract

Large-scale multiple testing tasks often exhibit dependence. Leveraging the dependence between individual tests remains a challenging and important problem in statistics. With recent advances in graphical models, it is feasible to use them to capture the dependence among multiple hypotheses. We propose a multiple testing procedure based on a Markov-random-field-coupled mixture model. The underlying true states of the hypotheses are represented by a latent binary Markov random field, and the observed test statistics appear as the coupled mixture variables. The model can be learned by a novel EM algorithm. The next step is to infer the posterior probability that each hypothesis is null (termed the local index of significance), and the false discovery rate can be controlled accordingly. We also provide a semiparametric variation of the graphical model, which is useful when $f_{1}$ (the density function of the test statistic under the alternative hypothesis) is heterogeneous among multiple hypotheses. This semiparametric approach exactly generalizes the local FDR procedure [J. Amer. Statist. Assoc. 96 (2001) 1151–1160] and connects with the BH procedure [J. Roy. Statist. Soc. Ser. B 57 (1995) 289–300]. Simulations show that the numerical performance of multiple testing can be improved substantially by using our procedure. We apply the procedure to a real-world genome-wide association study on breast cancer, and we identify several SNPs with strong association evidence.
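
The abstract does not spell out how the false discovery rate is controlled once the local indices of significance are available. As a minimal sketch, assuming the step-up rule commonly used with posterior null probabilities (sort them in ascending order and reject the largest set whose running average stays at or below the target level), the final step could look like the following. The function name `lis_step_up` and the input values are hypothetical, and the sketch takes the posterior probabilities as given rather than inferring them from a Markov random field.

```python
def lis_step_up(lis, alpha=0.05):
    """Return the sorted indices of rejected hypotheses.

    Assumption: lis[i] is the posterior probability that hypothesis i
    is null (the "local index of significance" in the abstract).
    The thresholding rule below is an assumed step-up rule, not a
    procedure quoted from the paper.
    """
    # Rank hypotheses from most to least likely to be non-null.
    order = sorted(range(len(lis)), key=lambda i: lis[i])

    running_sum = 0.0
    k = 0  # number of hypotheses to reject
    for rank, idx in enumerate(order, start=1):
        running_sum += lis[idx]
        # The running mean estimates the FDR among the first `rank` rejections.
        if running_sum / rank <= alpha:
            k = rank

    return sorted(order[:k])


# Hypothetical posterior null probabilities for five hypotheses.
rejected = lis_step_up([0.01, 0.9, 0.03, 0.5, 0.02], alpha=0.1)
print(rejected)
```

Because the values are sorted in ascending order, the running mean never decreases, so the rule rejects a contiguous block of the most significant hypotheses.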

Citation

Jie Liu, Chunming Zhang, David Page. "Multiple testing under dependence via graphical models." Ann. Appl. Stat. 10 (3) 1699-1724, September 2016. https://doi.org/10.1214/16-AOAS956

Information

Received: 1 September 2014; Revised: 1 May 2016; Published: September 2016
First available in Project Euclid: 28 September 2016

zbMATH: 06775283
MathSciNet: MR3553241
Digital Object Identifier: 10.1214/16-AOAS956

Keywords: genome-wide association study, graphical models, local index of significance, Markov random field, multiple testing under dependence

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol. 10 • No. 3 • September 2016