The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 8, Number 1 (2014), 481-498.
Replicability analysis for genome-wide association studies
The paramount importance of replicating associations is well recognized in the genome-wide association (GWA) research community, yet methods for assessing replicability of associations are scarce. Published GWA studies often combine the results of the primary studies and of the follow-up studies separately. Informally, reporting the two separate meta-analyses, one of the primary studies and one of the follow-up studies, gives a sense of the replicability of the results. We suggest a formal empirical Bayes approach for discovering whether results have been replicated across studies, in which we estimate the optimal rejection region for discovering replicated results. We demonstrate, using realistic simulations, that the average false discovery proportion of our method remains small. We apply our method to six type 2 diabetes (T2D) GWA studies. Out of 803 SNPs discovered to be associated with T2D using a typical meta-analysis, we discovered 219 SNPs with replicated associations with T2D. We recommend complementing a meta-analysis with a replicability analysis for GWA studies.
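As a rough illustration of how a rejection region can be chosen once local Bayes FDRs are available, the sketch below applies a common empirical Bayes rule: sort the hypotheses by estimated local FDR and reject the largest set whose average local FDR stays at or below a target level q. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `reject_by_local_fdr`, the level `q`, and the input values are hypothetical, and the estimation of the local FDRs themselves (done in the paper with an EM algorithm) is not shown.

```python
def reject_by_local_fdr(local_fdr, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Assumption: local_fdr[i] is an already-estimated local Bayes FDR for
    hypothesis i, e.g. the posterior probability that the association of
    SNP i is NOT replicated. Estimating these values is out of scope here.
    """
    # Indices ordered from the smallest local FDR to the largest.
    order = sorted(range(len(local_fdr)), key=lambda i: local_fdr[i])

    running_sum = 0.0
    n_reject = 0
    for rank, i in enumerate(order, start=1):
        running_sum += local_fdr[i]
        # The running mean estimates the Bayes FDR of rejecting the
        # `rank` hypotheses with the smallest local FDRs.
        if running_sum / rank <= q:
            n_reject = rank

    rejected = [False] * len(local_fdr)
    for i in order[:n_reject]:
        rejected[i] = True
    return rejected


# Made-up local FDR values for five hypothetical SNPs.
example = [0.01, 0.9, 0.02, 0.2, 0.03]
mask = reject_by_local_fdr(example, q=0.05)
```

Because the values are sorted in increasing order, the running mean never decreases, so the last rank that passes the check identifies the largest admissible rejection set.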
First available in Project Euclid: 8 April 2014
Heller, Ruth; Yekutieli, Daniel. Replicability analysis for genome-wide association studies. Ann. Appl. Stat. 8 (2014), no. 1, 481--498. doi:10.1214/13-AOAS697. https://projecteuclid.org/euclid.aoas/1396966295
- Supplementary material: Supplementary material for replicability analysis for genome-wide association studies. Supplementary material includes the proof of Proposition 3.1, additional numerical examples that demonstrate the difference between optimal rejection regions and the loss in power that occurs when the rejection region is chosen suboptimally based on $p$-values, discussion of the necessity to specify the direction of the alternative for estimation of the local Bayes FDRs, technical details of the EM algorithm, the full table of results for the T2D example, the figure of empirical $z$-scores for the T2D studies example, and an additional figure of simulation results.