Open Access
December 2017
Semiparametric covariate-modulated local false discovery rate for genome-wide association studies
Rong W. Zablocki, Richard A. Levine, Andrew J. Schork, Shujing Xu, Yunpeng Wang, Chun C. Fan, Wesley K. Thompson
Ann. Appl. Stat. 11(4): 2252-2269 (December 2017). DOI: 10.1214/17-AOAS1077


While genome-wide association studies (GWAS) have discovered thousands of risk loci for heritable disorders, so far even very large meta-analyses have recovered only a fraction of the heritability of most complex traits. Recent work utilizing variance components models has demonstrated that a larger fraction of the heritability of complex phenotypes is captured by the additive effects of SNPs than is evident from only those loci surpassing genome-wide significance thresholds, typically set at a Bonferroni-inspired $p \le 5\times 10^{-8}$. Procedures that control the false discovery rate can be more powerful, yet these are still underpowered to detect the majority of nonnull effects from GWAS. The current work proposes a novel Bayesian semiparametric two-group mixture model and develops a Markov chain Monte Carlo (MCMC) algorithm for a covariate-modulated local false discovery rate (cmfdr). The probability of being nonnull depends on a set of covariates via a logistic function, and the nonnull distribution is approximated as a linear combination of B-spline densities, where the weight of each B-spline density depends on a multinomial function of the covariates. The proposed methods were motivated by work on a large meta-analysis of schizophrenia GWAS performed by the Psychiatric Genetics Consortium (PGC). We show that the new cmfdr model fits the PGC schizophrenia GWAS test statistics well, performing better than our previously proposed parametric gamma model for estimating the nonnull density and substantially improving power over the usual fdr. Using loci declared significant at cmfdr $\le 0.20$, we perform follow-up pathway analyses using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Homo sapiens pathways database. We demonstrate that the increased yield from the cmfdr model results in an improved ability to test for pathways associated with schizophrenia compared to using those SNPs selected according to the usual fdr.
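
For orientation, the two-group structure described in the abstract can be written out as a rough sketch. The notation is assumed here for illustration and is not taken from the paper: $z$ is a test statistic, $x$ its covariate vector, $f_0$ the null density, $b_k$ the $k$-th B-spline density, and $\gamma$, $\beta_k$ hypothetical coefficient vectors. The "multinomial function of the covariates" is read here as a multinomial-logit (softmax) form, which is an assumption.

$$
\begin{aligned}
\pi_1(x) &= \frac{1}{1+\exp(-x^{\top}\gamma)}, \\
f_1(z \mid x) &= \sum_{k=1}^{K} w_k(x)\, b_k(z), \qquad w_k(x) = \frac{\exp(x^{\top}\beta_k)}{\sum_{j=1}^{K}\exp(x^{\top}\beta_j)}, \\
f(z \mid x) &= \{1-\pi_1(x)\}\, f_0(z) + \pi_1(x)\, f_1(z \mid x), \\
\mathrm{cmfdr}(z \mid x) &= \frac{\{1-\pi_1(x)\}\, f_0(z)}{f(z \mid x)}.
\end{aligned}
$$

Under this reading, cmfdr is the posterior probability that a statistic is null given its value and covariates, so a threshold such as cmfdr $\le 0.20$ selects loci whose estimated posterior null probability is at most 20%.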



Rong W. Zablocki, Richard A. Levine, Andrew J. Schork, Shujing Xu, Yunpeng Wang, Chun C. Fan, Wesley K. Thompson. "Semiparametric covariate-modulated local false discovery rate for genome-wide association studies." Ann. Appl. Stat. 11(4): 2252-2269, December 2017.


Received: 1 September 2016; Revised: 1 June 2017; Published: December 2017
First available in Project Euclid: 28 December 2017

zbMATH: 1383.62301
MathSciNet: MR3743296
Digital Object Identifier: 10.1214/17-AOAS1077

Keywords: Bayesian mixture model, B-spline densities, genome-wide association study, mixture of experts, multiple-comparison procedures

Rights: Copyright © 2017 Institute of Mathematical Statistics
