The Annals of Applied Statistics

Bayesian methods for genetic association analysis with heterogeneous subgroups: From meta-analyses to gene–environment interactions

Xiaoquan Wen and Matthew Stephens


Genetic association analyses often involve data from multiple potentially heterogeneous subgroups. The expected amount of heterogeneity can vary from modest (e.g., a typical meta-analysis) to large (e.g., a strong gene–environment interaction). However, existing statistical tools are limited in their ability to address such heterogeneity. Indeed, most genetic association meta-analyses use a “fixed effects” analysis, which assumes no heterogeneity. Here we develop and apply Bayesian association methods to address this problem. These methods are easy to apply (in the simplest case, requiring only a point estimate for the genetic effect and its standard error, from each subgroup) and effectively include standard frequentist meta-analysis methods, including the usual “fixed effects” analysis, as special cases. We apply these tools to two large genetic association studies: one a meta-analysis of genome-wide association studies from the Global Lipids consortium, and the second a cross-population analysis for expression quantitative trait loci (eQTLs). In the Global Lipids data we find, perhaps surprisingly, that effects are generally quite homogeneous across studies. In the eQTL study we find that eQTLs are generally shared among different continental groups, and discuss consequences of this for study design.
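To make the inputs mentioned in the abstract concrete, the following is a minimal sketch, not the authors' method, of the usual "fixed effects" (inverse-variance weighted) combination of per-subgroup point estimates and standard errors. It also shows a simple normal-approximation Bayes factor for the pooled effect, in the style of Wakefield (2009). The function names, the prior standard deviation `prior_sd`, and the numbers in the usage lines are all illustrative assumptions, and the sketch does not model heterogeneity between subgroups.

```python
import math


def fixed_effects_meta(estimates, std_errors):
    """Combine per-subgroup effect estimates by inverse-variance weighting.

    Assumes one common true effect across subgroups (no heterogeneity).
    Returns the pooled estimate and its standard error.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def approx_bayes_factor(beta_hat, se, prior_sd):
    """Normal-approximation Bayes factor for H1 (effect != 0) versus H0.

    Assumes beta_hat ~ N(beta, se^2) and, under H1, beta ~ N(0, prior_sd^2).
    prior_sd is an illustrative choice, not a value taken from the paper.
    """
    v = se ** 2
    w = prior_sd ** 2
    shrink = w / (v + w)
    z_squared = (beta_hat / se) ** 2
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z_squared * shrink)


# Hypothetical summary statistics from three subgroups (made-up numbers).
betas = [0.12, 0.08, 0.15]
ses = [0.04, 0.05, 0.06]

pooled_beta, pooled_se = fixed_effects_meta(betas, ses)
bf = approx_bayes_factor(pooled_beta, pooled_se, prior_sd=0.2)
print(f"pooled effect = {pooled_beta:.4f}, se = {pooled_se:.4f}, BF = {bf:.1f}")
```

A larger Bayes factor indicates stronger support for a nonzero pooled effect under these assumptions. Allowing subgroup effects to differ would require an additional prior on the between-subgroup variation, which this sketch deliberately omits.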

Article information

Ann. Appl. Stat. Volume 8, Number 1 (2014), 176-203.

First available in Project Euclid: 8 April 2014


Keywords: Meta-analysis; gene–environment interaction; Bayes factor; Bayesian hypothesis testing; heterogeneity


Wen, Xiaoquan; Stephens, Matthew. Bayesian methods for genetic association analysis with heterogeneous subgroups: From meta-analyses to gene–environment interactions. Ann. Appl. Stat. 8 (2014), no. 1, 176–203. doi:10.1214/13-AOAS695.


