The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 4, Number 3 (2010), 1342-1364.
Bayesian model search and multilevel inference for SNP association studies
Technological advances in genotyping have given rise to hypothesis-based association studies of increasing scope. As a result, the scientific hypotheses addressed by these studies have become more complex and more difficult to address using existing analytic methodologies. Obstacles to analysis include inference in the face of multiple comparisons, complications arising from correlations among the SNPs (single nucleotide polymorphisms), choice of their genetic parametrization and missing data. In this paper we present an efficient Bayesian model search strategy that searches over the space of genetic markers and their genetic parametrization. The resulting method for Multilevel Inference of SNP Associations, MISA, allows computation of multilevel posterior probabilities and Bayes factors at the global, gene and SNP level, with the prior distribution on SNP inclusion in the model providing an intrinsic multiplicity correction. We use simulated data sets to characterize MISA’s statistical power, and show that MISA has higher power to detect association than standard procedures. Using data from the North Carolina Ovarian Cancer Study (NCOCS), MISA identifies variants that were not identified by standard methods and have been externally “validated” in independent studies. We examine sensitivity of the NCOCS results to prior choice and method for imputing missing data. MISA is available in an R package on CRAN.
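The multilevel summaries described in the abstract can be illustrated with a small sketch. It is not the MISA implementation: it enumerates every model for three hypothetical SNPs instead of running a model search, takes the log marginal likelihoods as given inputs (they could come, for instance, from an AIC-based approximation of the kind the supplement mentions), and assumes a Beta-Binomial(1, 1) prior on model size as one example of an inclusion prior that penalizes multiplicity. The SNP names, gene names, and function names are all made up for illustration.

```python
from itertools import combinations
from math import comb, exp, log


def size_prior(k, n_snps):
    """Assumed Beta-Binomial(1, 1) prior for a model with k of n_snps SNPs."""
    return 1.0 / ((n_snps + 1) * comb(n_snps, k))


def normalize(log_weights):
    """Turn log weights into probabilities, subtracting the max for stability."""
    top = max(log_weights.values())
    raw = {m: exp(lw - top) for m, lw in log_weights.items()}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}


def event_summary(event, prior, post):
    """Posterior probability and Bayes factor for the models where event(m) holds.

    The Bayes factor is the posterior odds divided by the prior odds, so it is
    undefined when either probability is exactly 0 or 1.
    """
    p_prior = sum(p for m, p in prior.items() if event(m))
    p_post = sum(p for m, p in post.items() if event(m))
    bayes_factor = (p_post / (1 - p_post)) / (p_prior / (1 - p_prior))
    return p_post, bayes_factor


def multilevel_summary(log_marglik, snp_to_gene):
    """Global, gene-level and SNP-level summaries from per-model log marginal likelihoods."""
    n = len(snp_to_gene)
    log_prior = {m: log(size_prior(len(m), n)) for m in log_marglik}
    prior = normalize(log_prior)
    post = normalize({m: log_marglik[m] + log_prior[m] for m in log_marglik})

    # Global level: at least one SNP is in the model.
    out = {("global", "any"): event_summary(lambda m: len(m) > 0, prior, post)}
    # SNP level: this particular SNP is in the model.
    for snp in snp_to_gene:
        out[("snp", snp)] = event_summary(lambda m, s=snp: s in m, prior, post)
    # Gene level: at least one SNP from this gene is in the model.
    for gene in sorted(set(snp_to_gene.values())):
        out[("gene", gene)] = event_summary(
            lambda m, g=gene: any(snp_to_gene[s] == g for s in m), prior, post
        )
    return out


# Toy input: three hypothetical SNPs in two hypothetical genes.
snp_to_gene = {"rs1": "geneA", "rs2": "geneA", "rs3": "geneB"}
all_models = [frozenset(c) for k in range(4) for c in combinations(snp_to_gene, k)]
# Made-up log marginal likelihoods that favor any model containing rs1.
log_marglik = {m: (3.0 if "rs1" in m else 0.0) for m in all_models}

summary = multilevel_summary(log_marglik, snp_to_gene)
for level, (prob, bf) in summary.items():
    print(level, round(prob, 3), round(bf, 2))
```

Because each level is defined as an event over the same set of models, the gene-level probability can never fall below the probability of any SNP in that gene, and the global probability can never fall below any gene-level probability.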
First available in Project Euclid: 18 October 2010
Wilson, Melanie A.; Iversen, Edwin S.; Clyde, Merlise A.; Schmidler, Scott C.; Schildkraut, Joellen M. Bayesian model search and multilevel inference for SNP association studies. Ann. Appl. Stat. 4 (2010), no. 3, 1342-1364. doi:10.1214/09-AOAS322. https://projecteuclid.org/euclid.aoas/1287409376.
- Supplementary material: Bayesian model search and multilevel inference for SNP association studies: Supplementary materials. In this supplement we provide details for: (1) Derivation of the implied prior distribution on the regression coefficients when AIC is used to approximate the marginal likelihood in logistic regression, (2) Description of the marginal Bayes factor screen used to reduce the number of SNPs in the MISA analysis, (3) Details of how the simulated genetic data sets used in the power analysis of MISA were created and information on the statistical software we developed for this purpose, and (4) Location of the freely available software resources referred to in this and the parent document.