Statistical Science

Structures and Assumptions: Strategies to Harness Gene × Gene and Gene × Environment Interactions in GWAS

Charles Kooperberg, Michael LeBlanc, James Y. Dai, and Indika Rajapakse

Genome-wide association studies, in which as many as a million single nucleotide polymorphisms (SNPs) are measured on several thousand samples, are quickly becoming a common type of study for identifying genetic factors associated with many phenotypes. It is widely assumed that interactions between SNPs or genes, and interactions between genes and environmental factors, contribute substantially to the genetic risk of a disease. Identifying such interactions could improve our understanding of disease mechanisms; drug × gene interactions could have profound applications for personalized medicine; and strong interaction effects could benefit risk prediction models. In this paper we provide an overview of different approaches to modeling interactions, emphasizing approaches that make specific use of the structure of genetic data and those that make specific modeling assumptions that may (or may not) be reasonable. We conclude that to identify interactions it is often necessary to select a subset of SNPs, for example, based on prior hypotheses or marginal significance, but that to identify SNPs that are marginally associated with a disease it may also be useful to consider larger numbers of interactions.

Article information

Statist. Sci. Volume 24, Number 4 (2009), 472-488.

First available: 20 April 2010

Kooperberg, Charles; LeBlanc, Michael; Dai, James Y.; Rajapakse, Indika. Structures and Assumptions: Strategies to Harness Gene × Gene and Gene × Environment Interactions in GWAS. Statistical Science 24 (2009), no. 4, 472-488. doi:10.1214/09-STS287.

