The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 7, Number 1 (2013), 369-390.
Efficient computation with a linear mixed model on large-scale data sets with applications to genetic studies
Motivated by genome-wide association studies, we consider a standard linear model with one additional random effect in situations where many predictors have been collected on the same subjects and each predictor is analyzed separately. Three novel contributions are (1) a transformation between the linear and log-odds scales which is accurate for the important genetic case of small effect sizes; (2) a likelihood-maximization algorithm that is an order of magnitude faster than the previously published approaches; and (3) efficient methods for computing marginal likelihoods which allow Bayesian model comparison. The methodology has been successfully applied to a large-scale association study of multiple sclerosis including over 20,000 individuals and 500,000 genetic variants.
First available in Project Euclid: 9 April 2013
Pirinen, Matti; Donnelly, Peter; Spencer, Chris C. A. Efficient computation with a linear mixed model on large-scale data sets with applications to genetic studies. Ann. Appl. Stat. 7 (2013), no. 1, 369--390. doi:10.1214/12-AOAS586. https://projecteuclid.org/euclid.aoas/1365527203
- Supplementary material: Supplementary text. In this supplement we give the details of the application of the mixed model to binary data, of the conditional maximization of the likelihood function and of the Bayesian computations.