Open Access
December 2017
A unified framework for variance component estimation with summary statistics in genome-wide association studies
Xiang Zhou
Ann. Appl. Stat. 11(4): 2027-2051 (December 2017). DOI: 10.1214/17-AOAS1052

Abstract

Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, restricted maximum likelihood (REML) estimation, suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods, the renowned Haseman–Elston (HE) regression and the recent LD score regression (LDSC), into a single unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We also provide an exact estimation form of LDSC that yields unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal $z$-scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while still producing estimates that can be almost as accurate as if both quantities were computed using the full data. As a result, our method produces unbiased and statistically efficient estimates, makes use of summary statistics, and is computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.
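
For orientation, the classic individual-level Haseman–Elston regression named in the abstract can be sketched as a method-of-moments estimator: with a standardized phenotype, the products of phenotypes for distinct pairs of individuals are regressed on the corresponding off-diagonal entries of a genetic relatedness matrix, and the slope estimates SNP heritability. The sketch below is only an illustration under stated assumptions. It is not the MQS estimator, not the summary-statistic form derived in the paper, and not the GEMMA implementation. The function names `he_regression` and `simulate`, the no-intercept regression, and all simulation settings are hypothetical choices made for this example.

```python
import numpy as np


def he_regression(genotypes, phenotype):
    """Illustrative Haseman-Elston (method-of-moments) heritability estimate.

    Assumes individual-level data: an (n, p) genotype matrix with no
    monomorphic SNPs and a length-n phenotype vector.
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)

    # Standardize each SNP column and the phenotype to mean 0, variance 1.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = (y - y.mean()) / y.std()

    n, p = X.shape
    K = X @ X.T / p  # genetic relatedness matrix

    # Use only distinct pairs i < j (strict upper triangle).
    iu = np.triu_indices(n, k=1)
    k_off = K[iu]
    yy_off = np.outer(y, y)[iu]

    # Least-squares slope (no intercept) of y_i * y_j on K_ij.
    return float(k_off @ yy_off / (k_off @ k_off))


def simulate(n=1000, p=2000, h2=0.5, seed=0):
    """Simulate independent SNPs and a phenotype with heritability h2 (toy setup)."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.05, 0.5, size=p)
    X = rng.binomial(2, maf, size=(n, p)).astype(float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    beta = rng.normal(0.0, np.sqrt(h2 / p), size=p)
    y = Z @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    print("HE estimate of SNP heritability:", he_regression(X, y))
```

In this toy setting the estimate should fall near the simulated value of 0.5, up to sampling noise. Because the estimator uses only second moments of the data, it avoids the iterative likelihood optimization that REML requires, which is the kind of computational advantage the abstract attributes to moment-based approaches.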

Citation

Xiang Zhou. "A unified framework for variance component estimation with summary statistics in genome-wide association studies." Ann. Appl. Stat. 11 (4): 2027-2051, December 2017. https://doi.org/10.1214/17-AOAS1052

Information

Received: 1 November 2016; Revised: 1 March 2017; Published: December 2017
First available in Project Euclid: 28 December 2017

zbMATH: 1383.62305
MathSciNet: MR3743287
Digital Object Identifier: 10.1214/17-AOAS1052

Keywords: genome-wide association studies, linear mixed model, method of moments, MINQUE, summary statistics, variance component

Rights: Copyright © 2017 Institute of Mathematical Statistics

Vol. 11 • No. 4 • December 2017