Open Access
December 2019
A simple, consistent estimator of SNP heritability from genome-wide association studies
Armin Schwartzman, Andrew J. Schork, Rong Zablocki, Wesley K. Thompson
Ann. Appl. Stat. 13(4): 2509-2538 (December 2019). DOI: 10.1214/19-AOAS1291

Abstract

Analysis of genome-wide association studies (GWAS) is characterized by a large number of univariate regressions where a quantitative trait is regressed on hundreds of thousands to millions of single-nucleotide polymorphism (SNP) allele counts, one at a time. This article proposes an estimator of the SNP heritability of the trait, defined here as the fraction of the variance of the trait explained by the SNPs in the study. The proposed GWAS heritability (GWASH) estimator is easy to compute, highly interpretable, and consistent as the number of SNPs and the sample size increase. More importantly, it can be computed from summary statistics typically reported in GWAS, not requiring access to the original data. The estimator takes full account of the linkage disequilibrium (LD) or correlation between the SNPs in the study through moments of the LD matrix, estimable from auxiliary datasets. Unlike for other estimators proposed in the literature, we establish the theoretical properties of the GWASH estimator and obtain analytical estimates of its precision, allowing for power and sample size calculations for SNP heritability estimates and forming a firm foundation for future methodological development.
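The abstract does not state the estimator's formula, so the following Python sketch is only an illustration of the general idea: run one univariate regression per SNP, then turn the resulting summary statistics into a variance-explained estimate using a moment of the LD matrix. The moment form used here, (m / (n · mu2)) · (mean of squared z-scores − 1), is an assumption made for this example and should be checked against the paper itself. The function names, the simulated data, and the choice mu2 = 1 (independent SNPs) are likewise hypothetical.

```python
import math
import random


def standardize(values):
    """Center and scale a list to mean 0 and variance 1 (assumes it is not constant)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def marginal_z_scores(genotypes, trait):
    """Regress the trait on each SNP separately; return z_j = sqrt(n) * corr(x_j, y)."""
    n = len(trait)
    y = standardize(trait)
    scores = []
    for snp in genotypes:  # genotypes[j] holds the allele counts of SNP j
        x = standardize(snp)
        corr = sum(a * b for a, b in zip(x, y)) / n
        scores.append(math.sqrt(n) * corr)
    return scores


def moment_heritability(z_scores, n, mu2):
    """ASSUMED moment form, not taken from the abstract:
    (m / (n * mu2)) * (mean(z^2) - 1), with mu2 the second moment of the LD matrix."""
    m = len(z_scores)
    mean_sq = sum(z * z for z in z_scores) / m
    return m / (n * mu2) * (mean_sq - 1.0)


def simulate(n=500, m=200, h2=0.5, seed=0):
    """Simulate independent SNPs and a trait with true heritability h2, then estimate it."""
    rng = random.Random(seed)
    # Allele counts in {0, 1, 2}, drawn as two fair coin flips per subject.
    genotypes = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n)]
                 for _ in range(m)]
    standardized = [standardize(snp) for snp in genotypes]
    effects = [rng.gauss(0.0, math.sqrt(h2 / m)) for _ in range(m)]
    trait = []
    for i in range(n):
        signal = sum(effects[j] * standardized[j][i] for j in range(m))
        trait.append(signal + rng.gauss(0.0, math.sqrt(1.0 - h2)))
    z = marginal_z_scores(genotypes, trait)
    # Independent SNPs: the LD matrix is the identity, so mu2 is taken as 1.
    return moment_heritability(z, n, mu2=1.0)


if __name__ == "__main__":
    print(f"estimated SNP heritability: {simulate():.3f}")
```

Only the z-scores, the sample size, and one LD moment enter `moment_heritability`, which mirrors the abstract's point that the original genotype data are not required. With correlated SNPs, `mu2` would have to be estimated from an auxiliary dataset, a step this sketch leaves out.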

Citation

Armin Schwartzman, Andrew J. Schork, Rong Zablocki, Wesley K. Thompson. "A simple, consistent estimator of SNP heritability from genome-wide association studies." Ann. Appl. Stat. 13 (4) 2509-2538, December 2019. https://doi.org/10.1214/19-AOAS1291

Information

Received: 1 December 2017; Revised: 1 April 2019; Published: December 2019
First available in Project Euclid: 28 November 2019

zbMATH: 07160948
MathSciNet: MR4037439
Digital Object Identifier: 10.1214/19-AOAS1291

Keywords: high dimensional data, massively univariate regression, single nucleotide polymorphism, summary statistics

Rights: Copyright © 2019 Institute of Mathematical Statistics
