The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 12, Number 1 (2018), 567-585.
Powerful test based on conditional effects for genome-wide screening
This paper considers testing procedures for screening large genome-wide data, in which hundreds of thousands of genetic variants, for example single nucleotide polymorphisms (SNPs), are examined for association with a quantitative phenotype. We screen the whole genome by SNP sets and propose a new test that is based on the conditional effects of multiple SNPs. The test statistic is developed for weak genetic effects and incorporates correlations among genetic variables, which may be very high due to linkage disequilibrium. The limiting null distribution of the test statistic and the power of the test are derived. Under appropriate conditions, the test is shown to be more powerful than the minimum $p$-value method, which is based on marginal SNP effects and is the most commonly used method in genome-wide screening. The proposed test is also compared with other existing methods, including the Higher Criticism (HC) test and the sequence kernel association test (SKAT), through simulations and analysis of a real genome data set. For typical genome-wide data, where effects of individual SNPs are weak and correlations among SNPs are high, the proposed test is advantageous and clearly outperforms the other methods in the literature.
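For orientation, the minimum $p$-value baseline mentioned in the abstract can be sketched as follows. This is only an illustrative sketch of that baseline, not the proposed conditional-effect test, whose statistic is not given in the abstract. The function names (`marginal_z`, `two_sided_p`, `min_p_test`), the normal approximation to the marginal test statistic, and the Bonferroni adjustment are all assumptions made for this example rather than details taken from the paper.

```python
import math


def marginal_z(snp, phenotype):
    """Test statistic for the marginal association of one SNP with the phenotype.

    Computed from the sample correlation r as r * sqrt((n - 2) / (1 - r^2)),
    which is the usual t-statistic for a simple linear regression slope.
    """
    n = len(phenotype)
    mean_x = sum(snp) / n
    mean_y = sum(phenotype) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(snp, phenotype))
    sxx = sum((x - mean_x) ** 2 for x in snp)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant SNP or phenotype carries no association signal
    r = sxy / math.sqrt(sxx * syy)
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)  # guard against division by zero
    return r * math.sqrt((n - 2) / (1 - r * r))


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation (assumes large n)."""
    return math.erfc(abs(z) / math.sqrt(2))


def min_p_test(genotypes, phenotype):
    """Minimum p-value over a SNP set, with a Bonferroni-adjusted version.

    genotypes: list of SNP columns, each a list of allele counts (0, 1, 2).
    Returns (smallest marginal p-value, Bonferroni-adjusted p-value).
    """
    pvals = [two_sided_p(marginal_z(snp, phenotype)) for snp in genotypes]
    p_min = min(pvals)
    return p_min, min(1.0, len(pvals) * p_min)
```

Calling `min_p_test(genotypes, phenotype)` on a SNP set returns the smallest marginal $p$-value and its Bonferroni-adjusted counterpart. Because each SNP is tested separately, this baseline ignores the correlations among SNPs that the abstract says the proposed test incorporates.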
Received: May 2016
Revised: March 2017
First available in Project Euclid: 9 March 2018
Liu, Yaowu; Xie, Jun. Powerful test based on conditional effects for genome-wide screening. Ann. Appl. Stat. 12 (2018), no. 1, 567--585. doi:10.1214/17-AOAS1103. https://projecteuclid.org/euclid.aoas/1520564484
- Supplement to “Powerful test based on conditional effects for genome-wide screening”. The supplementary material contains (1) technical lemmas and their proofs; (2) the proofs of all theorems; (3) additional table and figures regarding simulation results under constant effect magnitude and sparsity parameter $\gamma=1/4$, simulations using real genotype data, the stability of the real data analysis result and the conservativeness of $p$-value calculation based on asymptotic null distribution.