The Annals of Applied Statistics

Adaptive-weight burden test for associations between quantitative traits and genotype data with complex correlations

Xiaowei Wu, Ting Guan, Dajiang J. Liu, Luis G. León Novelo, and Dipankar Bandyopadhyay



High-throughput sequencing is often used to screen samples from pedigrees or from populations with structure, producing genotype data with complex correlations caused by both familial relatedness and linkage disequilibrium. With such data it is critical to account for these genotypic correlations when assessing the contribution of multiple variants by gene or pathway. Recognizing the limitations of existing association testing methods, we propose the Adaptive-weight Burden Test (ABT), a retrospective, mixed-model test for genetic association of quantitative traits on genotype data with complex correlations. This method makes full use of genotypic correlations across both samples and variants and adopts “data-driven” weights to improve power. We derive the ABT statistic and its explicit distribution under the null hypothesis and demonstrate through simulation studies that it is generally more powerful than the fixed-weight burden test and family-based SKAT in various scenarios, while controlling the type I error rate. Further investigation reveals the connection of ABT with kernel tests, as well as the adaptability of its weights to the direction of genetic effects. The application of ABT is illustrated by a gene-based association analysis of fasting glucose using data from the NHLBI “Grand Opportunity” Exome Sequencing Project.
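For orientation, the fixed-weight burden test that the abstract uses as a baseline can be sketched in a few lines. This is not the ABT statistic and is not taken from the paper: it is a minimal illustration that assumes unrelated samples, no covariates, and weights chosen in advance. The function names, the toy data, and the 1-degree-of-freedom chi-square approximation are all assumptions made for this sketch.

```python
import math


def burden_scores(genotypes, weights):
    """Collapse each sample's minor-allele counts into one weighted score."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def fixed_weight_burden_test(trait, genotypes, weights):
    """Score-type burden test for a quantitative trait.

    Assumes unrelated samples and no covariates (illustration only).
    Returns (statistic, p_value); the statistic is n times the squared
    sample correlation between trait and burden score, compared with a
    chi-square distribution on 1 degree of freedom.
    """
    n = len(trait)
    x = burden_scores(genotypes, weights)
    x_mean = sum(x) / n
    y_mean = sum(trait) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in trait)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, trait))
    if sxx == 0.0 or syy == 0.0:
        # No variation in the burden score or the trait: nothing to test.
        return 0.0, 1.0
    stat = n * sxy ** 2 / (sxx * syy)
    # Upper tail of chi-square(1): P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical toy data: 6 samples, 3 variants, equal fixed weights.
toy_genotypes = [[0, 0, 1], [1, 0, 0], [0, 0, 0],
                 [2, 1, 0], [0, 1, 1], [0, 0, 0]]
toy_trait = [4.8, 5.6, 5.0, 6.3, 5.9, 4.9]
toy_stat, toy_p = fixed_weight_burden_test(toy_trait, toy_genotypes,
                                           [1.0, 1.0, 1.0])
print(f"statistic = {toy_stat:.3f}, p-value = {toy_p:.3f}")
```

Because the weights are fixed before the trait is examined, variants with effects in opposite directions can cancel in the burden score, and the calculation ignores correlation among samples. Those are the limitations that, according to the abstract, ABT addresses with data-driven weights and a mixed-model treatment of genotypic correlations.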

Article information

Ann. Appl. Stat., Volume 12, Number 3 (2018), 1558-1582.

Received: May 2016
Revised: August 2017
First available in Project Euclid: 11 September 2018

Digital Object Identifier
doi:10.1214/17-AOAS1121

Keywords

Genetic association test; burden test; kernel test; adaptive weight; complex genotypic correlation


Wu, Xiaowei; Guan, Ting; Liu, Dajiang J.; Novelo, Luis G. León; Bandyopadhyay, Dipankar. Adaptive-weight burden test for associations between quantitative traits and genotype data with complex correlations. Ann. Appl. Stat. 12 (2018), no. 3, 1558–1582. doi:10.1214/17-AOAS1121.




Supplemental materials

  • Mathematical justifications and additional results. The supplementary materials are organized as follows. Supplement S.1 provides the theoretical justification for the covariance matrix of the genetic burden score $\boldsymbol{X}$. Supplement S.2 derives the LD covariance of the simulated genotype data for founders. Supplement S.3 summarizes, in Table S1, additional results from Section 3.3 on the empirical type I error of ABT based on the $\chi_{m}^{2}$ null distribution in simulation studies. Supplement S.4 gives additional power comparison results at $\alpha=0.01$ for Scenarios II, IV, V, and IX.