The Annals of Applied Statistics

Leveraging local identity-by-descent increases the power of case/control GWAS with related individuals

Joshua N. Sampson, Bill Wheeler, Peng Li, and Jianxin Shi


Abstract

Large case/control Genome-Wide Association Studies (GWAS) often include groups of related individuals with known relationships. When testing for associations at a given locus, current methods incorporate only the familial relationships between individuals. Here, we introduce the chromosome-based Quasi Likelihood Score (cQLS) statistic that incorporates local Identity-By-Descent (IBD) to increase the power to detect associations. In studies robust to population stratification, such as those with case/control sibling pairs, simulations show that the study power can be increased by over 50%. In our example, a GWAS examining late-onset Alzheimer’s disease, the $p$-values among the most strongly associated SNPs in the APOE gene tend to decrease, with the smallest $p$-value decreasing from $1.23\times10^{-8}$ to $7.70\times10^{-9}$. Furthermore, as a part of our simulations, we reevaluate our expectations about the use of families in GWAS. We show that, although adding only half as many unique chromosomes, genotyping affected siblings is more efficient than genotyping randomly ascertained cases. We also show that genotyping cases with a family history of disease will be less beneficial when searching for SNPs with smaller effect sizes.
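To put the reported change in the smallest $p$-value into perspective, the short sketch below computes the fold decrease and the gain on the $-\log_{10}$ scale. The two $p$-values come from the abstract. The variable names are illustrative, and reading the first value as the result without local IBD and the second as the cQLS result is an assumption.

```python
import math

# Smallest p-values quoted in the abstract for the late-onset Alzheimer's example.
# Assumption: the first is from the analysis without local IBD, the second from cQLS.
p_before = 1.23e-8
p_after = 7.70e-9

# How many times smaller the p-value became.
fold_decrease = p_before / p_after

# Fraction by which the p-value shrank.
relative_reduction = 1 - p_after / p_before

# Gain on the -log10 scale commonly used in GWAS plots.
gain_log10 = math.log10(p_before) - math.log10(p_after)

print(f"fold decrease:      {fold_decrease:.3f}")
print(f"relative reduction: {relative_reduction:.1%}")
print(f"-log10(p) gain:     {gain_log10:.3f}")
```

Under these assumptions the $p$-value shrinks by a factor of roughly 1.6, which is a gain of about 0.2 on the $-\log_{10}$ scale.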

Article information

Source
Ann. Appl. Stat., Volume 8, Number 2 (2014), 974-998.

Dates
First available in Project Euclid: 1 July 2014

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1404229522

Digital Object Identifier
doi:10.1214/14-AOAS715

Mathematical Reviews number (MathSciNet)
MR3262542

Zentralblatt MATH identifier
06333784

Keywords
cQLS, GWAS, related individuals, case–control

Citation

Sampson, Joshua N.; Wheeler, Bill; Li, Peng; Shi, Jianxin. Leveraging local identity-by-descent increases the power of case/control GWAS with related individuals. Ann. Appl. Stat. 8 (2014), no. 2, 974–998. doi:10.1214/14-AOAS715. https://projecteuclid.org/euclid.aoas/1404229522

