Statistical Science

Statistical Analysis in Genetic Studies of Mental Illnesses

Heping Zhang



Identifying the risk factors for mental illnesses is of significant public health importance. Difficulties in diagnosis, the stigma associated with mental illnesses, comorbidity, and complex etiologies, among other factors, make mental disorders very challenging to study. Genetic studies of mental illnesses date back at least a century, beginning with descriptive studies based on Mendelian laws of inheritance. A variety of study designs, including twin studies, family studies, linkage analysis, and, more recently, genomewide association studies, have been employed to study the genetics of mental illnesses and of complex diseases in general. In this paper, I will present the challenges and methods from a statistical perspective and focus on genetic association studies.

Article information

Statist. Sci., Volume 26, Number 1 (2011), 116-129.

First available in Project Euclid: 9 June 2011


Keywords: Comorbidity; covariate adjusted association test; FBAT; Kendall’s tau; multiple traits; ordinal traits


Zhang, Heping. Statistical Analysis in Genetic Studies of Mental Illnesses. Statist. Sci. 26 (2011), no. 1, 116--129. doi:10.1214/11-STS353.


