The Annals of Applied Statistics

Functional principal variance component testing for a genetic association study of HIV progression

Denis Agniel, Wen Xie, Myron Essex, and Tianxi Cai



HIV-1C is the most prevalent subtype of HIV-1 and accounts for over half of HIV-1 infections worldwide. Host genetic influence on HIV infection has been previously studied in HIV-1B, but little attention has been paid to the more prevalent subtype C. To understand the role of host genetics in HIV-1C disease progression, we perform a study to assess the association between longitudinally collected measures of disease and more than 100,000 genetic markers located on chromosome 6. The most common approach to analyzing longitudinal data in this context is the linear mixed effects model, which may be overly simplistic in this case. On the other hand, existing flexible and nonparametric methods either require densely sampled points, restrict attention to a single nucleotide polymorphism (SNP), lack testing procedures, or are cumbersome to fit on the genome-wide scale. We propose a functional principal variance component (FPVC) testing framework that captures the nonlinearity in CD4 count and viral load with low degrees of freedom and is fast enough to carry out thousands or millions of times. FPVC testing unfolds in two stages. In the first stage, we summarize the markers of disease progression according to their major patterns of variation via functional principal component analysis (FPCA). In the second stage, we employ a simple working model and variance component testing to examine the association between the summaries of disease progression and a set of SNPs. We supplement this analysis with simulation results, which indicate that FPVC testing can offer large power gains over the standard linear mixed effects model.
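The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes curves observed on a dense common time grid (the paper targets longitudinal data that need not be densely sampled), uses simulated data, and replaces the derived null distribution of the score statistic with a simple permutation test. The function names `fpca_scores`, `vc_statistic`, and `permutation_pvalue` are hypothetical.

```python
import numpy as np

def fpca_scores(Y, n_components=2):
    """Stage 1 (simplified): FPCA scores for curves Y of shape
    (n_subjects, n_timepoints), all observed on one common grid."""
    centered = Y - Y.mean(axis=0)
    cov = centered.T @ centered / (Y.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order]              # (n_subjects, n_components)

def vc_statistic(scores, G):
    """Stage 2 (simplified): score-type statistic, the sum of squared
    cross-products between standardized scores and centered markers."""
    S = scores - scores.mean(axis=0)
    S = S / S.std(axis=0, ddof=1)
    Gc = G - G.mean(axis=0)
    return float(np.sum((Gc.T @ S) ** 2))

def permutation_pvalue(scores, G, n_perm=500, seed=0):
    """Assumption: a permutation null stands in for the analytic null."""
    rng = np.random.default_rng(seed)
    observed = vc_statistic(scores, G)
    null = np.array([
        vc_statistic(scores[rng.permutation(len(scores))], G)
        for _ in range(n_perm)
    ])
    return (1 + np.sum(null >= observed)) / (n_perm + 1)

# Simulated example: 200 subjects, 20 time points, a set of 5 SNPs.
rng = np.random.default_rng(1)
n, n_times = 200, 20
t = np.linspace(0.0, 1.0, n_times)
G = rng.binomial(2, 0.3, size=(n, 5)).astype(float)  # minor-allele counts
b = rng.normal(size=n)                               # subject-level random effect
# The first SNP shifts the amplitude of a nonlinear trajectory.
Y = 5.0 + np.outer(b + 0.8 * G[:, 0], np.sin(np.pi * t))
Y += rng.normal(scale=0.5, size=(n, n_times))

scores = fpca_scores(Y, n_components=2)
p_value = permutation_pvalue(scores, G)
print(f"permutation p-value for the SNP set: {p_value:.4f}")
```

Because the trajectories are reduced to a few scores per subject before any marker is examined, the first stage is computed once and only the inexpensive second stage is repeated for each SNP set, which is what makes this kind of procedure practical at the genome-wide scale.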

Article information

Ann. Appl. Stat., Volume 12, Number 3 (2018), 1871-1893.

Received: November 2015
Revised: July 2017
First available in Project Euclid: 11 September 2018


Keywords

Genomic association studies; HIV disease progression; functional principal component analysis; longitudinal data; mixed effects models; variance component testing


Agniel, Denis; Xie, Wen; Essex, Myron; Cai, Tianxi. Functional principal variance component testing for a genetic association study of HIV progression. Ann. Appl. Stat. 12 (2018), no. 3, 1871–1893. doi:10.1214/18-AOAS1135.




Supplemental materials

  • Supplementary proofs and plots (DOI: 10.1214/18-AOAS1135SUPP). We provide the derivation of the form of the score statistic, a proof of its null distribution, and the supporting assumptions. We also include the form of the eigenfunctions for the HIV data analysis.