The Annals of Applied Statistics

A functional data analysis approach for genetic association studies

Matthew Reimherr and Dan Nicolae



We present a new method based on Functional Data Analysis (FDA) for detecting associations between one or more scalar covariates and a longitudinal response, while correcting for other variables. Our methods exploit the temporal structure of longitudinal data in ways that are difficult with a multivariate approach. From an FDA perspective, our procedure departs from more established methods in two key respects. First, the raw longitudinal phenotypes are assembled into functional trajectories prior to analysis. Second, we explore an association test that is not directly based on principal components. We instead focus on quantifying the reduction in $L^{2}$ variability as a means of detecting associations. Our procedure is motivated by longitudinal genome-wide association studies and, in particular, the Childhood Asthma Management Program (CAMP), which explores the long-term effects of daily asthma treatments. We conduct a simulation study to better understand the advantages and disadvantages of an FDA approach compared to a traditional multivariate one. We then apply our methodology to data from CAMP. We find a potentially new association with a SNP negatively affecting lung function. Furthermore, this SNP seems to have an interaction effect with one of the treatments.
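As a rough illustration of testing for association through a reduction in $L^{2}$ variability, the sketch below fits an intercept-only model and an intercept-plus-covariate model pointwise on a common time grid, then integrates the drop in squared residuals. It is not the procedure of the paper: the simulated data, the function names `l2_reduction` and `permutation_pvalue`, the dense common grid, the absence of adjustment covariates, and the use of a permutation null in place of an asymptotic distribution are all assumptions made for the example.

```python
import numpy as np


def l2_reduction(Y, x, grid):
    """Integrated drop in squared residuals when x is added to an intercept-only fit.

    Y    : (n, m) array, one curve per subject observed on a common grid
    x    : (n,) scalar covariate (for example a genotype count)
    grid : (m,) increasing time points
    """
    n = len(x)
    design = np.column_stack([np.ones(n), x])
    resid_reduced = Y - Y.mean(axis=0)                  # intercept-only fit
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)   # pointwise least squares
    resid_full = Y - design @ coef
    # Pointwise reduction in the residual sum of squares (nonnegative, nested models)
    drop = (resid_reduced ** 2).sum(axis=0) - (resid_full ** 2).sum(axis=0)
    # Trapezoidal rule approximates the L2 integral over time
    return float(np.sum((drop[1:] + drop[:-1]) * np.diff(grid)) / 2.0)


def permutation_pvalue(Y, x, grid, n_perm=500, seed=0):
    """Permutation p-value: shuffle x to break any association with the curves."""
    rng = np.random.default_rng(seed)
    observed = l2_reduction(Y, x, grid)
    exceed = sum(
        l2_reduction(Y, rng.permutation(x), grid) >= observed
        for _ in range(n_perm)
    )
    return (1 + exceed) / (1 + n_perm)


# Simulated example (hypothetical data, not CAMP)
rng = np.random.default_rng(1)
n, m = 200, 25
grid = np.linspace(0.0, 1.0, m)
x = rng.binomial(2, 0.3, size=n).astype(float)          # genotype coded 0/1/2
smooth_noise = rng.normal(size=(n, 1)) * np.sin(np.pi * grid)
white_noise = 0.5 * rng.normal(size=(n, m))
Y = 1.0 + np.outer(x, grid) + smooth_noise + white_noise  # effect beta(t) = t

stat = l2_reduction(Y, x, grid)
p_value = permutation_pvalue(Y, x, grid)
print(f"L2 reduction: {stat:.3f}, permutation p-value: {p_value:.4f}")
```

The statistic uses the whole trajectory at once rather than a handful of principal component scores, which is the contrast the abstract draws. With sparse or irregular measurements the curves would first have to be smoothed onto a common grid, a step this sketch leaves out.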

Article information

Ann. Appl. Stat., Volume 8, Number 1 (2014), 406-429.

First available in Project Euclid: 8 April 2014


Keywords: Functional data analysis; longitudinal data analysis; genome-wide association study; functional linear model; functional analysis of variance; hypothesis testing


Reimherr, Matthew; Nicolae, Dan. A functional data analysis approach for genetic association studies. Ann. Appl. Stat. 8 (2014), no. 1, 406–429. doi:10.1214/13-AOAS692.


