Open Access
Longitudinal high-dimensional principal components analysis with application to diffusion tensor imaging of multiple sclerosis
Vadim Zipunnikov, Sonja Greven, Haochang Shou, Brian S. Caffo, Daniel S. Reich, Ciprian M. Crainiceanu
Ann. Appl. Stat. 8(4): 2175-2202 (December 2014). DOI: 10.1214/14-AOAS748
Abstract

We develop a flexible framework for modeling high-dimensional imaging data observed longitudinally. The approach decomposes the observed variability of repeatedly measured high-dimensional observations into three additive components: a subject-specific imaging random intercept that quantifies the cross-sectional variability, a subject-specific imaging slope that quantifies the dynamic irreversible deformation over multiple realizations, and a subject-visit-specific imaging deviation that quantifies exchangeable effects between visits. The proposed method is very fast, scalable to studies including ultrahigh-dimensional data, and can easily be adapted to and executed on modest computing infrastructures. The method is applied to the longitudinal analysis of diffusion tensor imaging (DTI) data of the corpus callosum of multiple sclerosis (MS) subjects. The study includes 176 subjects observed at 466 visits. For each subject and visit the study contains a registered DTI scan of the corpus callosum at roughly 30,000 voxels.
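
As a sketch only, the three-part additive decomposition described above can be written schematically as follows. The notation is an assumption introduced here for illustration and is not taken from the abstract: $Y_{ij}(v)$ denotes the image of subject $i$ at visit $j$ evaluated at voxel $v$, $T_{ij}$ the time of that visit, and $\eta$ a fixed-effect mean term, which the abstract does not mention.

```latex
% Schematic form of the decomposition; all symbols are illustrative assumptions.
\[
  Y_{ij}(v) \;=\; \eta(v, T_{ij})
  \;+\; \underbrace{X_{i,0}(v)}_{\text{subject-specific random intercept}}
  \;+\; \underbrace{X_{i,1}(v)\, T_{ij}}_{\text{subject-specific slope}}
  \;+\; \underbrace{W_{ij}(v)}_{\text{subject-visit-specific deviation}}
\]
```

Under this reading, $X_{i,0}$ would capture cross-sectional variability between subjects, $X_{i,1}$ the irreversible change over time within a subject, and $W_{ij}$ the exchangeable visit-to-visit effects.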

Copyright © 2014 Institute of Mathematical Statistics
Vadim Zipunnikov, Sonja Greven, Haochang Shou, Brian S. Caffo, Daniel S. Reich, and Ciprian M. Crainiceanu "Longitudinal high-dimensional principal components analysis with application to diffusion tensor imaging of multiple sclerosis," The Annals of Applied Statistics 8(4), 2175-2202, (December 2014). https://doi.org/10.1214/14-AOAS748