## The Annals of Statistics

### Principal component analysis for functional data on Riemannian manifolds and spheres

#### Abstract

Functional data analysis on nonlinear manifolds has drawn recent interest. Sphere-valued functional data, which are encountered, for example, as movement trajectories on the surface of the earth, are an important special case. We consider an intrinsic principal component analysis for smooth Riemannian manifold-valued functional data and study its asymptotic properties. Riemannian functional principal component analysis (RFPCA) is carried out by first mapping the manifold-valued data through Riemannian logarithm maps to tangent spaces around the Fréchet mean function, and then performing a classical functional principal component analysis (FPCA) on the linear tangent spaces. Representations of the Riemannian manifold-valued functions and the eigenfunctions on the original manifold are then obtained with exponential maps. The tangent-space approximation yields upper bounds to residual variances if the Riemannian manifold has nonnegative curvature. We derive a central limit theorem for the mean function, as well as root-$n$ uniform convergence rates for other model components. Our applications include a novel framework for the analysis of longitudinal compositional data, achieved by mapping longitudinal compositional data to trajectories on the sphere, illustrated with longitudinal fruit fly behavior patterns. RFPCA is shown to outperform an unrestricted FPCA in terms of trajectory recovery and prediction in applications and simulations.
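The three steps named in the abstract (logarithm maps around the Fréchet mean function, a linear FPCA on the tangent vectors, exponential maps back to the manifold) can be sketched for the unit sphere as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: trajectories are assumed to be observed without noise on a common, equally spaced grid; the FPCA step is a plain SVD with no quadrature weights or smoothing; and all names (`sphere_exp`, `sphere_log`, `frechet_mean`, `rfpca`, `mean_geodesic_error`) are hypothetical. The componentwise square root used at the end to place compositions on the sphere is one common choice and is likewise an assumption here.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: start at p, move along tangent vector v."""
    r = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.where(r > 1e-12, r, 1.0)        # avoid 0/0 when v is the zero vector
    return np.cos(r) * p + np.sin(r) * v / safe

def sphere_log(p, x):
    """Logarithm map: tangent vector at p whose length is the geodesic distance to x."""
    inner = np.clip(np.sum(p * x, axis=-1, keepdims=True), -1.0, 1.0)
    u = x - inner * p                         # part of x orthogonal to p
    r = np.linalg.norm(u, axis=-1, keepdims=True)
    safe = np.where(r > 1e-12, r, 1.0)
    return np.arccos(inner) * u / safe

def frechet_mean(x, n_iter=50):
    """Pointwise intrinsic mean curve of x with shape (n, T, d), by fixed-point iteration."""
    mu = x.mean(axis=0)
    mu = mu / np.linalg.norm(mu, axis=-1, keepdims=True)  # projected Euclidean mean as start
    for _ in range(n_iter):
        mu = sphere_exp(mu, sphere_log(mu, x).mean(axis=0))
    return mu

def rfpca(x, n_comp):
    """Tangent-space FPCA for sphere-valued curves x of shape (n, T, d)."""
    n, T, d = x.shape
    mu = frechet_mean(x)
    # Step 1: log-map every curve into the tangent spaces along the mean curve.
    v = sphere_log(mu, x).reshape(n, T * d)
    # Step 2: linear FPCA via SVD; the log-mapped data already average to ~0 at mu.
    _, _, vt = np.linalg.svd(v, full_matrices=False)
    phi = vt[:n_comp]                         # discretized tangent eigenfunctions
    scores = v @ phi.T
    # Step 3: map the truncated tangent representation back with the exponential map.
    fitted = sphere_exp(mu, (scores @ phi).reshape(n, T, d))
    return mu, phi, scores, fitted

def mean_geodesic_error(x, y):
    """Average great-circle distance between two arrays of unit vectors."""
    inner = np.clip(np.sum(x * y, axis=-1), -1.0, 1.0)
    return float(np.arccos(inner).mean())

# Synthetic sphere-valued curves: a great-circle arc perturbed in two tangent directions.
rng = np.random.default_rng(0)
n, T = 40, 25
t = np.linspace(0.0, 1.0, T)
base = np.stack([np.cos(t), np.sin(t), np.zeros(T)], axis=-1)    # (T, 3), unit length
e_in = np.stack([-np.sin(t), np.cos(t), np.zeros(T)], axis=-1)   # tangent along the arc
e_up = np.array([0.0, 0.0, 1.0])                                 # tangent across the arc
a = rng.normal(0.0, 0.3, size=(n, 1, 1))
b = rng.normal(0.0, 0.1, size=(n, 1, 1))
tangent = a * e_up + b * np.cos(np.pi * t)[None, :, None] * e_in
x = sphere_exp(base, tangent)                                    # (n, T, 3)

mu, phi, scores1, fit1 = rfpca(x, 1)
_, _, scores2, fit2 = rfpca(x, 2)
_, _, _, fit_full = rfpca(x, n)
err1 = mean_geodesic_error(x, fit1)
err2 = mean_geodesic_error(x, fit2)

# Compositional data (nonnegative parts summing to 1) land on the sphere via square roots.
comp = rng.dirichlet(np.ones(3), size=5)
on_sphere = np.sqrt(comp)
```

Because the truncated tangent vectors stay orthogonal to the mean curve, the fitted trajectories remain exactly on the sphere, which a linear FPCA applied directly to the ambient coordinates would not guarantee.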

#### Article information

Source
Ann. Statist., Volume 46, Number 6B (2018), 3334-3361.

Dates
Revised: October 2017
First available in Project Euclid: 11 September 2018

https://projecteuclid.org/euclid.aos/1536631276

Digital Object Identifier
doi:10.1214/17-AOS1660

Mathematical Reviews number (MathSciNet)
MR3852654

Zentralblatt MATH identifier
06965690

Subjects
Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties; 62G99: None of the above, but in this section

#### Citation

Dai, Xiongtao; Müller, Hans-Georg. Principal component analysis for functional data on Riemannian manifolds and spheres. Ann. Statist. 46 (2018), no. 6B, 3334–3361. doi:10.1214/17-AOS1660. https://projecteuclid.org/euclid.aos/1536631276

#### Supplemental materials

• Supplement to “Principal component analysis for functional data on Riemannian manifolds and spheres”. In the Supplementary Materials, we provide proofs of Corollary 2, Theorem 2 and Corollary 4; algorithms for RFPCA of compositional data; and additional simulations.