The Annals of Statistics

Assessing extrema of empirical principal component functions

Peter Hall and Céline Vial

Full-text: Open access

Abstract

The difficulties of estimating and representing the distributions of functional data mean that principal component methods play a substantially greater role in functional data analysis than in more conventional finite-dimensional settings. Local maxima and minima in principal component functions are of direct importance; they indicate places in the domain of a random function where influence on the function value tends to be relatively strong but of opposite sign. We explore statistical properties of the relationship between extrema of empirical principal component functions and their counterparts for the true principal component functions. It is shown that empirical principal component functions have relatively little trouble capturing conventional extrema, but can experience difficulty distinguishing a “shoulder” in a curve from a small bump. For example, when the true principal component function has a shoulder, the probability that the empirical principal component function instead has a bump is approximately equal to ½. We suggest bootstrap methods for assessing the strength of extrema and describe their performance. It is shown that the subsample bootstrap is more effective than the standard bootstrap in this regard. A “bootstrap likelihood” is proposed for measuring extremum strength. Exploratory numerical methods are suggested.
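The general idea of resampling to gauge how stable the extrema of an empirical principal component function are can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: curves are assumed to be observed on a common grid, subsamples are assumed to be drawn without replacement, and the function names (`first_pc_function`, `count_turning_points`, `subsample_turning_point_counts`) and the subsample size `m` are hypothetical choices made for the example. The “bootstrap likelihood” mentioned in the abstract is not implemented here.

```python
import numpy as np


def first_pc_function(curves):
    """Leading eigenvector of the sample covariance of discretized curves.

    `curves` has one curve per row, all sampled on the same grid (assumption).
    """
    centered = curves - curves.mean(axis=0)
    cov = centered.T @ centered / curves.shape[0]
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues are returned in ascending order
    return eigvecs[:, -1]             # unit-norm vector; its sign is arbitrary


def count_turning_points(values):
    """Count sign changes in the first differences, ignoring flat steps."""
    signs = np.sign(np.diff(values))
    signs = signs[signs != 0]
    return int(np.sum(signs[1:] != signs[:-1]))


def subsample_turning_point_counts(curves, m, n_resamples, rng):
    """Turning-point counts of the first PC over subsamples of size m.

    Subsamples are drawn without replacement (an assumption of this sketch).
    """
    n = curves.shape[0]
    counts = np.empty(n_resamples, dtype=int)
    for b in range(n_resamples):
        idx = rng.choice(n, size=m, replace=False)
        counts[b] = count_turning_points(first_pc_function(curves[idx]))
    return counts


if __name__ == "__main__":
    # Synthetic data (illustrative only): one smooth component plus noise.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 21)
    scores = rng.normal(size=200)
    noise = 0.05 * rng.normal(size=(200, t.size))
    curves = scores[:, None] * np.sin(np.pi * t)[None, :] + noise
    counts = subsample_turning_point_counts(curves, m=50, n_resamples=100, rng=rng)
    # Frequency of each turning-point count across subsamples.
    print(np.bincount(counts))
```

The count of turning points does not depend on the sign of the eigenvector, so no sign alignment is needed across resamples. In practice the curves would usually be smoothed before differencing, since observation noise alone produces spurious sign changes.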

Article information

Source
Ann. Statist., Volume 34, Number 3 (2006), 1518-1544.

Dates
First available in Project Euclid: 10 July 2006

Permanent link to this document
https://projecteuclid.org/euclid.aos/1152540757

Digital Object Identifier
doi:10.1214/009053606000000371

Mathematical Reviews number (MathSciNet)
MR2278366

Zentralblatt MATH identifier
1113.62074

Subjects
Primary: 62H25: Factor analysis and principal components; correspondence analysis
Secondary: 62G09: Resampling methods

Keywords
Bootstrap; bootstrap likelihood; functional data analysis; mode; perturbation; principal components analysis; shoulder point; subsample; turning point

Citation

Hall, Peter; Vial, Céline. Assessing extrema of empirical principal component functions. Ann. Statist. 34 (2006), no. 3, 1518-1544. doi:10.1214/009053606000000371. https://projecteuclid.org/euclid.aos/1152540757


