The Annals of Statistics

Defining probability density for a distribution of random functions

Aurore Delaigle and Peter Hall

Full-text: Open access

Abstract

The notion of probability density for a random function is not as straightforward as in finite-dimensional cases. While a probability density function generally does not exist for functional data, we show that it is possible to develop the notion of density when functional data are considered in the space determined by the eigenfunctions of principal component analysis. This leads to a transparent and meaningful surrogate for density defined in terms of the average value of the logarithms of the densities of the distributions of principal components for a given dimension. This density approximation is readily estimable from data. It accurately represents, in a monotone way, key features of small-ball approximations to density. Our results on estimators of the densities of principal component scores are also of independent interest; they reveal interesting shape differences that have not previously been considered. The statistical implications of these results and properties are identified and discussed, and practical ramifications are illustrated in numerical work.
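The surrogate described in the abstract, the average of the log-densities of the first r principal component scores, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`fpca_scores`, `gaussian_kde_1d`, `log_density_surrogate`), the use of curves sampled on a common grid, the Gaussian kernel, the normal-reference bandwidth, and the simulated data. The authors' estimator and bandwidth choice may differ.

```python
import numpy as np


def fpca_scores(curves, r):
    """Standardised scores on the first r empirical principal components.

    curves: array of shape (n, m), n curves sampled on a common grid of m points.
    """
    centred = curves - curves.mean(axis=0)
    # Rows of vt are the empirical eigenvectors of the sample covariance.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    eigvals = s[:r] ** 2 / curves.shape[0]
    # Dividing by sqrt(eigenvalue) gives each score unit variance.
    return centred @ vt[:r].T / np.sqrt(eigvals)


def gaussian_kde_1d(sample, points, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated at `points`."""
    sample = np.asarray(sample, dtype=float)
    points = np.atleast_1d(np.asarray(points, dtype=float))
    n = sample.size
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not the paper's choice).
        bandwidth = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)
    z = (points[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))


def log_density_surrogate(curves, r, bandwidth=None):
    """Average over j = 1..r of log f_j(x_j), evaluated at each observed curve."""
    scores = fpca_scores(curves, r)
    log_f = np.column_stack([
        np.log(gaussian_kde_1d(scores[:, j], scores[:, j], bandwidth))
        for j in range(r)
    ])
    return log_f.mean(axis=1)


# Simulated curves built from three sine components (illustrative data only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
basis = np.vstack([np.sin(k * np.pi * t) for k in (1, 2, 3)])
coef = rng.normal(size=(200, 3)) * np.array([2.0, 1.0, 0.5])
curves = coef @ basis

ell = log_density_surrogate(curves, r=2)
most_typical = curves[np.argmax(ell)]  # curve with the largest surrogate value
```

Larger values of `ell` mark curves whose leading scores fall where the estimated score densities are high, so ranking curves by `ell` gives a rough ordering from typical to atypical at resolution level `r`.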

Article information

Source
Ann. Statist., Volume 38, Number 2 (2010), 1171-1193.

Dates
First available in Project Euclid: 19 February 2010

Permanent link to this document
https://projecteuclid.org/euclid.aos/1266586626

Digital Object Identifier
doi:10.1214/09-AOS741

Mathematical Reviews number (MathSciNet)
MR2604709

Zentralblatt MATH identifier
1183.62061

Subjects
Primary: 62G05: Estimation
Secondary: 62G07: Density estimation

Keywords
Density estimation; dimension; eigenfunction; eigenvalue; functional data analysis; kernel methods; log-density estimation; nonparametric statistics; principal components analysis; probability density function; resolution level; scale space

Citation

Delaigle, Aurore; Hall, Peter. Defining probability density for a distribution of random functions. Ann. Statist. 38 (2010), no. 2, 1171–1193. doi:10.1214/09-AOS741. https://projecteuclid.org/euclid.aos/1266586626


