Electronic Journal of Statistics

Adaptive and minimax estimation of the cumulative distribution function given a functional covariate

Gaëlle Chagny and Angelina Roche

Full-text: Open access

Abstract

We consider the nonparametric kernel estimation of the conditional cumulative distribution function given a functional covariate. Given the bias-variance trade-off of the risk, we first propose a totally data-driven bandwidth selection mechanism in the spirit of the recent Goldenshluger-Lepski method and of model selection tools. The resulting estimator is shown to be adaptive and minimax optimal: we establish nonasymptotic risk bounds and compute rates of convergence under various assumptions on the decay of the small ball probability of the functional variable. We also prove lower bounds. Both pointwise and integrated criteria are considered. Finally, the choice of the norm or semi-norm involved in the definition of the estimator is also discussed, as well as the projection of the data on finite dimensional subspaces. Numerical results illustrate the method.
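As a rough illustration of the kind of estimator the abstract describes, the sketch below computes a kernel estimate of the conditional cumulative distribution function P(Y <= y | X = x0) for curves observed on a common grid. It is a minimal sketch under stated assumptions, not the authors' implementation: the uniform kernel, the L2 distance approximated by a Riemann sum, the convention used when no curve falls within the bandwidth, and the names `l2_distance` and `conditional_cdf` are all choices made here for illustration. The bandwidth `h` is fixed by hand; the data-driven selection rule studied in the paper is not reproduced.

```python
import math


def l2_distance(u, v, dt):
    """Approximate L2 distance between two curves sampled on a common regular grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) * dt)


def conditional_cdf(x0, y, curves, responses, h, dt):
    """Kernel estimate of P(Y <= y | X = x0) with a uniform kernel (an assumption)."""
    # Weight 1 for curves within distance h of x0, weight 0 otherwise.
    weights = [1.0 if l2_distance(x, x0, dt) <= h else 0.0 for x in curves]
    total = sum(weights)
    if total == 0.0:
        # Convention chosen for this sketch: return 0 when no curve is close to x0.
        return 0.0
    # Weighted proportion of responses that do not exceed y.
    return sum(w for w, r in zip(weights, responses) if r <= y) / total


# Toy data (hypothetical): constant curves sampled at 10 grid points on [0, 1).
dt = 0.1
levels = [0.0, 0.1, 0.2, 1.0, 1.1]
curves = [[c] * 10 for c in levels]
responses = [1.0, 2.0, 3.0, 10.0, 11.0]
x0 = [0.1] * 10

# Three curves lie within h = 0.25 of x0, and two of them have a response <= 2.0.
estimate = conditional_cdf(x0, 2.0, curves, responses, h=0.25, dt=dt)
```

A small bandwidth keeps only curves very close to `x0` (low bias, high variance), while a large one averages over many curves (low variance, high bias), which is the trade-off that motivates a data-driven choice of `h`.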

Article information

Source
Electron. J. Statist., Volume 8, Number 2 (2014), 2352-2404.

Dates
First available in Project Euclid: 6 November 2014

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1415285928

Digital Object Identifier
doi:10.1214/14-EJS956

Mathematical Reviews number (MathSciNet)
MR3275747

Zentralblatt MATH identifier
1302.62082

Subjects
Primary: 62G05: Estimation
Secondary: 62H12: Estimation

Keywords
Adaptive kernel estimator; conditional cumulative distribution function; minimax estimation; functional random variable; small ball probability

Citation

Chagny, Gaëlle; Roche, Angelina. Adaptive and minimax estimation of the cumulative distribution function given a functional covariate. Electron. J. Statist. 8 (2014), no. 2, 2352-2404. doi:10.1214/14-EJS956. https://projecteuclid.org/euclid.ejs/1415285928


