The Annals of Statistics

Optimal model selection for density estimation of stationary data under various mixing conditions

Matthieu Lerasle


We propose a block-resampling penalization method for marginal density estimation from observations that are not necessarily independent. When the data are β- or τ-mixing, the selected estimator satisfies oracle inequalities with leading constant asymptotically equal to 1.

We also prove, in this setting, the slope heuristic, a data-driven method for optimizing the leading constant in the penalty.
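To illustrate the idea behind the slope heuristic, here is a minimal sketch for regular histogram estimators on [0, 1]. It is not the paper's method: the function names, the penalty shape m/n, and the crude jump detection are all assumptions made for illustration, whereas the paper calibrates a block-resampling penalty.

```python
import numpy as np

def histogram_risk(data, m):
    """Empirical least-squares contrast of the regular m-bin histogram on [0, 1]:
    ||f_hat||^2 - (2/n) * sum_i f_hat(X_i) = -m * sum_j p_hat_j^2."""
    counts, _ = np.histogram(data, bins=m, range=(0.0, 1.0))
    p_hat = counts / len(data)          # empirical cell probabilities
    return -m * np.sum(p_hat ** 2)

def slope_heuristic(data, models, pen_shape):
    """Calibrate the penalty constant by the slope heuristic: locate the
    kappa at which the selected dimension collapses, then double it."""
    risks = {m: histogram_risk(data, m) for m in models}
    kappas = np.linspace(0.0, 10.0, 500)
    selected = [min(models, key=lambda m: risks[m] + k * pen_shape(m))
                for k in kappas]
    # First kappa where the selected model falls below half the largest one;
    # a crude estimate of the minimal penalty constant kappa_min.
    kappa_min = next((k for k, m in zip(kappas, selected) if m < max(models) / 2),
                     kappas[-1])
    kappa_hat = 2.0 * kappa_min         # final penalty: twice the minimal one
    return min(models, key=lambda m: risks[m] + kappa_hat * pen_shape(m))
```

A call such as `slope_heuristic(x, list(range(1, 51)), lambda m: m / len(x))` would select a bin count for data on [0, 1]; the penalty shape m/n assumed here is only a stand-in for the resampling-based penalty studied in the paper.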

Article information

Ann. Statist., Volume 39, Number 4 (2011), 1852–1877.

First available in Project Euclid: 26 July 2011

Primary: 62G07 (density estimation); 62G09 (resampling methods)
Secondary: 62M99 (none of the above, but in this section)

Keywords: density estimation; optimal model selection; resampling methods; slope heuristic; weak dependence


Lerasle, Matthieu. Optimal model selection for density estimation of stationary data under various mixing conditions. Ann. Statist. 39 (2011), no. 4, 1852--1877. doi:10.1214/11-AOS888.


  • Andrews, D. W. K. (1984). Nonstrong mixing autoregressive processes. J. Appl. Probab. 21 930–934.
  • Arlot, S. (2008). V-fold cross-validation improved: V-fold penalization. Available at arXiv:0802.0566v2.
  • Arlot, S. (2009). Model selection by resampling penalization. Electron. J. Stat. 3 557–624.
  • Arlot, S. and Massart, P. (2009). Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. 10 245–279.
  • Baraud, Y., Comte, F. and Viennet, G. (2001). Adaptive estimation in autoregression or β-mixing regression via model selection. Ann. Statist. 29 839–875.
  • Barron, A., Birgé, L. and Massart, P. (1999). Risk bounds for model selection via penalization. Probab. Theory Related Fields 113 301–413.
  • Berbee, H. C. P. (1979). Random Walks with Stationary Increments and Renewal Theory. Mathematical Centre Tracts 112. Mathematisch Centrum, Amsterdam.
  • Birgé, L. and Massart, P. (1997). From model selection to adaptive estimation. In Festschrift for Lucien Le Cam 55–87. Springer, New York.
  • Birgé, L. and Massart, P. (2007). Minimal penalties for Gaussian model selection. Probab. Theory Related Fields 138 33–73.
  • Bousquet, O. (2002). A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris 334 495–500.
  • Bradley, R. C. (2007). Introduction to Strong Mixing Conditions. Vol. 1. Kendrick Press, Heber City, UT.
  • Comte, F., Dedecker, J. and Taupin, M. L. (2008). Adaptive density deconvolution with dependent inputs. Math. Methods Statist. 17 87–112.
  • Comte, F. and Merlevède, F. (2002). Adaptive estimation of the stationary density of discrete and continuous time mixing processes. ESAIM Probab. Stat. 6 211–238 (electronic).
  • Dedecker, J. and Prieur, C. (2005). New dependence coefficients. Examples and applications to statistics. Probab. Theory Related Fields 132 203–236.
  • Dedecker, J., Doukhan, P., Lang, G., León, J. R., Louhichi, S. and Prieur, C. (2007). Weak Dependence: With Examples and Applications. Lecture Notes in Statistics 190. Springer, New York.
  • Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. (1996). Density estimation by wavelet thresholding. Ann. Statist. 24 508–539.
  • Doukhan, P. (1994). Mixing: Properties and Examples. Lecture Notes in Statistics 85. Springer, New York.
  • Gannaz, I. and Wintenberger, O. (2009). Adaptive density estimation under dependence. ESAIM Probab. Stat. 14 151–172.
  • Klein, T. and Rio, E. (2005). Concentration around the mean for maxima of empirical processes. Ann. Probab. 33 1060–1077.
  • Künsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. Ann. Statist. 17 1217–1241.
  • Lacour, C. (2008). Nonparametric estimation of the stationary density and the transition density of a Markov chain. Stochastic Process. Appl. 118 232–260.
  • Lerasle, M. (2009). Adaptive density estimation of stationary β-mixing and τ-mixing processes. Math. Methods Statist. 18 59–83.
  • Lerasle, M. (2011a). Optimal model selection in density estimation. Ann. Inst. Henri Poincaré Probab. Stat. To appear. Available at arXiv:0910.1654.
  • Lerasle, M. (2011b). Supplement to “Optimal model selection for density estimation of stationary data under various mixing conditions.” DOI:10.1214/11-AOS888SUPP.
  • Liu, R. Y. and Singh, K. (1992). Moving block jackknife and bootstrap capture weak dependence. In Exploring the Limits of Bootstrap (R. Lepage and L. Billard, eds.) 225–248. Wiley, New York.
  • Massart, P. and Nédélec, E. (2006). Risk bounds for statistical learning. Ann. Statist. 34 2326–2366.
  • Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators. Scand. J. Stat. 9 65–78.
  • Volkonskiĭ, V. A. and Rozanov, Y. A. (1959). Some limit theorems for random functions. I. Teor. Veroyatn. Primen. 4 186–207.

Supplemental materials

  • Supplementary material: Proofs of Lemmas 5.1 and 5.2. The Supplementary Material gives complete proofs of the concentration Lemmas 5.1 and 5.2. We use the coupling results of Berbee (1979) and Dedecker and Prieur (2005) to build sequences of independent random variables (A_0^∗, …, A_(p−1)^∗) approximating the sequence of blocks (A_0, …, A_(p−1)), in the β- and τ-mixing cases, respectively. We then prove concentration lemmas equivalent to Lemmas 5.1 and 5.2 for these approximating random variables; the main tools are the concentration inequalities of Bousquet (2002) and Klein and Rio (2005) for suprema of empirical processes. Finally, we prove covariance inequalities to evaluate the expectation of p(m) and deduce the rates ε_n = (ln n)^(−1/2).
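The coupling argument starts from cutting the sample into consecutive blocks. A minimal sketch of that construction (the helper name `make_blocks` and the convention of dropping leftover observations are our assumptions, not the paper's):

```python
import numpy as np

def make_blocks(x, q):
    """Cut a stationary sequence into p = n // q consecutive blocks of length q,
    discarding the remainder. Under beta- or tau-mixing, distant blocks are
    nearly independent, which the couplings of Berbee (1979) and Dedecker and
    Prieur (2005) upgrade to genuinely independent approximating blocks."""
    p = len(x) // q                      # number of whole blocks
    return np.asarray(x)[: p * q].reshape(p, q)
```

For example, `make_blocks(np.arange(10), 3)` yields the three blocks (0, 1, 2), (3, 4, 5), (6, 7, 8), with the trailing observation 9 discarded.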