Bernoulli

Optimal rates for plug-in estimators of density level sets

Philippe Rigollet and Régis Vert
Source: Bernoulli Volume 15, Number 4 (2009), 1154-1178.

Abstract

In the context of density level set estimation, we study the convergence of general plug-in methods under two main assumptions on the density for a given level λ. More precisely, it is assumed that the density (i) is smooth in a neighborhood of λ and (ii) has γ-exponent at level λ. Condition (i) ensures that the density can be estimated at a standard nonparametric rate, and condition (ii) is similar to Tsybakov’s margin assumption, which is stated for the classification framework. Under these assumptions, we derive optimal rates of convergence for plug-in estimators. Explicit convergence rates are given for plug-in estimators based on kernel density estimators when the underlying measure is the Lebesgue measure. Lower bounds proving optimality of the rates in a minimax sense when the density is Hölder smooth are also provided.
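To make the plug-in idea concrete, here is a minimal sketch in Python, not the authors' implementation. It estimates a one-dimensional density with a Gaussian kernel density estimator and then thresholds the estimate at the level λ. Several things here are assumptions made for illustration only: the level set is taken to be {x : p(x) ≥ λ}, the sample is standard normal, the bandwidth follows an n^(-1/5) rule of thumb, and the names `gaussian_kde_1d` and `plug_in_level_set` are hypothetical.

```python
import numpy as np


def gaussian_kde_1d(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size

    def density(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        # Pairwise standardized distances between evaluation and sample points.
        z = (x[:, None] - sample[None, :]) / bandwidth
        return np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

    return density


def plug_in_level_set(density_hat, grid, level):
    """Boolean mask of grid points where the estimated density is at least `level`."""
    return density_hat(grid) >= level


# Illustrative data: a standard normal sample (an assumption, not from the paper).
rng = np.random.default_rng(0)
sample = rng.standard_normal(2000)

# Rule-of-thumb bandwidth of order n^(-1/5); an illustrative choice only.
bandwidth = sample.std() * sample.size ** (-1 / 5)

grid = np.linspace(-4.0, 4.0, 801)
level = 0.2  # the level lambda

mask = plug_in_level_set(gaussian_kde_1d(sample, bandwidth), grid, level)

# Lebesgue measure of the estimated level set, approximated on the grid.
estimated_volume = mask.sum() * (grid[1] - grid[0])
print(f"estimated level set volume: {estimated_volume:.3f}")
```

For a standard normal density the true level set at λ = 0.2 is an interval of half-width about 1.175, so its length is roughly 2.35. The printed estimate can be compared against that value.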

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.bj/1262962230
Digital Object Identifier: doi:10.3150/09-BEJ184
Zentralblatt MATH identifier: 1200.62034
Mathematical Reviews number (MathSciNet): MR2597587


2012 © Bernoulli Society for Mathematical Statistics and Probability
