Source: Bernoulli Volume 15, Number 4 (2009), 1154-1178.
In the context of density level set estimation, we study the convergence of general plug-in methods under two main assumptions on the density for a given level λ. More precisely, it is assumed that the density (i) is smooth in a neighborhood of λ and (ii) has γ-exponent at level λ. Condition (i) ensures that the density can be estimated at a standard nonparametric rate, and condition (ii) is similar to Tsybakov’s margin assumption, which is stated for the classification framework. Under these assumptions, we derive optimal rates of convergence for plug-in estimators. Explicit convergence rates are given for plug-in estimators based on kernel density estimators when the underlying measure is the Lebesgue measure. Lower bounds are also provided, proving that the rates are optimal in a minimax sense when the density is Hölder smooth.
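As an illustration only, the plug-in idea can be sketched in one dimension: estimate the density with a kernel density estimator, then keep the points where the estimate is at least λ. The sketch below is not taken from the paper. The Gaussian kernel, the bandwidth n^(-1/5), the sample size, the grid, and the level 0.2 are all assumptions chosen for the example, and the function names `gaussian_kde` and `plug_in_level_set` are hypothetical.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def p_hat(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return p_hat


def plug_in_level_set(p_hat, level, grid):
    """Grid points where the estimated density is at least `level`."""
    return [x for x in grid if p_hat(x) >= level]


# Assumed toy setting: 500 draws from a standard normal density.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]

# Assumed bandwidth n^(-1/5); an illustrative choice, not the paper's.
bandwidth = len(sample) ** (-1.0 / 5.0)

# Evaluation grid on [-4, 4] with step 0.05 (assumption).
grid = [-4.0 + 0.05 * i for i in range(161)]

level = 0.2  # the level lambda (assumption)
p_hat = gaussian_kde(sample, bandwidth)
estimate = plug_in_level_set(p_hat, level, grid)

# Endpoints of the estimated level set on the grid.
print(min(estimate), max(estimate))
```

For the standard normal density, the true level set at 0.2 is an interval of roughly |x| ≤ 1.18, so the printed endpoints give a rough sense of how close the thresholded estimate comes in this toy setting.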
References
Audibert, J.-Y. and Tsybakov, A. (2005). Fast learning rates for plug-in classifiers under the margin condition. Technical report, Laboratoire de Probabilités et Modèles Aléatoires de Paris 6. Available at http://arxiv.org/abs/math/0507180.
Audibert, J.-Y. and Tsybakov, A. (2007). Fast learning rates for plug-in classifiers. Ann. Statist. 35 608–633.
Baíllo, A. (2003). Total error in a plug-in estimator of level sets. Statist. Probab. Lett. 65 411–417.
Baíllo, A., Cuesta-Albertos, J.A. and Cuevas, A. (2001). Convergence rates in nonparametric estimation of level sets. Statist. Probab. Lett. 53 27–35.
Birgé, L. and Massart, P. (2001). Gaussian model selection. J. Eur. Math. Soc. (JEMS) 3 203–268.
Cuevas, A. and Fraiman, R. (1997). A plug-in approach to support estimation. Ann. Statist. 25 2300–2312.
Cuevas, A., González-Manteiga, W. and Rodríguez-Casal, A. (2006). Plug-in estimation of general level sets. Aust. N. Z. J. Stat. 48 7–19.
Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Applications of Mathematics (New York) 31. New York: Springer.
Devroye, L. and Wise, G.L. (1980). Detection of abnormal behavior via nonparametric estimation of the support. SIAM J. Appl. Math. 38 480–488.
Gayraud, G. and Rousseau, J. (2005). Rates of convergence for a Bayesian level set estimation. Scand. J. Statist. 32 639–660.
Hartigan, J.A. (1987). Estimation of a convex density contour in two dimensions. J. Amer. Statist. Assoc. 82 267–270.
Hartigan, J.A. (1975). Clustering Algorithms. New York: Wiley.
Mammen, E. and Tsybakov, A.B. (1999). Smooth discrimination analysis. Ann. Statist. 27 1808–1829.
Molchanov, I.S. (1998). A limit theorem for solutions of inequalities. Scand. J. Statist. 25 235–242.
Müller, D.W. and Sawitzki, G. (1987). Using excess mass estimates to investigate the modality of a distribution. Technical Report 398, SFB 123, Univ. Heidelberg.
Polonik, W. (1995). Measuring mass concentrations and estimating density contour clusters – an excess mass approach. Ann. Statist. 23 855–881.
Polonik, W. (1997). Minimum volume sets and generalized quantile processes. Stochastic Process. Appl. 69 1–24.
Rigollet, P. (2006). Oracle inequalities, aggregation and adaptation. Ph.D. thesis, Université Paris–VI. Available at http://tel.archives-ouvertes.fr/tel-00115494.
Rigollet, P. (2007). Generalization error bounds in semi-supervised classification under the cluster assumption. J. Mach. Learn. Res. 8 1369–1392.
Scott, C.D. and Nowak, R.D. (2006). Learning minimum volume sets. J. Mach. Learn. Res. 7 665–704.
Steinwart, I., Hush, D. and Scovel, C. (2005). A classification framework for anomaly detection. J. Mach. Learn. Res. 6 211–232.
Stuetzle, W. (2003). Estimating the cluster type of a density by analyzing the minimal spanning tree of a sample. J. Classification 20 25–47.
Tarigan, B. and van de Geer, S. (2006). Classifiers of support vector machine type with l1 complexity regularization. Bernoulli 12 1045–1076.
Tsybakov, A.B. (1997). On nonparametric estimation of density level sets. Ann. Statist. 25 948–969.
Tsybakov, A.B. (2004). Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32 135–166.
Tsybakov, A.B. (2009). Introduction to Nonparametric Estimation. Springer Series in Statistics. New York: Springer.
Tsybakov, A.B. and van de Geer, S.A. (2005). Square root penalty: Adaptation to the margin in classification and in edge estimation. Ann. Statist. 33 1203–1224.
Vapnik, V. (1998). Statistical Learning Theory. New York: Wiley.
Yang, Y. (1999). Minimax nonparametric classification – part I: Rates of convergence. IEEE Trans. Inform. Theory 45 2271–2284.