The Annals of Applied Probability

Asymptotic normality of plug-in level set estimates

David M. Mason and Wolfgang Polonik

Source: Ann. Appl. Probab. Volume 19, Number 3 (2009), 1108-1142.

Abstract

We establish the asymptotic normality of the G-measure of the symmetric difference between the level set and a plug-in-type estimator of it, formed by replacing the density in the definition of the level set with a kernel density estimator. Our proof highlights the efficacy of Poissonization methods in treating large-sample theory problems of this kind.
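
To make the quantity in the abstract concrete, the following is a minimal numerical sketch, not the authors' construction. It assumes a one-dimensional standard normal density, a Gaussian kernel, a level set of the form {x : f(x) >= c}, and G equal to Lebesgue measure unless a weight function g is supplied. The function names, the bandwidth, the level, and the grid are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kde(sample, grid, bandwidth):
    """Kernel density estimate on `grid` (Gaussian kernel is an assumption)."""
    # Pairwise scaled differences, shape (len(grid), len(sample)).
    u = (grid[:, None] - sample[None, :]) / bandwidth
    norm = len(sample) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * u ** 2).sum(axis=1) / norm


def symmetric_difference_measure(f_true, f_hat, level, grid, g=None):
    """Riemann-sum approximation of G(C symmetric-difference C_n).

    C = {f >= level} and C_n = {f_hat >= level}; G has density `g` with
    respect to Lebesgue measure (g = 1 when `g` is None). The grid is
    assumed to be equally spaced.
    """
    in_true = f_true >= level
    in_hat = f_hat >= level
    weights = np.ones_like(grid) if g is None else g
    dx = grid[1] - grid[0]
    # A grid point lies in the symmetric difference when exactly one
    # of the two sets contains it.
    return float(np.sum(weights[in_true != in_hat]) * dx)


# Illustrative setup (all values below are assumptions).
rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)
grid = np.linspace(-5.0, 5.0, 2001)
f_true = np.exp(-0.5 * grid ** 2) / np.sqrt(2.0 * np.pi)
f_hat = gaussian_kde(sample, grid, bandwidth=0.3)

d = symmetric_difference_measure(f_true, f_hat, level=0.1, grid=grid)
print(f"approximate measure of the symmetric difference: {d:.4f}")
```

The article concerns the limiting distribution of this kind of quantity as the sample size grows. The sketch only evaluates it once for a single simulated sample and says nothing about the centering, scaling, or bandwidth conditions needed for that result.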

Primary Subjects: 60F05, 60F15, 62E20, 62G07
Keywords: Central limit theorem; kernel density estimator; level set estimation

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoap/1245071021
Digital Object Identifier: doi:10.1214/08-AAP569
Zentralblatt MATH identifier: 05580235
Mathematical Reviews number (MathSciNet): MR2537201

2009 © Institute of Mathematical Statistics