Consistency of data-driven histogram methods for density estimation and classification

Gábor Lugosi; Andrew Nobel

doi:10.1214/aos/1032894460

Ann. Statist. 24(2): 687-706 (April 1996). DOI: 10.1214/aos/1032894460

Abstract

We present general sufficient conditions for the almost sure $L_1$-consistency of histogram density estimates based on data-dependent partitions. Analogous conditions guarantee the almost-sure risk consistency of histogram classification schemes based on data-dependent partitions. Multivariate data are considered throughout.

In each case, the desired consistency requires shrinking cells, subexponential growth of a combinatorial complexity measure and sublinear growth of the number of cells. It is not required that the cells of every partition be rectangles with sides parallel to the coordinate axes or that each cell contain a minimum number of points. No assumptions are made concerning the common distribution of the training vectors.

We apply the results to establish the consistency of several known partitioning estimates, including the $k_n$-spacing density estimate, classifiers based on statistically equivalent blocks and classifiers based on multivariate clustering schemes.
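
To make the idea of a data-dependent partition concrete, the following is a minimal one-dimensional sketch of a spacing-based histogram density estimate: cell boundaries are placed at every $k$-th order statistic of the sample, and the estimate on each cell is the fraction of points in the cell divided by the cell length. The function name `kn_spacing_density`, the treatment of the last cell and the assumption of distinct sample points are illustrative choices made here, not details taken from the paper, whose definition may differ.

```python
import numpy as np

def kn_spacing_density(sample, k):
    """Sketch of a k-spacing histogram density estimate on 1-D data.

    Assumes at least two sample points, all distinct (hypothetical
    simplification). Returns (edges, heights): cell boundaries and the
    constant density value on each cell.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if n < 2 or not 1 <= k < n:
        raise ValueError("need n >= 2 and 1 <= k < n")
    # Boundaries at every k-th order statistic; the last cell ends at the
    # sample maximum and may hold a different number of points.
    idx = np.arange(0, n - 1, k)
    edges = np.append(x[idx], x[-1])
    # Count points per cell (np.histogram closes the last cell on the right).
    counts, _ = np.histogram(x, bins=edges)
    # Histogram estimate: (points in cell) / (n * cell length).
    heights = counts / (n * np.diff(edges))
    return edges, heights

edges, heights = kn_spacing_density(np.arange(10.0), k=3)
total_mass = float(np.sum(heights * np.diff(edges)))
```

Because the cells are determined by the order statistics, they are narrow where the data are dense and wide where the data are sparse, which is the kind of data-driven partition the consistency conditions are meant to cover. By construction the estimate integrates to one over the range of the sample.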

Citation

Gábor Lugosi, Andrew Nobel. "Consistency of data-driven histogram methods for density estimation and classification." Ann. Statist. 24(2): 687-706, April 1996. https://doi.org/10.1214/aos/1032894460

Information

Published: April 1996

First available in Project Euclid: 24 September 2002

zbMATH: 0859.62040

MathSciNet: MR1394983

Digital Object Identifier: 10.1214/aos/1032894460

Subjects:

Primary: 62G07

Secondary: 62H30

Keywords: histogram classification, histogram density estimation, partitioning rules, statistically equivalent blocks, Vapnik-Chervonenkis inequality