The Annals of Statistics
- Ann. Statist.
- Volume 24, Number 3 (1996), 1084-1105.
Histogram regression estimation using data-dependent partitions
We establish general sufficient conditions for the $L_2$-consistency of multivariate histogram regression estimates based on data-dependent partitions. These same conditions ensure the consistency of partitioning regression estimates based on local polynomial fits, and, with an additional regularity assumption, the consistency of histogram estimates for conditional medians.
Our conditions require shrinking cells, subexponential growth of a combinatorial complexity measure and sublinear growth of restricted cell counts. It is not assumed that the cells of every partition be rectangles with sides parallel to the coordinate axes or that each cell contain a minimum number of points. Response variables are assumed to be bounded throughout.
Our results may be applied to a variety of partitioning schemes. We establish the consistency of histogram regression estimates based on cubic partitions with data-dependent offsets, k-thresholding in one dimension and empirically optimal nearest-neighbor clustering schemes. In addition, it is shown that empirically optimal regression trees are consistent when the size of the trees grows with the number of samples at an appropriate rate.
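As an illustration only, a histogram regression estimate predicts at a point $x$ by averaging the responses of the sample points that fall in the same partition cell as $x$. The sketch below does this in one dimension for equal-width cells whose grid is shifted by a data-dependent offset. The function name `fit_histogram_regression`, the cell width `h`, the choice of the sample minimum as the offset, and the convention of predicting 0 in empty or out-of-range cells are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def fit_histogram_regression(x, y, h):
    """Fit a 1-D histogram regression estimate on cells of width h.

    The grid of cells is shifted so that it starts at the sample minimum
    of x. This offset rule is an illustrative, data-dependent choice and
    is not the scheme analyzed in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    offset = x.min()  # data-dependent offset (assumption)

    # Index of the cell [offset + j*h, offset + (j+1)*h) containing each x_i.
    idx = np.floor((x - offset) / h).astype(int)
    n_cells = idx.max() + 1

    # Per-cell counts and response sums, then cell averages.
    counts = np.bincount(idx, minlength=n_cells)
    sums = np.bincount(idx, weights=y, minlength=n_cells)
    means = np.divide(sums, counts, out=np.zeros(n_cells), where=counts > 0)

    def predict(x_new):
        """Return the cell average for each query point (0 if no data)."""
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        j = np.floor((x_new - offset) / h).astype(int)
        inside = (j >= 0) & (j < n_cells)
        out = np.zeros(x_new.shape)
        out[inside] = means[j[inside]]
        return out

    return predict
```

For example, `fit_histogram_regression(x, y, h=0.5)` returns a function that can be evaluated on new points. Consistency results of the kind described above concern what happens as the sample size grows and the cells shrink, which this fixed-width sketch does not attempt to tune.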
First available in Project Euclid: 20 September 2002
Primary: 62G07: Density estimation
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Nobel, Andrew. Histogram regression estimation using data-dependent partitions. Ann. Statist. 24 (1996), no. 3, 1084--1105. doi:10.1214/aos/1032526958. https://projecteuclid.org/euclid.aos/1032526958