We establish general sufficient conditions for the $L_2$-consistency of multivariate histogram regression estimates based on data-dependent partitions. These same conditions ensure the consistency of partitioning regression estimates based on local polynomial fits, and, with an additional regularity assumption, the consistency of histogram estimates for conditional medians.
Our conditions require shrinking cells, subexponential growth of a combinatorial complexity measure, and sublinear growth of restricted cell counts. It is not assumed that the cells of every partition be rectangles with sides parallel to the coordinate axes or that each cell contain a minimum number of points. Response variables are assumed to be bounded throughout.
Our results may be applied to a variety of partitioning schemes. We establish the consistency of histogram regression estimates based on cubic partitions with data-dependent offsets, k-thresholding in one dimension, and empirically optimal nearest-neighbor clustering schemes. In addition, it is shown that empirically optimal regression trees are consistent when the size of the trees grows with the number of samples at an appropriate rate.
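As an illustration of the simplest scheme mentioned above, the following is a minimal sketch of a histogram regression estimate on a cubic partition with a data-dependent offset. It is not taken from the paper: the function names, the choice of the coordinate-wise sample minimum as the offset, the fixed side length `h`, and the value returned for empty cells are all assumptions made for the example.

```python
import numpy as np

def fit_cubic_histogram(X, y, h):
    """Average the responses within each cube of side h.

    Assumption: the grid offset is the coordinate-wise minimum of the
    sample, one simple example of a data-dependent offset.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    offset = X.min(axis=0)
    # Integer cell index of each sample point along every coordinate.
    cells = np.floor((X - offset) / h).astype(int)
    sums, counts = {}, {}
    for key, yi in zip(map(tuple, cells), y):
        sums[key] = sums.get(key, 0.0) + yi
        counts[key] = counts.get(key, 0) + 1
    means = {key: sums[key] / counts[key] for key in sums}
    return offset, means

def predict_cubic_histogram(x, offset, means, h, default=0.0):
    """Return the cell average for the cube containing x.

    Assumption: cells containing no sample points return `default`.
    """
    x = np.asarray(x, dtype=float)
    key = tuple(np.floor((x - offset) / h).astype(int))
    return means.get(key, default)

# Usage: four one-dimensional points falling into two cells of width 0.5.
X = [[0.0], [0.1], [0.6], [0.9]]
y = [1.0, 3.0, 10.0, 20.0]
offset, means = fit_cubic_histogram(X, y, h=0.5)
```

In this sketch the estimate at a point is the average of the responses whose covariates share its cell. For the consistency results described above, the side length `h` would be chosen as a function of the sample size rather than held fixed.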
"Histogram regression estimation using data-dependent partitions." Ann. Statist. 24 (3) 1084 - 1105, June 1996. https://doi.org/10.1214/aos/1032526958