## The Annals of Statistics

### Tree-structured regression and the differentiation of integrals

Richard A. Olshen

#### Abstract

This paper provides answers to questions regarding the almost sure limiting behavior of rooted, binary tree-structured rules for regression. Examples show that questions raised by Gordon and Olshen in 1984 have negative answers. For these examples of regression functions and sequences of their associated binary tree-structured approximations, for all regression functions except those in a set of the first category, almost sure consistency fails dramatically on events of full probability. One consequence is that almost sure consistency of binary tree-structured rules such as CART requires conditions beyond requiring that (1) the regression function be in $\mathcal{L}^1$, (2) partitions of a Euclidean feature space be into polytopes with sides parallel to coordinate axes, (3) the mesh of the partitions become arbitrarily fine almost surely and (4) the empirical learning sample content of each polytope be “large enough.” The material in this paper includes the solution to a problem raised by Dudley in discussions. The main results have a corollary regarding the lack of almost sure consistency of certain Bayes-risk consistent rules for classification.

#### Article information

Source
Ann. Statist., Volume 35, Number 1 (2007), 1–12.

Dates
First available in Project Euclid: 6 June 2007

https://projecteuclid.org/euclid.aos/1181100178

Digital Object Identifier
doi:10.1214/009053606000001000

Mathematical Reviews number (MathSciNet)
MR2332266

Zentralblatt MATH identifier
1122.62027

#### Citation

Olshen, Richard A. Tree-structured regression and the differentiation of integrals. Ann. Statist. 35 (2007), no. 1, 1–12. doi:10.1214/009053606000001000. https://projecteuclid.org/euclid.aos/1181100178

#### References

• Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA. Since 1993 this book has been published by Chapman and Hall, New York.
• Busemann, H. and Feller, W. (1934). Zur Differentiation der Lebesgueschen Integrale. Fund. Math. 22 226–256.
• de Guzmán, M. (1975). Differentiation of Integrals in R$^n$. Lecture Notes in Math. 481. Springer, Berlin.
• Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer, New York.
• Devroye, L. and Krzyżak, A. (2002). New multivariate product density estimators. J. Multivariate Anal. 82 88–110.
• Donoho, D. L. (1997). CART and best-ortho-basis: A connection. Ann. Statist. 25 1870–1911.
• Gersho, A. and Gray, R. M. (1992). Vector Quantization and Signal Compression. Kluwer, Dordrecht.
• Gordon, L. and Olshen, R. A. (1984). Almost surely consistent nonparametric regression from recursive partitioning schemes. J. Multivariate Anal. 15 147–163.
• Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, New York.
• Lugosi, G. and Nobel, A. (1996). Consistency of data-driven histogram methods for density estimation and classification. Ann. Statist. 24 687–706.
• Nobel, A. (1996). Histogram regression estimation using data-dependent partitions. Ann. Statist. 24 1084–1105.
• Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge Univ. Press.
• Saks, S. (1934). Remarks on the differentiability of the Lebesgue indefinite integral. Fund. Math. 22 257–261.
• Stone, C. J. (1977). Consistent nonparametric regression (with discussion). Ann. Statist. 5 595–645.
• Zhang, H. and Singer, B. (1999). Recursive Partitioning in the Health Sciences. Springer, New York.