Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335–1353; Mach. Learn. 66 (2007) 209–242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms of the parameter $\alpha$ of margin conditions and a rate $s$ of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness $\beta$ of the regression function that governs its approximability by piecewise polynomials on adaptive partitions. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Hölder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.
"Classification algorithms using adaptive partitioning." Ann. Statist. 42 (6) 2141 - 2163, December 2014. https://doi.org/10.1214/14-AOS1234