Minimum Impurity Partitions
David Burshtein, Vincent Della Pietra, Dimitri Kanevsky, Arthur Nadas
Ann. Statist. 20(3): 1637-1646 (September, 1992). DOI: 10.1214/aos/1176348789


Let $(X, U)$ be jointly distributed on $\mathscr{X} \times \mathscr{R}^n$. Let $Y = E(U\mid X)$ and let $\mathscr{U}$ be the convex hull of the range of $U$. Let $C: \mathscr{X} \rightarrow \mathscr{C} = \{1,2,\ldots,k\}, k \geq 1$, induce a measurable $k$-way partition $\{\mathscr{X}_1,\ldots,\mathscr{X}_k\}$ of $\mathscr{X}$. Define the impurity of $\mathscr{X}_c = C^{-1}(c)$ to be $\phi(c, E(U\mid C(X) = c))$, where $\phi: \mathscr{C} \times \mathscr{U} \rightarrow \mathscr{R}^1$ is concave in its second argument. Define the impurity $\Psi$ of the partition as the average impurity of its members: $\Psi(C) = E\phi(C(X), E(U\mid C(X)))$. We show that for any $C: \mathscr{X} \rightarrow \mathscr{C}$ there exists a mapping $\tilde{C}: \mathscr{U} \rightarrow \mathscr{C}$ such that $\Psi(\tilde{C}(Y)) \leq \Psi(C)$ and such that each $\tilde{C}^{-1}(c)$ is convex, that is, for each $i, j \in \mathscr{C}, i \neq j$, there exists a separating hyperplane between $\tilde{C}^{-1}(i)$ and $\tilde{C}^{-1}(j)$. This generalizes some results in statistics and information theory. Suitable choices of $U$ and $\phi$ lead to optimal partitions of simple form useful in the construction of classification trees and multidimensional regression trees.
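
As an illustration only, the result can be checked numerically in the simplest setting: $n = 1$, $U$ the indicator of one of two classes (so $Y = P(U = 1 \mid X)$ and $\mathscr{U} = [0, 1]$), $k = 2$, and $\phi(c, u) = 2u(1 - u)$, the Gini index, which is concave in $u$ and here does not depend on $c$. Convex cells in $\mathscr{U}$ are then intervals, so an optimal two-way partition should be attainable by thresholding $Y$. The sketch below compares a brute-force search over all assignments of a finite $\mathscr{X}$ with a search over thresholds only. The function names, the choice of Gini, and the sample probabilities are assumptions made for this example and do not come from the paper.

```python
from itertools import product


def gini(p):
    """Concave impurity of a cell whose class-1 probability is p."""
    return 2.0 * p * (1.0 - p)


def partition_impurity(assign, weights, y, k):
    """Psi(C): sum over cells of P(C = c) * phi(E(U | C = c)).

    assign[i]  -- cell label in range(k) given to the i-th value of X
    weights[i] -- P(X = x_i)
    y[i]       -- E(U | X = x_i)
    """
    total = 0.0
    for c in range(k):
        mass = sum(w for w, a in zip(weights, assign) if a == c)
        if mass == 0.0:
            continue  # empty cells contribute nothing
        mean = sum(w * v for w, v, a in zip(weights, y, assign) if a == c) / mass
        total += mass * gini(mean)
    return total


def best_any_partition(weights, y, k=2):
    """Exhaustive search over all k**len(y) assignments (small inputs only)."""
    return min(
        partition_impurity(a, weights, y, k)
        for a in product(range(k), repeat=len(y))
    )


def best_threshold_partition(weights, y):
    """Search only two-way partitions whose cells are intervals in y."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    best = float("inf")
    for cut in range(len(y) + 1):
        assign = [0] * len(y)
        for rank, i in enumerate(order):
            assign[i] = 0 if rank < cut else 1
        best = min(best, partition_impurity(assign, weights, y, 2))
    return best


if __name__ == "__main__":
    # Hypothetical distribution of X over five values and the matching Y.
    weights = [0.10, 0.20, 0.15, 0.25, 0.30]
    y = [0.90, 0.20, 0.55, 0.40, 0.70]
    print(best_any_partition(weights, y), best_threshold_partition(weights, y))
```

The threshold search inspects only $m + 1$ candidate partitions for $m$ distinct values of $X$, against $2^m$ for the exhaustive search, which is the kind of simplification that makes such partitions practical when growing classification trees.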



David Burshtein, Vincent Della Pietra, Dimitri Kanevsky, Arthur Nadas. "Minimum Impurity Partitions." Ann. Statist. 20(3): 1637-1646, September, 1992.


Published: September, 1992
First available in Project Euclid: 12 April 2007

zbMATH: 0781.62094
MathSciNet: MR1186270
Digital Object Identifier: 10.1214/aos/1176348789

Primary: 62H30
Secondary: 62C05, 62C10, 62J02, 68T05, 68T10

Rights: Copyright © 1992 Institute of Mathematical Statistics

