Abstract
We describe an approach to linear feature selection for $n$-dimensional normally distributed observation vectors which belong to one of $m$ populations. More specifically, we consider the problem of finding a rank-$k$, $k \times n$ matrix $\mathbf{B}$ which minimizes the probability of misclassification with respect to the $k$-dimensional transformed density functions when a Bayes optimal (maximum likelihood) classification scheme is used. Theoretical results are presented which, for the case $k = 1$, give rise to a numerically tractable expression for the variation in the probability of misclassification with respect to $\mathbf{B}$. The use of this expression in a computational procedure for obtaining a $\mathbf{B}$ which minimizes the probability of misclassification in the case of two populations is discussed.
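To make the quantity being minimized concrete, the following is a minimal sketch, not the paper's procedure, of how the probability of misclassification could be evaluated for a given $1 \times n$ matrix $\mathbf{B}$ (the case $k = 1$). It assumes known prior probabilities, means, and covariance matrices for each population, and it replaces the paper's analytical expression with plain trapezoidal integration of $1 - \int \max_i \pi_i f_i(y)\,dy$ over the projected densities. All function and parameter names (`project`, `normal_pdf`, `misclassification_probability`, `steps`, `width`) are hypothetical and chosen only for illustration.

```python
import math


def project(b, mean, cov):
    """Mean and variance of y = b . x when x ~ N(mean, cov)."""
    n = len(b)
    mu = sum(b[i] * mean[i] for i in range(n))
    var = sum(b[i] * cov[i][j] * b[j] for i in range(n) for j in range(n))
    return mu, var


def normal_pdf(y, mu, var):
    """Density of a univariate normal with mean mu and variance var."""
    return math.exp(-((y - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def misclassification_probability(b, priors, means, covs, steps=20001, width=10.0):
    """Bayes error of the projected one-dimensional problem.

    Assumption: the integral of max_i prior_i * f_i(y) is approximated by the
    trapezoidal rule on a grid covering `width` standard deviations around
    every projected mean. This is a numerical stand-in, not the paper's formula.
    """
    params = [project(b, m, c) for m, c in zip(means, covs)]
    lo = min(mu - width * math.sqrt(var) for mu, var in params)
    hi = max(mu + width * math.sqrt(var) for mu, var in params)
    dy = (hi - lo) / (steps - 1)
    correct = 0.0
    for i in range(steps):
        y = lo + i * dy
        # Probability mass assigned to the most likely population at y.
        best = max(p * normal_pdf(y, mu, var) for p, (mu, var) in zip(priors, params))
        weight = 0.5 if i in (0, steps - 1) else 1.0  # trapezoid end points
        correct += weight * best * dy
    return 1.0 - correct


# Illustrative data (made up): two populations in two dimensions.
priors = [0.5, 0.5]
means = [[0.0, 0.0], [2.0, 0.0]]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]

p_good = misclassification_probability([1.0, 0.0], priors, means, covs)
p_bad = misclassification_probability([0.0, 1.0], priors, means, covs)
```

In this made-up example, projecting onto the direction that separates the means gives an error of about 0.159, while projecting onto the orthogonal direction makes the two transformed densities identical and gives 0.5. A search over $\mathbf{B}$ would aim to drive this value down; note that rescaling $\mathbf{B}$ by a nonzero constant leaves it unchanged.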
Citation
L. F. Guseman Jr., B. Charles Peters Jr., Homer F. Walker. "On Minimizing the Probability of Misclassification for Linear Feature Selection." Ann. Statist. 3 (3) 661-668, May 1975. https://doi.org/10.1214/aos/1176343128
Information