On data depth and distribution-free discriminant analysis using separating surfaces

Anil K. Ghosh; Probal Chaudhuri

doi:10.3150/bj/1110228239

Bernoulli 11(1): 1-27 (January 2005). DOI: 10.3150/bj/1110228239

Abstract

A very well-known traditional approach in discriminant analysis is to use some linear (or nonlinear) combination of measurement variables which can enhance class separability. For instance, a linear (or a quadratic) classifier finds the linear projection (or the quadratic function) of the measurement variables that will maximize the separation between the classes. These techniques are very useful in obtaining good lower-dimensional views of class separability. Fisher's discriminant analysis, which is primarily motivated by the multivariate normal distribution, uses the first- and second-order moments of the training sample to build such classifiers. These estimates, however, are highly sensitive to outliers, and they are not reliable for heavy-tailed distributions. This paper investigates two distribution-free methods for linear classification, which are based on the notions of statistical depth functions. One of these classifiers is closely related to Tukey's half-space depth, while the other is based on the concept of regression depth. Both these methods can be generalized for constructing nonlinear surfaces to discriminate among competing classes. These depth-based methods assume some finite-dimensional parametric form of the discriminating surface and use the distributional geometry of the data cloud to build the classifier. We use a few simulated and real data sets to examine the performance of these discriminant analysis tools and study their asymptotic properties under appropriate regularity conditions.
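As a rough illustration of the depth notion the abstract refers to, the following sketch approximates Tukey's half-space depth of a point within a two-dimensional sample. It is not the classifier studied in the paper. The function name `halfspace_depth_2d`, the parameter `n_directions`, and the toy data are assumptions made for this example only. The minimum is taken over a finite grid of directions, so the value returned is an upper bound on the exact depth.

```python
import math


def halfspace_depth_2d(point, sample, n_directions=360):
    """Approximate Tukey half-space depth of `point` within a 2-D `sample`.

    The depth is the smallest fraction of sample points lying in any closed
    half-plane whose boundary passes through `point`. Only a finite grid of
    directions is scanned here (an assumption of this sketch), so the result
    can overestimate the exact depth.
    """
    px, py = point
    n = len(sample)
    best = 1.0
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(theta), math.sin(theta)
        # Count sample points in the closed half-plane {z : u . (z - point) >= 0}.
        count = sum(
            1 for (x, y) in sample if ux * (x - px) + uy * (y - py) >= 0.0
        )
        best = min(best, count / n)
    return best


# Hypothetical toy data: the four corners of a square.
square = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
central = halfspace_depth_2d((0.0, 0.0), square)   # point at the centre
outlying = halfspace_depth_2d((5.0, 5.0), square)  # point far outside the cloud
```

A point at the centre of the data cloud receives a larger depth than a point far outside it. Because the depth depends only on which side of a line each observation falls, not on moments of the sample, it is a natural building block for distribution-free methods of the kind described above.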

Citation

Anil K. Ghosh, Probal Chaudhuri. "On data depth and distribution-free discriminant analysis using separating surfaces." Bernoulli 11(1): 1-27, January 2005. https://doi.org/10.3150/bj/1110228239

Information

Published: January 2005

First available in Project Euclid: 7 March 2005

zbMATH: 1059.62064

MathSciNet: MR2121452

Digital Object Identifier: 10.3150/bj/1110228239

Keywords: Bayes risk, elliptic symmetry, generalized U-statistic, half-space depth, linear discriminant analysis, location-shift models, misclassification rates, optimal Bayes classifier, quadratic discriminant analysis, regression depth, robustness, Vapnik-Chervonenkis dimension
