Bernoulli, Volume 10, Number 6 (2004), 989-1010.
Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations
Peter J. Bickel and Elizaveta Levina
Abstract
We show that the 'naive Bayes' classifier which assumes independent covariates greatly outperforms the Fisher linear discriminant rule under broad conditions when the number of variables grows faster than the number of observations, in the classical problem of discriminating between two normal populations. We also introduce a class of rules spanning the range between independence and arbitrary dependence. These rules are shown to achieve Bayes consistency for the Gaussian 'coloured noise' model and to adapt to a spectrum of convergence rates, which we conjecture to be minimax.
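As an illustration only, the independence ('naive Bayes') rule for two normal populations can be sketched as a linear rule that uses per-variable pooled variances in place of a full covariance matrix. The sketch below is not taken from the paper: the function names fit_independence_rule and classify, the equal-prior midpoint threshold, and the toy data are all assumptions made for this example. Fisher's rule would instead need the inverse of the full sample covariance matrix, which is singular once the number of variables exceeds the number of observations.

```python
def fit_independence_rule(class0, class1):
    """Fit a diagonal-covariance linear rule (hypothetical helper, not from the paper).

    class0, class1: lists of observations, each a list of p floats.
    Returns (midpoint, weights) describing the rule.
    """
    p = len(class0[0])
    n0, n1 = len(class0), len(class1)
    # Per-variable sample means of each class.
    mean0 = [sum(row[j] for row in class0) / n0 for j in range(p)]
    mean1 = [sum(row[j] for row in class1) / n1 for j in range(p)]
    # Pooled within-class variance for each variable; covariances are ignored.
    pooled_var = []
    for j in range(p):
        ss = sum((row[j] - mean0[j]) ** 2 for row in class0)
        ss += sum((row[j] - mean1[j]) ** 2 for row in class1)
        pooled_var.append(ss / (n0 + n1 - 2))
    midpoint = [(a + b) / 2 for a, b in zip(mean0, mean1)]
    weights = [(b - a) / v for a, b, v in zip(mean0, mean1, pooled_var)]
    return midpoint, weights


def classify(x, midpoint, weights):
    """Assign x to class 1 if the linear score is positive, else class 0."""
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, midpoint, weights))
    return 1 if score > 0 else 0


# Toy data (assumed for illustration): two observations per class, two variables.
X0 = [[0.0, 0.0], [1.0, 2.0]]
X1 = [[4.0, 4.0], [5.0, 6.0]]
midpoint, weights = fit_independence_rule(X0, X1)
print(classify([0.0, 0.0], midpoint, weights))  # prints 0
print(classify([5.0, 5.0], midpoint, weights))  # prints 1
```

Because only p variances are estimated rather than a full p-by-p covariance matrix, this rule stays well defined when there are many more variables than observations, provided every pooled variance is nonzero.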
Article information
Source
Bernoulli Volume 10, Number 6 (2004), 989-1010.
Dates
First available in Project Euclid: 21 January 2005
Permanent link to this document
http://projecteuclid.org/euclid.bj/1106314847
Digital Object Identifier
doi:10.3150/bj/1106314847
Mathematical Reviews number (MathSciNet)
MR2108040
Zentralblatt MATH identifier
1064.62073
Keywords
Fisher's linear discriminant; Gaussian coloured noise; minimax regret; naive Bayes
Citation
Bickel, Peter J.; Levina, Elizaveta. Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations. Bernoulli 10 (2004), no. 6, 989-1010. doi:10.3150/bj/1106314847. http://projecteuclid.org/euclid.bj/1106314847.

