## Statistical Science

- Statist. Sci.
- Volume 21, Number 1 (2006), 1-14.

### Classifier Technology and the Illusion of Progress

**Full-text: Open access**

#### Abstract

A great many tools have been developed for supervised classification, ranging from early methods such as linear discriminant analysis through to modern developments such as neural networks and support vector machines. A large number of comparative studies have been conducted in attempts to establish the relative superiority of these methods. This paper argues that these comparisons often fail to take into account important aspects of real problems, so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.

#### Article information

**Source**

Statist. Sci., Volume 21, Number 1 (2006), 1-14.

**Dates**

First available in Project Euclid: 6 June 2006

**Permanent link to this document**

https://projecteuclid.org/euclid.ss/1149600839

**Digital Object Identifier**

doi:10.1214/088342306000000060

**Mathematical Reviews number (MathSciNet)**

MR2275965

**Zentralblatt MATH identifier**

05191849

**Keywords**

Supervised classification; error rate; misclassification rate; simplicity; principle of parsimony; population drift; selectivity bias; flat maximum effect; problem uncertainty; empirical comparisons

#### Citation

Hand, David J. Classifier Technology and the Illusion of Progress. Statist. Sci. 21 (2006), no. 1, 1-14. doi:10.1214/088342306000000060. https://projecteuclid.org/euclid.ss/1149600839

