Statistical Science

Classifier Technology and the Illusion of Progress

David J. Hand



A great many tools have been developed for supervised classification, ranging from early methods such as linear discriminant analysis through to modern developments such as neural networks and support vector machines. A large number of comparative studies have been conducted in attempts to establish the relative superiority of these methods. This paper argues that these comparisons often fail to take into account important aspects of real problems, so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.

Article information

Statist. Sci. Volume 21, Number 1 (2006), 1-14.

First available in Project Euclid: 6 June 2006


Keywords

Supervised classification, error rate, misclassification rate, simplicity, principle of parsimony, population drift, selectivity bias, flat maximum effect, problem uncertainty, empirical comparisons


Hand, David J. Classifier Technology and the Illusion of Progress. Statist. Sci. 21 (2006), no. 1, 1--14. doi:10.1214/088342306000000060.


