Statistical Science

Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author)

Leo Breiman

Abstract

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory and questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.

Article information

Source
Statist. Sci. Volume 16, Issue 3 (2001), 199-231.

Dates
First available in Project Euclid: 24 December 2001

Permanent link to this document
http://projecteuclid.org/euclid.ss/1009213726

Digital Object Identifier
doi:10.1214/ss/1009213726

Mathematical Reviews number (MathSciNet)
MR1874152

Citation

Breiman, Leo. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). Statist. Sci. 16 (2001), no. 3, 199-231. doi:10.1214/ss/1009213726. http://projecteuclid.org/euclid.ss/1009213726.

