Statistical Science

Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author)

Leo Breiman

Source: Statist. Sci. Volume 16, Issue 3 (2001), 199-231.

Abstract

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.

Full-text: Open access

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ss/1009213726
Digital Object Identifier: doi:10.1214/ss/1009213726
Mathematical Reviews number (MathSciNet): MR1874152

References

Amit, Y. and Geman, D. (1997). Shape quantization and recognition with randomized trees. Neural Computation 9 1545-1588.
Arena, C., Sussman, N., Chiang, K., Mazumdar, S., Macina, O. and Li, W. (2000). Bagging Structure-Activity Relationships: A simulation study for assessing misclassification rates. Presented at the Second Indo-U.S. Workshop on Mathematical Chemistry, Duluth, MI. (Available at NSussman@server.ceoh.pitt.edu).
Bickel, P., Ritov, Y. and Stoker, T. (2001). Tailor-made tests for goodness of fit for semiparametric hypotheses. Unpublished manuscript.
Breiman, L. (1998). Arcing classifiers. Discussion paper, Ann. Statist. 26 801-824.
Mathematical Reviews (MathSciNet): MR99g:62083
Zentralblatt MATH: 0934.62064
Breiman, L. (2000). Some infinity theory for tree ensembles. (Available at www.stat.berkeley.edu/technical reports).
Breiman, L. (2001). Random forests. Machine Learning J. 45 5-32.
Zentralblatt MATH: 01687841
Breiman, L. and Friedman, J. (1985). Estimating optimal transformations in multiple regression and correlation. J. Amer. Statist. Assoc. 80 580-619.
Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA.
Mathematical Reviews (MathSciNet): MR86b:62101
Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines. Cambridge Univ. Press.
Daniel, C. and Wood, F. (1971). Fitting Equations to Data. Wiley, New York.
Dempster, A. (1998). Logicist statistics I. Models and modeling. Statist. Sci. 13 3 248-276.
Diaconis, P. and Efron, B. (1983). Computer intensive methods in statistics. Scientific American 248 116-131.
Domingos, P. (1998). Occam's two razors: the sharp and the blunt. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (R. Agrawal and P. Stolorz, eds.) 37-43. AAAI Press, Menlo Park, CA.
Domingos, P. (1999). The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery 3 409-425.
Dudoit, S., Fridlyand, J. and Speed, T. (2000). Comparison of discrimination methods for the classification of tumors. (Available at www.stat.berkeley.edu/technical reports).
Freedman, D. (1987). As others see us: a case study in path analysis (with discussion). J. Ed. Statist. 12 101-223.
Freedman, D. (1991). Statistical models and shoe leather. Sociological Methodology 1991 (with discussion) 291-358.
Freedman, D. (1995). Some issues in the foundations of statistics. Foundations of Science 1 19-83.
Freedman, D. (1994). From association to causation via regression. Adv. in Appl. Math. 18 59-110.
Freund, Y. and Schapire, R. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference 148-156. Morgan Kaufmann, San Francisco.
Friedman, J. (1999). Greedy predictive approximation: a gradient boosting machine. Technical report, Dept. Statistics Stanford Univ.
Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting. Ann. Statist. 28 337-407.
Mathematical Reviews (MathSciNet): MR2002c:62050
Gifi, A. (1990). Nonlinear Multivariate Analysis. Wiley, New York.
Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Trans. Pattern Analysis and Machine Intelligence 20 832-844.
Landwehr, J., Pregibon, D. and Shoemaker, A. (1984). Graphical methods for assessing logistic regression models (with discussion). J. Amer. Statist. Assoc. 79 61-83.
McCullagh, P. and Nelder, J. (1989). Generalized Linear Models. Chapman and Hall, London.
Meisel, W. (1972). Computer-Oriented Approaches to Pattern Recognition. Academic Press, New York.
Michie, D., Spiegelhalter, D. and Taylor, C. (1994). Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York.
Mosteller, F. and Tukey, J. (1977). Data Analysis and Regression. Addison-Wesley, Reading, MA.
Mountain, D. and Hsiao, C. (1989). A combined structural and flexible functional approach for modeling energy substitution. J. Amer. Statist. Assoc. 84 76-87.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. J. Roy. Statist. Soc. B 36 111-147.
Mathematical Reviews (MathSciNet): MR50:8847
Zentralblatt MATH: 0308.62063
Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer, New York.
Mathematical Reviews (MathSciNet): MR98a:68159
Vapnik, V. (1998). Statistical Learning Theory. Wiley, New York.
Mathematical Reviews (MathSciNet): MR99h:62052
Wahba, G. (1990). Spline Models for Observational Data. SIAM, Philadelphia.
Mathematical Reviews (MathSciNet): MR91g:62028

2009 © Institute of Mathematical Statistics