Statistical Science

To Explain or to Predict?

Galit Shmueli

Full-text: Open access

Abstract

Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.
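The abstract's central claim, that high explanatory power does not imply high predictive power, can be illustrated with a toy sketch (not from the paper, and purely hypothetical): a flexible model can fit a training sample almost perfectly, measured by in-sample R², yet predict poorly on new data drawn from the same process.

```python
# Hedged illustration (an assumption, not the paper's method): a degree-9
# polynomial fit to 12 noisy points from a linear process "explains" the
# training data well but generalizes poorly to a fresh holdout sample.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # True process: linear signal plus Gaussian noise.
    x = rng.uniform(-1, 1, n)
    y = 1.5 * x + rng.normal(0, 0.5, n)
    return x, y

def r_squared(y, yhat):
    # Coefficient of determination relative to the mean of y.
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

x_train, y_train = make_data(12)    # small explanatory sample
x_test, y_test = make_data(200)     # fresh data for predictive evaluation

coeffs = np.polyfit(x_train, y_train, deg=9)   # highly flexible model
r2_in = r_squared(y_train, np.polyval(coeffs, x_train))
r2_out = r_squared(y_test, np.polyval(coeffs, x_test))

print(f"in-sample R^2:     {r2_in:.3f}")   # near 1: strong in-sample "explanation"
print(f"out-of-sample R^2: {r2_out:.3f}")  # much lower: weak prediction
```

The gap between the two R² values is the practical point: evaluating a model only on the data used to fit it conflates explanatory fit with predictive power.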

Article information

Source
Statist. Sci. Volume 25, Number 3 (2010), 289–310.

Dates
First available in Project Euclid: 4 January 2011

Permanent link to this document
http://projecteuclid.org/euclid.ss/1294167961

Digital Object Identifier
doi:10.1214/10-STS330

Mathematical Reviews number (MathSciNet)
MR2791669

Zentralblatt MATH identifier
1329.62045

Keywords
Explanatory modeling, causality, predictive modeling, predictive power, statistical strategy, data mining, scientific research

Citation

Shmueli, Galit. To Explain or to Predict? Statist. Sci. 25 (2010), no. 3, 289–310. doi:10.1214/10-STS330. http://projecteuclid.org/euclid.ss/1294167961.

