Statistical Science

Models as Approximations—Rejoinder

Andreas Buja, Arun Kumar Kuchibhotla, Richard Berk, Edward George, Eric Tchetgen Tchetgen, and Linda Zhao

Abstract

We respond to the discussants of our articles, emphasizing the importance of inference under misspecification in the context of the reproducibility/replicability crisis. Along the way, we discuss the roles of diagnostics and model building in regression, as well as connections between our well-specification framework and semiparametric theory.

Article information

Statist. Sci., Volume 34, Number 4 (2019), 606–620.

First available in Project Euclid: 8 January 2020

Keywords: well-specification; reproducibility/replicability; proper scoring rules; causal inference; semiparametrics; diagnostics


Buja, Andreas; Kuchibhotla, Arun Kumar; Berk, Richard; George, Edward; Tchetgen Tchetgen, Eric; Zhao, Linda. Models as Approximations—Rejoinder. Statist. Sci. 34 (2019), no. 4, 606--620. doi:10.1214/19-STS762.


See also

  • Main article: Models as Approximations I: Consequences Illustrated with Linear Regression.
  • Main article: Models as Approximations II: A Model-Free Theory of Parametric Regression.