Statistical Science

Model Uncertainty

Merlise Clyde and Edward I. George


Abstract

The evolution of Bayesian approaches for model uncertainty over the past decade has been remarkable. Catalyzed by advances in methods and technology for posterior computation, the scope of these methods has widened substantially. Major thrusts of these developments have included new methods for semiautomatic prior specification and posterior exploration. To illustrate key aspects of this evolution, the highlights of some of these developments are described.

Article information

Source
Statist. Sci. Volume 19, Number 1 (2004), 81-94.

Dates
First available in Project Euclid: 14 July 2004

Permanent link to this document
https://projecteuclid.org/euclid.ss/1089808274

Digital Object Identifier
doi:10.1214/088342304000000035

Mathematical Reviews number (MathSciNet)
MR2082148

Zentralblatt MATH identifier
1062.62044

Keywords
Bayes factors; classification and regression trees; model averaging; linear and nonparametric regression; objective prior distributions; reversible jump Markov chain Monte Carlo; variable selection

Citation

Clyde, Merlise; George, Edward I. Model Uncertainty. Statist. Sci. 19 (2004), no. 1, 81--94. doi:10.1214/088342304000000035. https://projecteuclid.org/euclid.ss/1089808274


