Bayesian Analysis

Clustered Bayesian Model Averaging

Qingzhao Yu, Steven N. MacEachern, and Mario Peruggia

Full-text: Open access


It is sometimes preferable to base a statistical analysis on a combination of several models rather than on a single selected model, thereby accounting for uncertainty about the true model. Models are usually combined using constant weights that do not distinguish between different regions of the covariate space. However, a procedure that performs well in one situation may not do so in another. In this paper, we propose the concept of local Bayes factors, which are Bayes factors calculated by restricting the models to regions of the covariate space. The covariate space is split so that the relative efficiencies of the Bayesian models are approximately constant within a region and differ between regions. We then propose an algorithm for clustered Bayesian model averaging, in which local Bayes factors guide the weighting of the Bayesian models. Simulations and real data studies show that clustered Bayesian model averaging yields better predictive performance than either a single Bayesian model or standard Bayesian model averaging, in which models are combined using the same weights over the entire covariate space.
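The idea of region-specific weights can be illustrated with a minimal sketch. It is not the authors' algorithm: the split point (x = 0), the two candidate models (intercept-only and linear), the conjugate Gaussian prior with known noise variance, and the function names `log_marginal_likelihood` and `local_weights` are all assumptions made for illustration. Each model is fit separately to the data in each region, and the resulting marginal likelihoods give a local Bayes factor and, under equal prior model probabilities, local model weights.

```python
import numpy as np


def log_marginal_likelihood(X, y, sigma2=1.0, tau2=10.0):
    """Log marginal likelihood of y under y = X b + e,
    with b ~ N(0, tau2 * I) and e ~ N(0, sigma2 * I) (assumed prior)."""
    n = len(y)
    # Marginally, y ~ N(0, sigma2 * I + tau2 * X X^T).
    cov = sigma2 * np.eye(n) + tau2 * (X @ X.T)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


def local_weights(designs, y, region_ids, sigma2=1.0, tau2=10.0):
    """Posterior model weights per region, assuming equal prior
    model probabilities. `designs` holds one design matrix per model."""
    weights = {}
    for r in np.unique(region_ids):
        mask = region_ids == r
        logml = np.array([
            log_marginal_likelihood(X[mask], y[mask], sigma2, tau2)
            for X in designs
        ])
        w = np.exp(logml - logml.max())  # subtract max for stability
        weights[int(r)] = w / w.sum()
    return weights


# Simulated data: flat for x < 0, linear for x >= 0 (illustrative only).
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = np.where(x < 0, 0.0, 2.0 * x) + rng.normal(0.0, 1.0, size=200)

ones = np.ones_like(x)
designs = [
    ones[:, None],               # model 0: intercept only
    np.column_stack([ones, x]),  # model 1: intercept + slope
]
region_ids = (x >= 0).astype(int)  # assumed fixed split at x = 0

weights = local_weights(designs, y, region_ids)
for r, w in weights.items():
    print(f"region {r}: weights = {np.round(w, 3)}")
```

With a single global weight vector, the linear model would dominate everywhere. Computing the weights per region lets the simpler model retain weight where the data do not support a slope. In the paper the partition itself is chosen by the algorithm; here it is fixed in advance to keep the sketch short.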

Article information

Bayesian Anal., Volume 8, Number 4 (2013), 883-908.

First available in Project Euclid: 4 December 2013

Digital Object Identifier
doi:10.1214/13-BA859

Keywords: Bayesian Model Averaging; Clustered Bayes Factor; Local Averaging


Yu, Qingzhao; MacEachern, Steven N.; Peruggia, Mario. Clustered Bayesian Model Averaging. Bayesian Anal. 8 (2013), no. 4, 883--908. doi:10.1214/13-BA859.


