Bayesian Analysis

Power-Expected-Posterior Priors for Variable Selection in Gaussian Linear Models

Dimitris Fouskakis, Ioannis Ntzoufras, and David Draper

Full-text: Open access


In the context of the expected-posterior prior (EPP) approach to Bayesian variable selection in linear models, we combine ideas from power-prior and unit-information-prior methodologies to simultaneously (a) produce a minimally-informative prior and (b) diminish the effect of training samples. The result is that in practice our power-expected-posterior (PEP) methodology is sufficiently insensitive to the size $n^{*}$ of the training sample, due to PEP’s unit-information construction, that one may take $n^{*}$ equal to the full-data sample size $n$ and dispense with training samples altogether. This promotes stability of the resulting Bayes factors, removes the arbitrariness arising from individual training-sample selections, and greatly increases computational speed, allowing many more models to be compared within a fixed CPU budget. We find that, under an independence Jeffreys (reference) baseline prior, the asymptotics of PEP Bayes factors are equivalent to those of Schwarz’s Bayesian Information Criterion (BIC), ensuring consistency of the PEP approach to model selection. Our PEP prior, due to its unit-information structure, leads to a variable-selection procedure that — in our empirical studies — (1) is systematically more parsimonious than the basic EPP with minimal training sample, while sacrificing no desirable performance characteristics to achieve this parsimony; (2) is robust to the size of the training sample, thus enjoying the advantages described above arising from the avoidance of training samples altogether; and (3) identifies maximum-a-posteriori models that achieve better out-of-sample predictive performance than that provided by standard EPPs, the $g$-prior, the hyper-$g$ prior, non-local priors, the Least Absolute Shrinkage and Selection Operator (LASSO), and Smoothly Clipped Absolute Deviation (SCAD) methods.
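The BIC equivalence stated above gives a practical route to a large-sample approximation of PEP Bayes factors. The following is a minimal illustrative sketch, not code from the paper: it simulates a Gaussian regression problem, computes Schwarz's BIC for two nested models, and uses the standard approximation $2\log BF_{01} \approx \mathrm{BIC}(M_1) - \mathrm{BIC}(M_0)$. The data, the model pair, and the helper name `bic` are all assumptions made for the example.

```python
# Sketch: BIC-based approximation to a Bayes factor for Gaussian linear
# models. Under the independence Jeffreys baseline prior, the paper shows
# PEP Bayes factors share BIC's asymptotics, so this gives a rough
# large-sample surrogate. All data and model choices below are illustrative.
import numpy as np

def bic(y, X):
    """Schwarz's BIC for the Gaussian linear model y = X b + e,
    using the MLE of the error variance."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n                       # MLE of error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + (p + 1) * np.log(n)         # +1 parameter for sigma^2

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)              # x2 is irrelevant

X0 = np.column_stack([np.ones(n), x1])               # smaller (true) model
X1 = np.column_stack([np.ones(n), x1, x2])           # larger, overfitted model

# 2 log BF_{01} ~ BIC(M1) - BIC(M0); positive values favour the smaller
# model, mirroring the parsimony of PEP in large samples.
approx_2logBF = bic(y, X1) - bic(y, X0)
print(approx_2logBF)
```

Because the models are nested and fitted by least squares, the difference is bounded above by $\log n$, and for an irrelevant added covariate it is typically close to that bound, so the smaller model is favoured.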

Article information

Bayesian Anal. Volume 10, Number 1 (2015), 75-107.

First available in Project Euclid: 28 January 2015

Keywords: Bayesian variable selection; Bayes factors; Consistency; Expected-posterior priors; Gaussian linear models; $g$-prior; Hyper-$g$ prior; LASSO; Non-local priors; Prior compatibility; Power-prior; Training samples; SCAD; Unit-information prior


Fouskakis, Dimitris; Ntzoufras, Ioannis; Draper, David. Power-Expected-Posterior Priors for Variable Selection in Gaussian Linear Models. Bayesian Anal. 10 (2015), no. 1, 75--107. doi:10.1214/14-BA887.



  • Aitkin, M. (1991). “Posterior Bayes factors.” Journal of the Royal Statistical Society B, 53: 111–142.
  • Barbieri, M. and Berger, J. (2004). “Optimal predictive model selection.” Annals of Statistics, 32: 870–897.
  • Berger, J. and Pericchi, L. (1996a). “The intrinsic Bayes factor for linear models.” In Bayesian Statistics (Volume 5), J. Bernardo, J. Berger, A. Dawid, and A. Smith, eds., 25–44. Oxford University Press.
  • — (1996b). “The intrinsic Bayes factor for model selection and prediction.” Journal of the American Statistical Association, 91: 109–122.
  • — (2004). “Training samples in objective model selection.” Annals of Statistics, 32: 841–869.
  • Breiman, L. and Friedman, J. (1985). “Estimating optimal transformations for multiple regression and correlation.” Journal of the American Statistical Association, 80: 580–598.
  • Casella, G., Girón, F., Martínez, M., and Moreno, E. (2009). “Consistency of Bayesian procedures for variable selection.” Annals of Statistics, 37: 1207–1228.
  • Casella, G. and Moreno, E. (2006). “Objective Bayesian variable selection.” Journal of the American Statistical Association, 101: 157–167.
  • Consonni, G. and Veronese, P. (2008). “Compatibility of prior specifications across linear models.” Statistical Science, 23: 332–353.
  • Fan, J. and Li, R. (2001). “Variable selection via nonconcave penalized likelihood and its oracle properties.” Journal of the American Statistical Association, 96: 1348–1360.
  • Fouskakis, D. and Ntzoufras, I. (2013a). “Computation for intrinsic variable selection in normal regression models via expected-posterior priors.” Statistics and Computing, 23: 491–499.
  • — (2013b). “Limiting behavior of the Jeffreys Power-Expected-Posterior Bayes factor in Gaussian linear models.” Technical report, Department of Mathematics, National Technical University of Athens.
  • Girón, F., Martínez, M., Moreno, E., and Torres, F. (2006). “Objective testing procedures in linear models: calibration of the $p$-values.” Scandinavian Journal of Statistics, 33: 765–784.
  • Good, I. (2004). Probability and the Weighting of Evidence. New York, USA: Haffner.
  • Ibrahim, J. and Chen, M. (2000). “Power prior distributions for regression models.” Statistical Science, 15: 46–60.
  • Iwaki, K. (1997). “Posterior expected marginal likelihood for testing hypotheses.” Journal of Economics, Asia University, 21: 105–134.
  • Johnson, V. and Rossell, D. (2010). “On the use of non-local prior densities in Bayesian hypothesis tests.” Journal of the Royal Statistical Society, Series B, 72: 143–170.
  • — (2012). “Bayesian model selection in high-dimensional settings.” Journal of the American Statistical Association, 107: 649–660.
  • Johnstone, I. M. and Titterington, M. (2009). “Statistical challenges of high-dimensional data.” Philosophical Transactions of the Royal Society A, 367: 4237–4253.
  • Kass, R. and Wasserman, L. (1995). “A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion.” Journal of the American Statistical Association, 90: 928–934.
  • Liang, F., Paulo, R., Molina, G., Clyde, M., and Berger, J. (2008). “Mixtures of $g$-priors for Bayesian variable selection.” Journal of the American Statistical Association, 103: 410–423.
  • Moreno, E. and Girón, F. (2008). “Comparison of Bayesian objective procedures for variable selection in linear regression.” Test, 17: 472–490.
  • Moreno, E., Girón, F., and Torres, F. (2003). “Intrinsic priors for hypothesis testing in normal regression models.” Revista de la Real Academia de Ciencias Exactas, Fisicas y Naturales. Serie A. Matematicas, 97: 53–61.
  • National Research Council (2005). Mathematics and 21st Century Biology, Committee on Mathematical Sciences Research for Computational Biology. The National Academies Press.
  • Nott, D. and Kohn, R. (2005). “Adaptive sampling for Bayesian variable selection.” Biometrika, 92: 747–763.
  • O’Hagan, A. (1995). “Fractional Bayes factors for model comparison.” Journal of the Royal Statistical Society B, 57: 99–138.
  • Pérez, J. (1998). “Development of Expected Posterior Prior Distribution for Model Comparisons.” Ph.D. thesis, Department of Statistics, Purdue University, USA.
  • Pérez, J. and Berger, J. (2002). “Expected-posterior prior distributions for model selection.” Biometrika, 89: 491–511.
  • Schwarz, G. (1978). “Estimating the dimension of a model.” Annals of Statistics, 6: 461–464.
  • Spiegelhalter, D., Abrams, K., and Myles, J. (2004). Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Statistics in Practice. Chichester, UK: Wiley.
  • Spiegelhalter, D. and Smith, A. (1982). “Bayes factors for linear and log-linear models with vague prior information.” Journal of the Royal Statistical Society B, 44: 377–387.
  • Tibshirani, R. (1996). “Regression shrinkage and selection via the lasso.” Journal of the Royal Statistical Society B, 58: 267–288.
  • Womack, A., León-Novelo, L., and Casella, G. (2014). “Inference from intrinsic Bayes procedures under model selection and uncertainty.” Journal of the American Statistical Association, forthcoming.
  • Zellner, A. (1976). “Bayesian and non-Bayesian analysis of the regression model with multivariate Student-$t$ error terms.” Journal of the American Statistical Association, 71: 400–405.

Supplemental materials

  • Supplementary material: Web Appendix to “Power-Expected-Posterior Priors for Variable Selection in Gaussian Linear Models”. The Appendix is available in a web supplement at