Bayesian Analysis

A Loss-Based Prior for Variable Selection in Linear Regression Methods

Cristiano Villa and Jeong Eun Lee

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.

Full-text: Open access


In this work we propose a novel model prior for variable selection in linear regression. The idea is to determine the prior mass by considering the worth of each of the regression models, given the number of possible covariates under consideration. The worth of a model consists of the information loss and the loss due to model complexity. While the information loss is determined objectively, the loss expression due to model complexity is flexible, and the penalty on model size can even be customized to include some prior knowledge. Some versions of the loss-based prior are proposed and compared empirically. Through simulation studies and real data analyses, we compare the proposed prior with the Scott and Berger prior for noninformative scenarios, and with the Beta-Binomial prior for informative scenarios.
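For orientation, the two comparison priors named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes the standard formulation in which each of the p covariates is included independently with probability θ, and θ follows a Beta(a, b) distribution. With a = b = 1 this gives the multiplicity-adjusting prior usually associated with Scott and Berger (2010). The function names are hypothetical. The loss-based prior itself is not shown, because its expression is not given on this page.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_model_prior(k, p, a=1.0, b=1.0):
    """Prior probability of ONE specific model that includes k of p covariates.

    Assumes inclusion probability theta ~ Beta(a, b), integrated out:
        P(model) = B(a + k, b + p - k) / B(a, b).
    a = b = 1 is the noninformative choice (an assumption of this sketch).
    """
    if not 0 <= k <= p:
        raise ValueError("k must lie between 0 and p")
    return exp(log_beta(a + k, b + p - k) - log_beta(a, b))


def model_size_prior(k, p, a=1.0, b=1.0):
    """Total prior mass on all models of size k (there are C(p, k) of them)."""
    return comb(p, k) * beta_binomial_model_prior(k, p, a, b)


if __name__ == "__main__":
    p = 4
    for k in range(p + 1):
        print(k, beta_binomial_model_prior(k, p), model_size_prior(k, p))
```

With a = b = 1, every model size receives the same total mass 1/(p + 1), which is then split evenly among the C(p, k) models of that size. Other choices of a and b shift mass toward smaller or larger models, which is how the Beta-Binomial prior encodes prior information about model size.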

Article information

Bayesian Anal., Advance publication (2018), 26 pages.

First available in Project Euclid: 14 June 2019

Digital Object Identifier: doi:10.1214/19-BA1162

Keywords: Bayesian variable selection; linear regression; loss functions; objective priors

Creative Commons Attribution 4.0 International License.


Villa, Cristiano; Lee, Jeong Eun. A Loss-Based Prior for Variable Selection in Linear Regression Methods. Bayesian Anal., advance publication, 14 June 2019. doi:10.1214/19-BA1162.


  • Barbieri, M. and Berger, J. O. (2004). “Optimal predictive model selection.” Annals of Statistics 32, 870–897.
  • Bayarri, M. J., Berger, J. O., Forte, A. and García-Donato, G. (2012). “Criteria for Bayesian model choice with application to variable selection.” Annals of Statistics 40, 1550–1577.
  • Bogdan, M., Ghosh, J. and Tokar, S. T. (2008). “Selecting explanatory variables with the modified version of the Bayesian information criterion.” Quality and Reliability Engineering International 24, 627–641.
  • Berger, J. O. and Molina, G. (2005). “Posterior model probabilities via path-based pairwise priors.” Statistica Neerlandica 59, 3–15.
  • Berk, R. H. (1966). “Limiting behaviour of posterior distributions when the model is incorrect.” Annals of Mathematical Statistics 37, 51–58.
  • Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. New York: John Wiley & Sons.
  • Brown, P. J., Vannucci, M. and Fearn, T. (1998). “Bayesian wavelength selection in multi-component analysis.” Journal of Chemometrics 12, 173–182.
  • Calon, A., Espinet, E., Palomo-Ponce, S., Tauriello, D. V. F., Iglesias, M., Céspedes, M. V., Sevillano, M., Nadal, C., Jung, P., Zhang, X. H. F., Byrom, D., Riera, A., Rossell, D., Mangues, R., Massagué, J., Sancho, E. and Batlle, E. (2012). “Dependency of colorectal cancer on the TGF-beta-driven programme in stromal cells for metastasis initiation.” Cancer Cell 22, 571–584.
  • Carlin, B. and Louis, T. (2000). “Empirical Bayes: Past, present and future.” Journal of the American Statistical Association 95, 1286–1289.
  • Casella, G. and Moreno, E. (2006). “Objective Bayesian variable selection.” Journal of the American Statistical Association 101, 157–167.
  • Clyde, M. A. and George, E. I. (2004). “Model uncertainty.” Statistical Science 19, 81–94.
  • Cui, W. and George, E. I. (2008). “Empirical Bayes vs. fully Bayes variable selection.” Journal of Statistical Planning and Inference 138, 888–900.
  • Fernández, C., Ley, E. and Steel, M. F. J. (2001). “Benchmark priors for Bayesian model averaging.” Journal of Econometrics 100, 381–427.
  • García-Donato, G. and Martínez-Beneito, M. A. (2013). “On sampling strategies in Bayesian variable selection problems with large model spaces.” Journal of the American Statistical Association 108, 340–352.
  • Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. and Rubin, D. B. (2004). Bayesian Data Analysis. Chapman and Hall/CRC.
  • George, E. I. and Foster, D. P. (2000). “Calibration and empirical Bayes variable selection.” Biometrika 87, 731–747.
  • George, E. I. and McCulloch, R. E. (1993). “Variable selection via Gibbs sampling.” Journal of the American Statistical Association 88, 881–889.
  • Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999). “Bayesian model averaging: a tutorial.” Statistical Science 14, 382–401.
  • Jeffreys, H. (1961). Theory of Probability. London: Oxford University Press.
  • Kass, R. E. and Raftery, A. E. (1995). “Bayes factors.” Journal of the American Statistical Association 90, 773–795.
  • Kubinyi, H. (1996). “Evolutionary variable selection in regression and PLS analyses.” Journal of Chemometrics 10, 119–133.
  • Kullback, S. and Leibler, R. A. (1951). “On information and sufficiency.” Annals of Mathematical Statistics 22, 79–86.
  • Ley, E. and Steel, M. F. (2009). “On the effect of prior assumptions in Bayesian model averaging with applications to growth regression.” Journal of Applied Econometrics 24, 651–674.
  • Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J. O. (2008). “Mixtures of $g$-priors for Bayesian variable selection.” Journal of the American Statistical Association 103, 410–423.
  • Merhav, N. and Feder, M. (1998). “Universal prediction.” IEEE Transactions on Information Theory 44, 2124–2147.
  • Nott, D. J. and Kohn, R. (2005). “Adaptive sampling for Bayesian variable selection.” Biometrika 92, 747–763.
  • O’Hara, R. B. and Sillanpää, M. J. (2009). “A Review of Bayesian Variable Selection Methods: What, How and Which.” Bayesian Analysis 4, 85–118.
  • Raftery, A. E., Madigan, D. and Hoeting, J. A. (1997). “Bayesian model averaging for linear regression models.” Journal of the American Statistical Association 92, 179–191.
  • Rossell, D. and Rubio, F. J. (2017). “Tractable Bayesian variable selection: beyond normality.” arXiv:1609.01708.
  • Rossell, D. and Telesca, D. (2017). “Non-local priors for high-dimensional estimation.” Journal of the American Statistical Association 112, 254–265.
  • Shively, T. S., Kohn, R. and Wood, S. (1999). “Variable selection and function estimation in additive nonparametric regression using a data-based prior.” Journal of the American Statistical Association 94, 777–794.
  • Scott, J. G. and Berger, J. O. (2010). “Bayes and empirical-Bayes multiplicity adjustment in variable-selection problems.” Annals of Statistics 38, 2587–2619.
  • Villa, C. and Lee, J. E. (2019). “A loss-based prior for variable selection in linear regression methods. Supplementary Material.” Bayesian Analysis.
  • Villa, C. and Walker, S. G. (2015). “An objective Bayesian criterion to determine model prior probabilities.” Scandinavian Journal of Statistics 42, 947–966.
  • Woods, H., Steinour, H. and Starke, H. (1932). “Effect of Composition of Portland Cement on Heat Evolved During Hardening.” Industrial and Engineering Chemistry 24, 1207–1214.
  • Zellner, A. (1986). “On assessing prior distributions and Bayesian regression analysis with $g$-prior distributions.” In Bayesian Inference and Decision Techniques: Essays in Honour of Bruno de Finetti, Goel, P. K., Zellner, A. (eds). North-Holland: Amsterdam, 233–243.
  • Zellner, A. and Siow, A. (1980). “Posterior odds ratios for selected regression hypotheses.” In Bayesian Statistics, Bernardo, J. M., DeGroot, M. H., Lindley, D. V., Smith, A. F. M. (eds). University Press: Valencia, 585–603.

Supplemental materials

  • A loss-based prior for variable selection in linear regression methods. Supplementary Material. The Supplementary Material of “A loss-based prior for variable selection in linear regression” contains Appendices A, B and C.