Bayesian Analysis

Power-Expected-Posterior Priors for Generalized Linear Models

Dimitris Fouskakis, Ioannis Ntzoufras, and Konstantinos Perrakis

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.

Full-text: Open access

Abstract

The power-expected-posterior (PEP) prior provides an objective, automatic, consistent and parsimonious model selection procedure. At the same time, it resolves the conceptual and computational problems due to the use of imaginary data. Namely, (i) it dispenses with the need to select and average across all possible minimal imaginary samples, and (ii) it diminishes the effect that the imaginary data have upon the posterior distribution. These attributes allow for large-sample approximations, when needed, in order to reduce the computational burden under more complex models. In this work we generalize the applicability of the PEP methodology, focusing on the framework of generalized linear models (GLMs), by introducing two new PEP definitions which are in effect applicable to any general model setting. Hyper-prior extensions for the power parameter that regulates the contribution of the imaginary data are introduced. We further study the validity of the predictive matching and of the model selection consistency, providing analytical proofs for the former and empirical evidence supporting the latter. For estimation of posterior model and inclusion probabilities we introduce a tuning-free Gibbs-based variable selection sampler. Several simulation scenarios and one real-life example are considered in order to evaluate the performance of the proposed methods compared to other commonly used approaches based on mixtures of g-priors. Results indicate that the GLM-PEP priors are more effective in the identification of sparse and parsimonious model formulations.

Article information

Source
Bayesian Anal. (2017), 28 pages.

Dates
First available in Project Euclid: 7 October 2017

Permanent link to this document
https://projecteuclid.org/euclid.ba/1507341641

Digital Object Identifier
doi:10.1214/17-BA1066

Keywords
expected-posterior prior; g-prior; generalized linear models; hyper-g priors; imaginary data; objective Bayesian model selection; power-prior

Rights
Creative Commons Attribution 4.0 International License.

Citation

Fouskakis, Dimitris; Ntzoufras, Ioannis; Perrakis, Konstantinos. Power-Expected-Posterior Priors for Generalized Linear Models. Bayesian Anal., advance publication, 7 October 2017. doi:10.1214/17-BA1066. https://projecteuclid.org/euclid.ba/1507341641



