Bayesian Analysis
- Bayesian Anal.
- Volume 7, Number 2 (2012), 477-502.
Regularization in Regression: Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation
Gilles Celeux, Mohammed El Anbari, Jean-Michel Marin, and Christian P. Robert
Abstract
Using a collection of simulated and real benchmarks, we compare Bayesian and frequentist regularization approaches under a low informative constraint when the number of variables is almost equal to the number of observations. This comparison includes new global noninformative approaches for Bayesian variable selection built on Zellner’s $g$-priors that are similar to Liang et al. (2008). The interest of those calibration-free proposals is discussed. The numerical experiments we present highlight the appeal of Bayesian regularization methods when compared with non-Bayesian alternatives. They dominate frequentist methods in the sense that they provide smaller prediction errors while selecting the most relevant variables in a parsimonious way.
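To make the setting concrete, the following is a minimal sketch, not the authors' procedure, of the two families being compared when the number of variables is close to the number of observations. It assumes a zero prior mean and a fixed $g = n$ for Zellner's $g$-prior (the calibration-free proposals of the paper do not fix $g$ this way), and uses ridge regression as the frequentist stand-in. The simulated design, the coefficient values, and the function names are all hypothetical choices made for illustration.

```python
import numpy as np


def g_prior_posterior_mean(X, y, g):
    """Posterior mean of beta under Zellner's g-prior with zero prior mean.

    With beta | sigma^2 ~ N(0, g * sigma^2 * (X^T X)^{-1}), the posterior
    mean is the least-squares estimate shrunk by the factor g / (g + 1).
    """
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (g / (g + 1.0)) * beta_ols


def ridge_estimate(X, y, lam):
    """Ridge estimate (X^T X + lam * I)^{-1} X^T y, a frequentist regularizer."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Hypothetical simulated benchmark: p almost equal to n, sparse true signal.
rng = np.random.default_rng(0)
n, p = 30, 25
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, 1.5, 2.0]
y = X @ beta_true + rng.standard_normal(n)

beta_g = g_prior_posterior_mean(X, y, g=n)   # g = n is an illustrative choice
beta_ridge = ridge_estimate(X, y, lam=1.0)   # lam = 1.0 is an illustrative choice
```

Both estimators pull the least-squares solution toward zero, but in different ways: the fixed-$g$ posterior mean rescales every coefficient by the same factor, while ridge shrinks more strongly along poorly determined directions of the design. Neither performs variable selection on its own, which in the paper is handled through model choice or selection penalties such as the Lasso.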
Article information
Source
Bayesian Anal., Volume 7, Number 2 (2012), 477-502.
Dates
First available in Project Euclid: 16 June 2012
Permanent link to this document
https://projecteuclid.org/euclid.ba/1339878896
Digital Object Identifier
doi:10.1214/12-BA716
Mathematical Reviews number (MathSciNet)
MR2934959
Zentralblatt MATH identifier
1330.62284
Keywords
Model choice; regularization methods; noninformative priors; Zellner’s g-prior; calibration; Lasso; elastic net; Dantzig selector
Citation
Celeux, Gilles; El Anbari, Mohammed; Marin, Jean-Michel; Robert, Christian P. Regularization in Regression: Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation. Bayesian Anal. 7 (2012), no. 2, 477--502. doi:10.1214/12-BA716. https://projecteuclid.org/euclid.ba/1339878896
References
- Bartlett, M. (1957). A comment on D.V. Lindley’s statistical paradox. Biometrika, 44:533–534. Zentralblatt MATH: 0073.35702
- Berger, J., Pericchi, L., and Varshavsky, I. (1998). Bayes factors and marginal distributions in invariant situations. Sankhya A, 60:307–321. Mathematical Reviews (MathSciNet): MR1718789
- Bottolo, L. and Richardson, S. (2010). Evolutionary stochastic search for Bayesian model exploration. Bayesian Analysis, 5(3):583–618.
- Breiman, L. and Friedman, J.H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391):580–598.
- Brown, J. and Vannucci, M. (1998). Multivariate Bayesian variable selection and prediction. Journal of the Royal Statistical Society Series B, 60(3):627–641. Mathematical Reviews (MathSciNet): MR1626005. Zentralblatt MATH: 0909.62022. Digital Object Identifier: doi:10.1111/1467-9868.00144
- Butler, R. and Wood, A. (2002). Laplace approximations for hypergeometric functions with matrix arguments. Annals of Statistics, 30:1155–1177. Mathematical Reviews (MathSciNet): MR1926172. Zentralblatt MATH: 1029.62047. Digital Object Identifier: doi:10.1214/aos/1031689021. Project Euclid: euclid.aos/1031689021
- Candes, E. and Tao, T. (2007). The Dantzig Selector: statistical estimation when $p$ is much larger than $n$. Annals of Statistics, 35(6):2313–2351. Mathematical Reviews (MathSciNet): MR2382644. Zentralblatt MATH: 1139.62019. Digital Object Identifier: doi:10.1214/009053606000001523. Project Euclid: euclid.aos/1201012958
- Casella, G. and Moreno, E. (2006). Objective Bayesian variable selection. Journal of the American Statistical Association, 101(473):157–167. Mathematical Reviews (MathSciNet): MR2268035. Zentralblatt MATH: 1118.62313. Digital Object Identifier: doi:10.1198/016214505000000646
- Celeux, G., Marin, J.-M., and Robert, C. (2006). Sélection bayésienne de variables en régression linéaire. Journal de la Société Française de Statistique, 147(1):59–79. Mathematical Reviews (MathSciNet): MR2500591
- Chipman, H. (1996). Bayesian variable selection with related predictors. Canadian Journal of Statistics, 1:17–36.
- Cui, W. and George, E. (2008). Empirical Bayes vs. fully Bayes variable selection. Journal of Statistical Planning and Inference, 138:888–900. Mathematical Reviews (MathSciNet): MR2416869. Zentralblatt MATH: 1130.62007. Digital Object Identifier: doi:10.1016/j.jspi.2007.02.011
- de Finetti, B. (1972). Probability, Induction and Statistics. John Wiley, New York. Zentralblatt MATH: 0275.60001
- DeGroot, M. (1973). Doing what comes naturally: Interpreting a tail area as a posterior probability or as a likelihood ratio. Journal of the American Statistical Association, 68:966–969.
- Dupuis, J. and Robert, C. (2003). Bayesian variable selection in qualitative models by Kullback-Leibler projections. Journal of Statistical Planning and Inference, 111:77–94. Mathematical Reviews (MathSciNet): MR1955873. Zentralblatt MATH: 1033.62066. Digital Object Identifier: doi:10.1016/S0378-3758(02)00286-0
- Fernandez, C., Ley, E., and Steel, M. (2001). Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100:381–427. Mathematical Reviews (MathSciNet): MR1820410. Zentralblatt MATH: 1091.62507. Digital Object Identifier: doi:10.1016/S0304-4076(00)00076-2
- Foster, D. and George, E. (1994). The risk inflation criterion for multiple regression. Annals of Statistics, 22:1947–1975. Mathematical Reviews (MathSciNet): MR1329177. Zentralblatt MATH: 0829.62066. Digital Object Identifier: doi:10.1214/aos/1176325766. Project Euclid: euclid.aos/1176325766
- George, E. (2000). The variable selection problem. Journal of the American Statistical Association, 95:1304–1308.
- George, E. and Foster, D. (2000). Calibration and empirical Bayes variable selection. Biometrika, 87(4):731–747. Mathematical Reviews (MathSciNet): MR1813972. Zentralblatt MATH: 1029.62008. Digital Object Identifier: doi:10.1093/biomet/87.4.731
- George, E. and McCulloch, R. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association, 88:881–889.
- George, E. and McCulloch, R. (1997). Approaches to Bayesian variable selection. Statistica Sinica, 7:339–373.
- Guo, R. and Speckman, P. (2009). Bayes factor consistency in linear models. In The 2009 International Workshop on Objective Bayes Methodology, Philadelphia, June 5-9, 2009. URL http://www-stat.wharton.upenn.edu/statweb/Conference/OBayes09/AbstractPapers/speckman.pdf.
- Hoerl, A. and Kennard, R. (1970). Ridge regression: biased estimation for non orthogonal problems. Technometrics, 12:55–67.
- Kass, R. and Raftery, A. (1995). Bayes factor and model uncertainty. Journal of the American Statistical Association, 90:773–795.
- Kass, R. and Wasserman, L. (1995). A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. Journal of the American Statistical Association, 90:928–934.
- Kohn, R., Smith, M., and Chan, D. (2001). Nonparametric regression using linear combinations of basis functions. Statistics and Computing, 11:313–322.
- Liang, F., Paulo, R., Molina, G., Clyde, M., and Berger, J. (2008). Mixtures of $g$-priors for Bayesian variable selection. Journal of the American Statistical Association, 103(481):410–423. Mathematical Reviews (MathSciNet): MR2420243. Zentralblatt MATH: 05564499. Digital Object Identifier: doi:10.1198/016214507000001337
- Lindley, D. (1957). A statistical paradox. Biometrika, 44:187–192.
- Marin, J. and Robert, C. (2007). Bayesian Core: A Practical Approach to Computational Bayesian Statistics. Springer-Verlag, New York.
- Mitchell, T. and Beauchamp, J. (1988). Bayesian variable selection in linear regression. Journal of the American Statistical Association, 83:1023–1032.
- Nott, D. J. and Green, P. J. (2004). Bayesian variable selection and the Swendsen-Wang algorithm. Journal of Computational and Graphical Statistics, 13:1–17.
- Park, T. and Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association, 103(473):681–686. Mathematical Reviews (MathSciNet): MR2524001. Zentralblatt MATH: 05564521. Digital Object Identifier: doi:10.1198/016214508000000337
- Penrose, K., Nelson, A., and Fisher, A. (1985). Generalized body composition prediction equation for men using simple measurement techniques. Medicine and Science in Sports and Exercise, 17(2):189.
- Philips, R. and Guttman, I. (1998). A new criterion for variable selection. Statistics and Probability Letters, 38:11–19. Mathematical Reviews (MathSciNet): MR1629488
- Rao, C. (1973). Linear Statistical Inference and its Applications. John Wiley, New York. Mathematical Reviews (MathSciNet): MR346957
- Robert, C. (1993). A note on the Jeffreys-Lindley paradox. Statistica Sinica, 3:601–608.
- Robert, C. (2001). The Bayesian Choice. Springer-Verlag, 2nd edition. Mathematical Reviews (MathSciNet): MR1835885
- Schneider, U. and Corcoran, J. (2004). Perfect sampling for Bayesian variable selection in a linear regression model. Journal of Statistical Planning and Inference, 126:153–171. Mathematical Reviews (MathSciNet): MR2090691. Zentralblatt MATH: 1072.62019. Digital Object Identifier: doi:10.1016/j.jspi.2003.09.009
- Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics, 75:317–343.
- Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society Series B, 58(1):267–288. Mathematical Reviews (MathSciNet): MR1379242
- Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society Series B, 68(1):49–67. Mathematical Reviews (MathSciNet): MR2212574. Zentralblatt MATH: 1141.62030. Digital Object Identifier: doi:10.1111/j.1467-9868.2005.00532.x
- Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with $g$-prior distributions. In Bayesian inference and decision techniques: Essays in Honor of Bruno De Finetti, pages 233–243. North-Holland / Elsevier.
- Zellner, A. and Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics, pages 585–603. Valencia: University Press. (Proceedings of the first Valencia meeting).
- Zou, H. (2006). The adaptive Lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429. Mathematical Reviews (MathSciNet): MR2279469. Zentralblatt MATH: 1171.62326. Digital Object Identifier: doi:10.1198/016214506000000735
- Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B, 67(2):301–320. Mathematical Reviews (MathSciNet): MR2137327. Zentralblatt MATH: 1069.62054. Digital Object Identifier: doi:10.1111/j.1467-9868.2005.00503.x
