Bayesian Analysis

Regularization in Regression: Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation

Gilles Celeux, Mohammed El Anbari, Jean-Michel Marin, and Christian P. Robert
Source: Bayesian Anal. Volume 7, Number 2 (2012), 477-502.

Abstract

Using a collection of simulated and real benchmarks, we compare Bayesian and frequentist regularization approaches in a poorly informative setting, where the number of variables is almost equal to the number of observations. This comparison includes new global noninformative approaches for Bayesian variable selection built on Zellner’s $g$-priors that are similar to those of Liang et al. (2008). The interest of those calibration-free proposals is discussed. The numerical experiments we present highlight the appeal of Bayesian regularization methods when compared with non-Bayesian alternatives: they dominate frequentist methods in the sense that they provide smaller prediction errors while selecting the most relevant variables in a parsimonious way.
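
As background on the $g$-prior mentioned in the abstract, the following minimal sketch may help. It is not the authors' calibration-free proposal. It assumes the textbook fixed-$g$ form $\beta \mid \sigma^2 \sim N(0, g\,\sigma^2 (X^\top X)^{-1})$ placed on all coefficients (intercept included, for simplicity), under which the posterior mean of $\beta$ is the least-squares estimate shrunk by $g/(g+1)$. The function names, the toy data, and the choice $g = n$ are illustrative assumptions.

```python
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X: List[List[float]], y: List[float]) -> List[float]:
    """Least-squares estimate obtained from the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def g_prior_posterior_mean(X: List[List[float]], y: List[float], g: float) -> List[float]:
    """Posterior mean of beta under the zero-centred fixed-g prior (an assumption here)."""
    shrink = g / (g + 1.0)
    return [shrink * b for b in ols(X, y)]


# Hypothetical toy data: y = 1 + 2 x exactly, with an intercept column in X.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

beta_ols = ols(X, y)
beta_g = g_prior_posterior_mean(X, y, g=float(len(y)))  # g = n, an illustrative choice
```

With a fixed $g$ the amount of shrinkage must be chosen by the user; the calibration-free approaches discussed in the paper are meant to avoid that choice, and this sketch does not reproduce them.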

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ba/1339878896
Digital Object Identifier: doi:10.1214/12-BA716
Mathematical Reviews number (MathSciNet): MR2934959

References

Bartlett, M. (1957). A comment on D.V. Lindley’s statistical paradox. Biometrika, 44:533–534.
Zentralblatt MATH: 0073.35702
Berger, J., Pericchi, L., and Varshavsky, I. (1998). Bayes factors and marginal distributions in invariant situations. Sankhya A, 60:307–321.
Mathematical Reviews (MathSciNet): MR1718789
Bottolo, L. and Richardson, S. (2010). Evolutionary stochastic search for Bayesian model exploration. Bayesian Analysis, 5(3):583–618.
Mathematical Reviews (MathSciNet): MR2719668
Digital Object Identifier: doi:10.1214/10-BA523
Breiman, L. and Friedman, J.H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391):580–598.
Mathematical Reviews (MathSciNet): MR803258
Zentralblatt MATH: 0594.62044
Brown, J. and Vannucci, M. (1998). Multivariate Bayesian variable selection and prediction. Journal of the Royal Statistical Society Series B, 60(3):627–641.
Mathematical Reviews (MathSciNet): MR1626005
Zentralblatt MATH: 0909.62022
Digital Object Identifier: doi:10.1111/1467-9868.00144
Butler, R. and Wood, A. (2002). Laplace approximations for hypergeometric functions with matrix arguments. Annals of Statistics, 30:1155–1177.
Mathematical Reviews (MathSciNet): MR1926172
Zentralblatt MATH: 1029.62047
Digital Object Identifier: doi:10.1214/aos/1031689021
Project Euclid: euclid.aos/1031689021
Candes, E. and Tao, T. (2007). The Dantzig Selector: statistical estimation when $p$ is much larger than $n$. Annals of Statistics, 35(6):2313–2351.
Mathematical Reviews (MathSciNet): MR2382644
Zentralblatt MATH: 1139.62019
Digital Object Identifier: doi:10.1214/009053606000001523
Project Euclid: euclid.aos/1201012958
Casella, G. and Moreno, E. (2006). Objective Bayesian variable selection. Journal of the American Statistical Association, 101(473):157–167.
Mathematical Reviews (MathSciNet): MR2268035
Zentralblatt MATH: 1118.62313
Digital Object Identifier: doi:10.1198/016214505000000646
Celeux, G., Marin, J.-M., and Robert, C. (2006). Sélection bayésienne de variables en régression linéaire. Journal de la Société Française de Statistique, 147(1):59–79.
Mathematical Reviews (MathSciNet): MR2500591
Chipman, H. (1996). Bayesian variable selection with related predictors. Canadian Journal of Statistics, 1:17–36.
Mathematical Reviews (MathSciNet): MR1394738
Digital Object Identifier: doi:10.2307/3315687
Cui, W. and George, E. (2008). Empirical Bayes vs. fully Bayes variable selection. Journal of Statistical Planning and Inference, 138:888–900.
Mathematical Reviews (MathSciNet): MR2416869
Zentralblatt MATH: 1130.62007
Digital Object Identifier: doi:10.1016/j.jspi.2007.02.011
de Finetti, B. (1972). Probability, Induction and Statistics. John Wiley, New York.
Zentralblatt MATH: 0275.60001
DeGroot, M. (1973). Doing what comes naturally: Interpreting a tail area as a posterior probability or as a likelihood ratio. Journal of the American Statistical Association, 68:966–969.
Mathematical Reviews (MathSciNet): MR362639
Zentralblatt MATH: 0271.62049
Dupuis, J. and Robert, C. (2003). Bayesian variable selection in qualitative models by Kullback-Leibler projections. Journal of Statistical Planning and Inference, 111:77–94.
Mathematical Reviews (MathSciNet): MR1955873
Zentralblatt MATH: 1033.62066
Digital Object Identifier: doi:10.1016/S0378-3758(02)00286-0
Fernandez, C., Ley, E., and Steel, M. (2001). Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100:381–427.
Mathematical Reviews (MathSciNet): MR1820410
Zentralblatt MATH: 1091.62507
Digital Object Identifier: doi:10.1016/S0304-4076(00)00076-2
Foster, D. and George, E. (1994). The risk inflation criterion for multiple regression. Annals of Statistics, 22:1947–1975.
Mathematical Reviews (MathSciNet): MR1329177
Zentralblatt MATH: 0829.62066
Digital Object Identifier: doi:10.1214/aos/1176325766
Project Euclid: euclid.aos/1176325766
George, E. (2000). The variable selection problem. Journal of the American Statistical Association, 95:1304–1308.
Mathematical Reviews (MathSciNet): MR1825282
Zentralblatt MATH: 1018.62050
George, E. and Foster, D. (2000). Calibration and empirical Bayes variable selection. Biometrika, 87(4):731–747.
Mathematical Reviews (MathSciNet): MR1813972
Zentralblatt MATH: 1029.62008
Digital Object Identifier: doi:10.1093/biomet/87.4.731
George, E. and McCulloch, R. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association, 88:881–889.
George, E. and McCulloch, R. (1997). Approaches to Bayesian variable selection. Statistica Sinica, 7:339–373.
Guo, R. and Speckman, P. (2009). Bayes factor consistency in linear models. In The 2009 International Workshop on Objective Bayes Methodology, Philadelphia, June 5-9, 2009. URL http://www-stat.wharton.upenn.edu/statweb/Conference/OBayes09/AbstractPapers/speckman.pdf.
Hoerl, A. and Kennard, R. (1970). Ridge regression: biased estimation for non orthogonal problems. Technometrics, 12:55–67.
Kass, R. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association, 90:773–795.
Kass, R. and Wasserman, L. (1995). A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. Journal of the American Statistical Association, 90:928–934.
Mathematical Reviews (MathSciNet): MR1354008
Zentralblatt MATH: 0851.62020
Kohn, R., Smith, M., and Chan, D. (2001). Nonparametric regression using linear combinations of basis functions. Statistics and Computing, 11:313–322.
Mathematical Reviews (MathSciNet): MR1863502
Digital Object Identifier: doi:10.1023/A:1011916902934
Liang, F., Paulo, R., Molina, G., Clyde, M., and Berger, J. (2008). Mixtures of $g$-priors for Bayesian variable selection. Journal of the American Statistical Association, 103(481):410–423.
Mathematical Reviews (MathSciNet): MR2420243
Zentralblatt MATH: 05564499
Digital Object Identifier: doi:10.1198/016214507000001337
Lindley, D. (1957). A statistical paradox. Biometrika, 44:187–192.
Mathematical Reviews (MathSciNet): MR87273
Zentralblatt MATH: 0084.35806
Marin, J. and Robert, C. (2007). Bayesian Core: A Practical Approach to Computational Bayesian Statistics. Springer-Verlag, New York.
Mathematical Reviews (MathSciNet): MR2289769
Zentralblatt MATH: 1137.62013
Mitchell, T. and Beauchamp, J. (1988). Bayesian variable selection in linear regression. Journal of the American Statistical Association, 83:1023–1032.
Mathematical Reviews (MathSciNet): MR997578
Zentralblatt MATH: 0673.62051
Nott, D. J. and Green, P. J. (2004). Bayesian variable selection and the Swendsen-Wang algorithm. Journal of Computational and Graphical Statistics, 13:1–17.
Mathematical Reviews (MathSciNet): MR2044875
Digital Object Identifier: doi:10.1198/1061860042958
Park, T. and Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association, 103(473):681–686.
Mathematical Reviews (MathSciNet): MR2524001
Zentralblatt MATH: 05564521
Digital Object Identifier: doi:10.1198/016214508000000337
Penrose, K., Nelson, A., and Fisher, A. (1985). Generalized body composition prediction equation for men using simple measurement techniques. Medicine and Science in Sports and Exercise, 17(2):189.
Philips, R. and Guttman, I. (1998). A new criterion for variable selection. Statistics and Probability Letters, 38:11–19.
Mathematical Reviews (MathSciNet): MR1629488
Rao, C. (1973). Linear Statistical Inference and its Applications. John Wiley, New York.
Mathematical Reviews (MathSciNet): MR346957
Robert, C. (1993). A note on the Jeffreys-Lindley paradox. Statistica Sinica, 3:601–608.
Mathematical Reviews (MathSciNet): MR1243404
Zentralblatt MATH: 0823.62006
Robert, C. (2001). The Bayesian Choice. Springer-Verlag, 2nd edition.
Mathematical Reviews (MathSciNet): MR1835885
Schneider, U. and Corcoran, J. (2004). Perfect sampling for Bayesian variable selection in a linear regression model. Journal of Statistical Planning and Inference, 126:153–171.
Mathematical Reviews (MathSciNet): MR2090691
Zentralblatt MATH: 1072.62019
Digital Object Identifier: doi:10.1016/j.jspi.2003.09.009
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics, 75:317–343.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society Series B, 58(1):267–288.
Mathematical Reviews (MathSciNet): MR1379242
Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society Series B, 68(1):49–67.
Mathematical Reviews (MathSciNet): MR2212574
Zentralblatt MATH: 1141.62030
Digital Object Identifier: doi:10.1111/j.1467-9868.2005.00532.x
Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with $g$-prior distributions. In Bayesian inference and decision techniques: Essays in Honor of Bruno De Finetti, pages 233–243. North-Holland / Elsevier.
Mathematical Reviews (MathSciNet): MR881437
Zentralblatt MATH: 0655.62071
Zellner, A. and Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics, pages 585–603. Valencia: University Press. (Proceedings of the first Valencia meeting).
Zou, H. (2006). The adaptive Lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429.
Mathematical Reviews (MathSciNet): MR2279469
Zentralblatt MATH: 1171.62326
Digital Object Identifier: doi:10.1198/016214506000000735
Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B, 67(2):301–320.
Mathematical Reviews (MathSciNet): MR2137327
Zentralblatt MATH: 1069.62054
Digital Object Identifier: doi:10.1111/j.1467-9868.2005.00503.x
2013 © International Society for Bayesian Analysis