Bayesian Analysis

Bayesian Regularization via Graph Laplacian

Fei Liu, Sounak Chakraborty, Fan Li, Yan Liu, and Aurelie C. Lozano


Abstract

Regularization plays a critical role in modern statistical research, especially in high-dimensional variable selection problems. Existing Bayesian methods usually assume independence between variables a priori. In this article, we propose a novel Bayesian approach that explicitly models the dependence structure through a graph Laplacian matrix. We also generalize the graph Laplacian to allow both positively and negatively correlated variables. We then propose a prior distribution for the graph Laplacian that allows conjugacy and thereby greatly simplifies the computation. We show that the proposed Bayesian model leads to a proper posterior distribution. We draw connections between our method and several existing regularization methods, such as the Elastic Net, the Lasso, the Octagonal Shrinkage and Clustering Algorithm for Regression (OSCAR), and Ridge regression. An efficient Markov chain Monte Carlo method based on parameter augmentation is developed for posterior computation. Finally, we demonstrate the method through several simulation studies and an application to a real data set involving key performance indicators of electronics companies.
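
As a rough illustration of how a graph Laplacian can encode dependence between regression coefficients, the sketch below builds a signed Laplacian from a symmetric weight matrix and evaluates the quadratic form β'Lβ that such a matrix induces. This is a minimal sketch under stated assumptions, not the construction or prior used in the article: the definition L = D - W with D_ii = Σ_j |w_ij| is one common signed generalization, and the names `signed_laplacian`, `laplacian_penalty`, `pairwise_penalty`, and the example weights are hypothetical.

```python
import numpy as np


def signed_laplacian(W):
    """Signed graph Laplacian L = D - W, with D_ii = sum_j |W_ij|.

    Assumption: this is a common signed generalization, not necessarily
    the one defined in the article. Positive weights encourage
    beta_i ~ beta_j; negative weights encourage beta_i ~ -beta_j.
    """
    W = np.asarray(W, dtype=float)
    if W.ndim != 2 or W.shape[0] != W.shape[1] or not np.allclose(W, W.T):
        raise ValueError("W must be a square symmetric matrix")
    W = W - np.diag(np.diag(W))  # drop self-loops
    D = np.diag(np.abs(W).sum(axis=1))
    return D - W


def laplacian_penalty(beta, L):
    """Quadratic form beta' L beta."""
    beta = np.asarray(beta, dtype=float)
    return float(beta @ L @ beta)


def pairwise_penalty(beta, W):
    """Equivalent edge-wise form: sum_{i<j} |w_ij| (beta_i - sign(w_ij) beta_j)^2."""
    beta = np.asarray(beta, dtype=float)
    W = np.asarray(W, dtype=float)
    total = 0.0
    p = len(beta)
    for i in range(p):
        for j in range(i + 1, p):
            total += abs(W[i, j]) * (beta[i] - np.sign(W[i, j]) * beta[j]) ** 2
    return float(total)


# Hypothetical example: variable 0 is positively linked to variable 1
# and negatively linked to variable 2.
W_example = np.array([
    [0.0, 0.8, -0.5],
    [0.8, 0.0, 0.0],
    [-0.5, 0.0, 0.0],
])
L_example = signed_laplacian(W_example)

beta_consistent = np.array([1.0, 1.0, -1.0])   # signs agree with the graph
beta_conflicting = np.array([1.0, 1.0, 1.0])   # violates the negative edge

print(laplacian_penalty(beta_consistent, L_example))
print(laplacian_penalty(beta_conflicting, L_example))
```

In this toy example the coefficient vector whose signs match the graph incurs essentially zero penalty, while the one that violates the negative edge is penalized (about 2), which is the kind of structured shrinkage a Laplacian-based prior is meant to express.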

Article information

Source
Bayesian Anal., Volume 9, Number 2 (2014), 449-474.

Dates
First available in Project Euclid: 26 May 2014

Permanent link to this document
https://projecteuclid.org/euclid.ba/1401148316

Digital Object Identifier
doi:10.1214/14-BA860

Mathematical Reviews number (MathSciNet)
MR3217003

Zentralblatt MATH identifier
1327.62152

Keywords
Bayesian analysis; Elastic Net; Grouping; Lasso; OSCAR; Regularization; Ridge regression; Variable selection

Citation

Liu, Fei; Chakraborty, Sounak; Li, Fan; Liu, Yan; Lozano, Aurelie C. Bayesian Regularization via Graph Laplacian. Bayesian Anal. 9 (2014), no. 2, 449–474. doi:10.1214/14-BA860. https://projecteuclid.org/euclid.ba/1401148316



References

  • Bondell, H. and Reich, B. (2008). “Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR.” Biometrics, 64: 115–123.
  • Bornn, L., Gottardo, R., and Doucet, A. (2010). “Grouping Priors and the Bayesian Elastic Net.” Technical Report 254, The University of British Columbia, Department of Statistics.
  • Chakraborty, S. and Guo, R. (2011). “Bayesian Hybrid Huberized SVM and its Applications in High Dimensional Medical Data.” Computational Statistics and Data Analysis, 55(3): 1342–1356.
  • Chipman, H. (1996). “Bayesian variable selection with related predictors.” Canadian Journal of Statistics, 24: 17–36.
  • Clyde, M., Parmigiani, G., and Vidakovic, B. (1998). “Multiple shrinkage and subset selection in wavelets.” Biometrika, 85: 391–401.
  • Garcia-Donato, G. and Martinez-Beneito, M. (2013). “On sampling strategies in Bayesian variable selection problems with large model spaces.” Journal of the American Statistical Association, 108: 340–352.
  • Gelfand, A. E. and Smith, A. F. M. (1990). “Sampling-based Approaches for Calculating Marginal Densities.” Journal of the American Statistical Association, 85: 398–409.
  • Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). Bayesian Data Analysis. Chapman & Hall/CRC.
  • George, E. I. and McCulloch, R. E. (1993). “Variable Selection Via Gibbs Sampling.” Journal of the American Statistical Association, 88(423): 881–889.
  • — (1997). “Approaches for Bayesian Variable Selection.” Statistica Sinica, 7: 339–373.
  • Hans, C. (2009). “Bayesian lasso regression.” Biometrika, 96(4): 835–845.
  • — (2011). “Elastic Net Regression Modeling With the Orthant Normal Prior.” Journal of the American Statistical Association, 106: 1383–1393.
  • Hoerl, A. E. and Kennard, R. W. (1970). “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics, 12(1): 55–67.
  • Ishwaran, H. and Rao, J. S. (2005). “Spike and slab variable selection: frequentist and Bayesian strategies.” Annals of Statistics, 33(2): 730–773.
  • Kaplan, R. and Norton, D. (1992). “The Balanced Scorecard - Measures that Drive Performance.” Harvard Business Review, 71–79.
  • — (1996). The Balanced Scorecard: Translating Strategy into Action. Boston, MA: Harvard Business School Press.
  • Kim, S. and Xing, E. P. (2009). “Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network.” Public Library of Science Genetics, 5(8).
  • Kuo, L. and Mallick, B. (1998). “Variable selection for regression models.” Sankhya Series B, 60: 65–81.
  • Kyung, M., Gill, J., Ghosh, M., and Casella, G. (2010). “Penalized Regression, Standard Errors, and Bayesian Lassos.” Bayesian Analysis, 5(2): 369–412.
  • Li, C. and Li, H. (2010). “Variable Selection and Regression Analysis for Graph-structured Covariates with an Application to Genomics.” Annals of Applied Statistics, 4(3): 1498–1516.
  • Li, F. and Zhang, N. (2010). “Bayesian variable selection in structured high-dimensional covariate spaces with applications in genomics.” Journal of the American Statistical Association, 105(491): 1202–1214.
  • Li, Q. and Lin, N. (2010). “The Bayesian Elastic Net.” Bayesian Analysis, 5(1): 151–170.
  • Liang, F., Paulo, R., Molina, G., Clyde, M. A., and Berger, J. O. (2008). “Mixtures of $g$ Priors for Bayesian Variable Selection.” Journal of the American Statistical Association, 103(481): 410–423.
  • McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. Chapman & Hall / CRC.
  • Mitchell, T. J. and Beauchamp, J. J. (1988). “Bayesian variable selection in linear regression.” Journal of the American Statistical Association, 83: 1023–1036.
  • Ng, A., Jordan, M., and Weiss, Y. (2002). “On spectral clustering: analysis and an algorithm.” In Dietterich, T., Becker, S., and Ghahramani, Z. (eds.), Advances in Neural Information Processing Systems 14. MIT Press.
  • Park, T. and Casella, G. (2008). “The Bayesian Lasso.” Journal of the American Statistical Association, 103(482): 681–686.
  • Smith, M. and Kohn, R. (1996). “Nonparametric Regression Using Bayesian Variable Selection.” Journal of Econometrics, 75: 317–343.
  • Storey, J. D. and Tibshirani, R. (2003). “Statistical significance for genomewide studies.” Proceedings of the National Academy of Sciences USA, 100: 9440–9445.
  • Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society, Series B, 58(1): 267–288.
  • Tipping, M. E. (2001). “Sparse Bayesian Learning and the Relevance Vector Machine.” Journal of Machine Learning Research, 1: 211–244.
  • Tutz, G. and Ulbricht, J. (2009). “Penalized regression with correlation-based penalty.” Statistics and Computing, 19: 239–253.
  • Vannucci, M. and Stingo, F. (2011). “Bayesian Models for Variable Selection that Incorporate Biological Information (with discussion).” In Bernardo, J., Bayarri, M., Berger, J., Dawid, A., Heckerman, D., Smith, A., and West, M. (eds.), Bayesian Statistics, volume 9, 659–678. Oxford University Press.
  • von Luxburg, U. (2007). “A Tutorial on Spectral Clustering.” Statistics and Computing, 17(4): 395–416.
  • Yuan, M. and Lin, Y. (2006). “Model Selection and Estimation in Regression with Grouped Variables.” Journal of the Royal Statistical Society, Series B, 68(1): 49–67.
  • Zou, H. and Hastie, T. (2005). “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society, Series B, 67(2): 301–320.