Bayesian Analysis

Big Data Bayesian Linear Regression and Variable Selection by Normal-Inverse-Gamma Summation

Hang Qian

Full-text: Open access


We introduce the normal-inverse-gamma summation operator, which combines Bayesian regression results from different data sources and leads to a simple split-and-merge algorithm for big data regressions. The summation operator is also useful for computing the marginal likelihood and facilitates Bayesian model selection methods such as Bayesian LASSO, stochastic search variable selection, and Markov chain Monte Carlo model composition. Observations are scanned in one pass, and the sampler then iteratively combines normal-inverse-gamma distributions without reloading the data. Simulation studies demonstrate that our algorithms can efficiently handle highly correlated big data. A real-world data set on employment and wages is also analyzed.
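
The split-and-merge idea in the abstract can be illustrated with a minimal sketch. It assumes the textbook conjugate normal-inverse-gamma update for linear regression, with prior beta | sigma^2 ~ N(mu0, sigma^2 Lam0^{-1}) and sigma^2 ~ IG(a0, b0). It is not the paper's definition of the summation operator, which this page does not reproduce. The function names (`chunk_stats`, `merge_stats`, `nig_posterior`), the simulated data, and the prior values are all hypothetical choices made for illustration.

```python
from functools import reduce

import numpy as np


def chunk_stats(X, y):
    """Scan one data chunk once and keep only (X'X, X'y, y'y, n)."""
    return X.T @ X, X.T @ y, float(y @ y), X.shape[0]


def merge_stats(s, t):
    """Combine two chunk summaries by elementwise addition."""
    return tuple(a + b for a, b in zip(s, t))


def nig_posterior(stats, mu0, Lam0, a0, b0):
    """Standard conjugate update of a normal-inverse-gamma prior.

    Lam0 is the prior precision matrix of beta (scaled by 1/sigma^2).
    """
    XtX, Xty, yty, n = stats
    Lam = Lam0 + XtX
    mu = np.linalg.solve(Lam, Lam0 @ mu0 + Xty)
    a = a0 + n / 2.0
    b = b0 + 0.5 * (yty + mu0 @ Lam0 @ mu0 - mu @ Lam @ mu)
    return mu, Lam, a, b


# Simulated data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
n, k = 3000, 4
X = rng.normal(size=(n, k))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(size=n)
mu0, Lam0, a0, b0 = np.zeros(k), np.eye(k), 2.0, 1.0

# Split: summarize three chunks independently. Merge: add the summaries.
chunks = zip(np.array_split(X, 3), np.array_split(y, 3))
merged = reduce(merge_stats, (chunk_stats(Xc, yc) for Xc, yc in chunks))

post_split = nig_posterior(merged, mu0, Lam0, a0, b0)
post_full = nig_posterior(chunk_stats(X, y), mu0, Lam0, a0, b0)
```

In this sketch the merged summaries give the same posterior parameters as a single pass over the full data, up to floating-point error. Because `merged` is small (its size depends on the number of regressors, not the number of observations), a sampler that changes the prior at each iteration could call `nig_posterior` repeatedly without touching the raw data again, which is in the spirit of the one-pass claim above.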

Article information

Bayesian Anal., Volume 13, Number 4 (2018), 1011-1035.

First available in Project Euclid: 8 November 2017

Digital Object Identifier: doi:10.1214/17-BA1083

Primary: 62E15 (Exact distribution theory), 62J07 (Ridge regression; shrinkage estimators)
Secondary: 68W10 (Parallel algorithms)

Keywords: conjugate prior, hierarchical shrinkage, MapReduce

Creative Commons Attribution 4.0 International License.


Qian, Hang. Big Data Bayesian Linear Regression and Variable Selection by Normal-Inverse-Gamma Summation. Bayesian Anal. 13 (2018), no. 4, 1011-1035. doi:10.1214/17-BA1083.


Supplemental materials

  • Supplementary Material for Big Data Bayesian Linear Regression and Variable Selection by Normal-Inverse-Gamma Summation. Proofs of Propositions 1–9 and data cleaning procedures in Section 6 (in a separate document).