Open Access
Big Data Bayesian Linear Regression and Variable Selection by Normal-Inverse-Gamma Summation
Hang Qian
Bayesian Anal. 13(4): 1011-1035 (December 2018). DOI: 10.1214/17-BA1083

Abstract

We introduce the normal-inverse-gamma summation operator, which combines Bayesian regression results from different data sources and leads to a simple split-and-merge algorithm for big data regressions. The summation operator is also useful for computing the marginal likelihood and facilitates Bayesian model selection methods, including Bayesian LASSO, stochastic search variable selection, and Markov chain Monte Carlo model composition. Observations are scanned in one pass and then the sampler iteratively combines normal-inverse-gamma distributions without reloading the data. Simulation studies demonstrate that our algorithms can efficiently handle highly correlated big data. A real-world data set on employment and wage is also analyzed.
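
The split-and-merge idea can be illustrated with a minimal sketch. It is not the paper's operator as defined in the article, only the textbook conjugate normal-inverse-gamma update written in terms of additive chunk summaries (X'X, X'y, y'y, n). The names `Stats`, `scan`, `solve` and `posterior`, the prior values, and the toy data are all hypothetical and chosen for illustration. Each chunk is scanned once, the summaries are added, and the posterior is computed from the merged summary without touching the raw rows again.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]


@dataclass
class Stats:
    """Additive summary of one data chunk: X'X, X'y, y'y and the row count."""
    xtx: Matrix
    xty: Vector
    yty: float
    n: int

    def __add__(self, other: "Stats") -> "Stats":
        # Merging two chunks is elementwise addition of their summaries.
        k = len(self.xty)
        return Stats(
            [[self.xtx[i][j] + other.xtx[i][j] for j in range(k)] for i in range(k)],
            [self.xty[i] + other.xty[i] for i in range(k)],
            self.yty + other.yty,
            self.n + other.n,
        )


def scan(X: Sequence[Sequence[float]], y: Sequence[float]) -> Stats:
    """One pass over a chunk, keeping only the summary statistics."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return Stats(xtx, xty, sum(t * t for t in y), len(y))


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def posterior(s: Stats, mu0: Vector, lam0: Matrix, a0: float, b0: float
              ) -> Tuple[Vector, Matrix, float, float]:
    """Standard conjugate update; lam0 is the prior precision (assumed form)."""
    k = len(mu0)
    lam_n = [[lam0[i][j] + s.xtx[i][j] for j in range(k)] for i in range(k)]
    rhs = [sum(lam0[i][j] * mu0[j] for j in range(k)) + s.xty[i] for i in range(k)]
    mu_n = solve(lam_n, rhs)
    quad0 = sum(mu0[i] * lam0[i][j] * mu0[j] for i in range(k) for j in range(k))
    quad_n = sum(mu_n[i] * rhs[i] for i in range(k))  # equals mu_n' lam_n mu_n
    a_n = a0 + s.n / 2.0
    b_n = b0 + 0.5 * (s.yty + quad0 - quad_n)
    return mu_n, lam_n, a_n, b_n


# Toy data (hypothetical): intercept plus one regressor, split into two chunks.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.1, 4.9, 7.2]
mu0, lam0, a0, b0 = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1.0, 1.0

full = posterior(scan(X, y), mu0, lam0, a0, b0)
merged = posterior(scan(X[:2], y[:2]) + scan(X[2:], y[2:]), mu0, lam0, a0, b0)
```

In this sketch `full` and `merged` agree up to floating-point rounding, which is the property a split-and-merge scheme relies on: chunks can be summarized independently, for instance on separate workers, and only the small summaries need to be combined.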

Citation

Hang Qian. "Big Data Bayesian Linear Regression and Variable Selection by Normal-Inverse-Gamma Summation." Bayesian Anal. 13 (4) 1011 - 1035, December 2018. https://doi.org/10.1214/17-BA1083

Information

Published: December 2018
First available in Project Euclid: 8 November 2017

zbMATH: 06989974
MathSciNet: MR3855361
Digital Object Identifier: 10.1214/17-BA1083

Subjects:
Primary: 62E15, 62J07
Secondary: 68W10

Keywords: conjugate prior, hierarchical shrinkage, MapReduce

Vol.13 • No. 4 • December 2018