Bayesian Analysis

A one-pass sequential Monte Carlo method for Bayesian analysis of massive datasets

Suhrid Balakrishnan and David Madigan


Abstract

For Bayesian analysis of massive data, Markov chain Monte Carlo (MCMC) techniques often prove infeasible due to computational resource constraints. Standard MCMC methods generally require a complete scan of the dataset for each iteration. Ridgeway and Madigan (2002) and Chopin (2002b) recently presented importance sampling algorithms that combine simulations from a posterior distribution conditioned on a small portion of the dataset with a reweighting of those simulations to condition on the remainder of the dataset. While these algorithms drastically reduce the number of data accesses compared with traditional MCMC, they still require substantially more than a single pass over the dataset. In this paper, we present "1PFS," an efficient, one-pass algorithm. The algorithm is a simple modification of the Ridgeway and Madigan (2002) particle filtering algorithm that replaces the MCMC-based "rejuvenation" step with a more efficient step based on "shrinkage" kernel smoothing. To show proof of concept and to enable a direct comparison, we demonstrate 1PFS on the same examples presented in Ridgeway and Madigan (2002), namely a mixture model for Markov chains and Bayesian logistic regression. Our results indicate that the proposed scheme delivers accurate parameter estimates while employing only a single pass through the data.
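
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' 1PFS implementation. It assumes a toy Bayesian logistic regression on synthetic data, an N(0, 3^2 I) prior, and one common form of shrinkage kernel smoothing (particles are shrunk toward the weighted mean before Gaussian kernel noise is added, in the style usually attributed to Liu and West). The function name `one_pass_filter` and the parameters `batch`, `a`, and `ess_frac` are hypothetical choices made for this illustration.

```python
import numpy as np

def one_pass_filter(X, y, n_particles=5000, batch=25, a=0.98, ess_frac=0.5, seed=0):
    """Illustrative one-pass particle filter for logistic regression.

    Every data batch is read exactly once. When the effective sample size
    (ESS) drops, particles are resampled and rejuvenated with a shrinkage
    kernel instead of an MCMC move. All defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initial particles drawn from an assumed N(0, 3^2 I) prior.
    theta = rng.normal(0.0, 3.0, size=(n_particles, d))
    logw = np.zeros(n_particles)
    for start in range(0, n, batch):
        xb = X[start:start + batch]
        yb = y[start:start + batch]
        logits = theta @ xb.T  # shape (n_particles, batch size)
        # Reweight by the Bernoulli log-likelihood of the new batch.
        logw += (yb * logits - np.logaddexp(0.0, logits)).sum(axis=1)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        if 1.0 / np.sum(w ** 2) < ess_frac * n_particles:
            # Weighted mean and covariance of the current particle cloud.
            mean = w @ theta
            diff = theta - mean
            cov = (w[:, None] * diff).T @ diff
            # Resample, shrink toward the mean, then add kernel noise.
            # The factor (1 - a^2) keeps the overall variance unchanged.
            idx = rng.choice(n_particles, size=n_particles, p=w)
            centers = a * theta[idx] + (1.0 - a) * mean
            noise = rng.multivariate_normal(
                np.zeros(d), (1.0 - a ** 2) * cov, size=n_particles)
            theta = centers + noise
            logw = np.zeros(n_particles)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return w @ theta  # weighted posterior-mean estimate

# Synthetic data with known coefficients (assumed for the illustration).
data_rng = np.random.default_rng(1)
true_theta = np.array([1.0, -2.0])
X = data_rng.normal(size=(5000, 2))
p = 1.0 / (1.0 + np.exp(-X @ true_theta))
y = (data_rng.random(5000) < p).astype(float)

estimate = one_pass_filter(X, y)
print(estimate)
```

In this sketch the shrinkage step needs only the current particles and weights, so no previously seen data have to be revisited. An MCMC rejuvenation step, by contrast, would have to re-evaluate the likelihood over all data processed so far.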

Article information

Source
Bayesian Anal., Volume 1, Number 2 (2006), 345-361.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340371066

Digital Object Identifier
doi:10.1214/06-BA112

Mathematical Reviews number (MathSciNet)
MR2221268

Zentralblatt MATH identifier
1333.62007

Keywords
Sequential Monte Carlo; One-Pass; Massive Datasets

Citation

Balakrishnan, Suhrid; Madigan, David. A one-pass sequential Monte Carlo method for Bayesian analysis of massive datasets. Bayesian Anal. 1 (2006), no. 2, 345-361. doi:10.1214/06-BA112. https://projecteuclid.org/euclid.ba/1340371066

