Electronic Journal of Statistics

Adaptive MCMC for multiple changepoint analysis with applications to large datasets

Alan Benson and Nial Friel

Full-text: Open access

Abstract

We consider the problem of Bayesian inference for changepoints where the number and position of the changepoints are both unknown. In particular, we consider product partition models where it is possible to integrate out model parameters for the regime between each changepoint, leaving a posterior distribution over a latent vector indicating the presence or not of a changepoint at each observation. The same problem setting has been considered by Fearnhead (2006), where one can use filtering recursions to make exact inference. However, the complexity of the filtering recursions algorithm is quadratic in the number of observations. Our approach relies on an adaptive Markov chain Monte Carlo (MCMC) method for finite discrete state spaces. We develop an adaptive algorithm which can learn from the past states of the Markov chain in order to build proposal distributions which can quickly discover where changepoints are likely to be located. We prove that our algorithm is ergodic with respect to the posterior distribution. Crucially, we demonstrate that our adaptive MCMC algorithm is viable for large datasets for which the filtering recursions approach is not. Moreover, we show that inference is possible in a reasonable time, thus making Bayesian changepoint detection computationally efficient.
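The setting described in the abstract can be illustrated with a small toy sketch. It is not the authors' algorithm. It assumes a Gaussian segment model with the mean integrated out, an independent Bernoulli prior on each changepoint indicator, and a simple adaptation rule that favours positions where flips have been accepted in the past. All function names (`segment_log_marginal`, `log_posterior`, `adaptive_flip_sampler`) and parameters (`sigma2`, `tau2`, `p`, `eps`) are hypothetical choices made for illustration only.

```python
import math
import random


def segment_log_marginal(ys, sigma2=1.0, tau2=10.0):
    """Log marginal likelihood of one segment with its mean integrated out.

    Assumed model (illustrative): y_i ~ N(mu, sigma2), mu ~ N(0, tau2).
    """
    n = len(ys)
    s = sum(ys)
    ss = sum(v * v for v in ys)
    # Covariance is sigma2 * I + tau2 * 1 1^T; use its closed-form
    # determinant and inverse (Sherman-Morrison).
    logdet = (n - 1) * math.log(sigma2) + math.log(sigma2 + n * tau2)
    quad = (ss - tau2 * s * s / (sigma2 + n * tau2)) / sigma2
    return -0.5 * (n * math.log(2 * math.pi) + logdet + quad)


def log_posterior(z, y, p=0.01):
    """Unnormalised log posterior of the latent indicator vector z.

    z[i] == 1 means a changepoint falls between y[i] and y[i + 1];
    each indicator gets an independent Bernoulli(p) prior (an assumption).
    """
    lp, start = 0.0, 0
    for i, zi in enumerate(z):
        lp += math.log(p if zi else 1.0 - p)
        if zi:
            lp += segment_log_marginal(y[start:i + 1])
            start = i + 1
    return lp + segment_log_marginal(y[start:])


def adaptive_flip_sampler(y, n_iter=4000, eps=0.1, seed=0):
    """Toy adaptive Metropolis sampler over changepoint indicators.

    Each step picks one position from history-dependent weights and
    proposes flipping it. Returns estimated inclusion probabilities.
    """
    rng = random.Random(seed)
    m = len(y) - 1
    z = [0] * m
    cur = log_posterior(z, y)
    tries, accepts = [0] * m, [0] * m
    weights = [1.0] * m          # start from a uniform proposal
    visits = [0] * m
    for _ in range(n_iter):
        i = rng.choices(range(m), weights=weights)[0]
        z[i] ^= 1                # propose flipping indicator i
        prop = log_posterior(z, y)
        accepted = rng.random() < math.exp(min(0.0, prop - cur))
        if accepted:
            cur = prop
        else:
            z[i] ^= 1            # undo the flip on rejection
        # Adapt: smoothed acceptance rate at i, floored at eps so that
        # every position keeps a positive selection probability.
        tries[i] += 1
        accepts[i] += accepted
        weights[i] = eps + (1.0 - eps) * (accepts[i] + 1) / (tries[i] + 1)
        for j in range(m):
            visits[j] += z[j]
    return [v / n_iter for v in visits]


# Deterministic toy data: the mean shifts from 0 to 5 after observation 99.
y = [0.5 * math.sin(i) + (5.0 if i >= 100 else 0.0) for i in range(200)]
probs = adaptive_flip_sampler(y)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, round(probs[best], 3))
```

In this sketch the selection weights depend only on the chain's history, not on the current state, so the flip proposal is symmetric within each iteration and the acceptance ratio reduces to a posterior ratio. Each full posterior evaluation costs time linear in the number of observations, which a practical implementation would replace with local updates of the two or three affected segments. No ergodicity claim is made for this toy adaptation rule.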

Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 3365-3396.

Dates
Received: March 2017
First available in Project Euclid: 9 October 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1539050490

Digital Object Identifier
doi:10.1214/18-EJS1418

Keywords
Adaptive MCMC; changepoint detection; large datasets

Rights
Creative Commons Attribution 4.0 International License.

Citation

Benson, Alan; Friel, Nial. Adaptive MCMC for multiple changepoint analysis with applications to large datasets. Electron. J. Statist. 12 (2018), no. 2, 3365–3396. doi:10.1214/18-EJS1418. https://projecteuclid.org/euclid.ejs/1539050490



References

  • [1] Barry, D. and J. A. Hartigan (1992). Product partition models for change point problems. Ann. Statist. 20(1), 260–279.
  • [2] Carpenter, J., P. Clifford, and P. Fearnhead (1999). Improved particle filter for nonlinear problems. IEE Proceedings-Radar, Sonar and Navigation 146(1), 2–7.
  • [3] Chen, J. and A. K. Gupta (2011). Parametric statistical change point analysis: with applications to genetics, medicine, and finance. Springer Science & Business Media.
  • [4] Chib, S. (1998). Estimation and comparison of multiple change-point models. Journal of Econometrics 86(2), 221–241.
  • [5] Fearnhead, P. (2006). Exact and efficient Bayesian inference for multiple changepoint problems. Statistics and Computing 16(2), 203–213.
  • [6] Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82(4), 711–732.
  • [7] Griffin, J., K. Latuszynski, and M. Steel (2014). Individual adaptation: an adaptive MCMC scheme for variable selection problems. arXiv preprint arXiv:1412.6760v2.
  • [8] Haario, H., E. Saksman, and J. Tamminen (2001). An adaptive Metropolis algorithm. Bernoulli 7(2), 223–242.
  • [9] Hocking, T. D., V. Boeva, G. Rigaill, G. Schleiermacher, I. Janoueix-Lerosey, O. Delattre, W. Richer, F. Bourdeaut, M. Suguro, M. Seto, et al. (2014). SegAnnDB: interactive web-based genomic segmentation. Bioinformatics 30(11), 1539–1546.
  • [10] Łatuszyński, K. and J. S. Rosenthal (2014). The containment condition and AdapFail algorithms. Journal of Applied Probability 51(4), 1189–1195.
  • [11] Lavielle, M. and E. Lebarbier (2001). An application of MCMC methods for the multiple change-points problem. Signal Processing 81(1), 39–53.
  • [12] Mahendran, N., Z. Wang, F. Hamze, and N. de Freitas (2012). Adaptive MCMC with Bayesian optimization. In International Conference on Artificial Intelligence and Statistics, pp. 751–760.
  • [13] Matias, Y., J. S. Vitter, and W. Ni (1993). Dynamic generation of discrete random variates. In SODA, pp. 361–370.
  • [14] Meyn, S. P. and R. L. Tweedie (2012). Markov chains and stochastic stability. Springer Science & Business Media.
  • [15] Raftery, A. E. and V. E. Akman (1986). Bayesian analysis of a Poisson process with a change-point. Biometrika 73(1), 85–89.
  • [16] Rosenthal, J. S. and G. O. Roberts (2007). Coupling and ergodicity of adaptive MCMC. Journal of Applied Probability 44, 458–475.
  • [17] Ruanaidh, J. and W. J. Fitzgerald (2012). Numerical Bayesian methods applied to signal processing. Springer Science & Business Media.
  • [18] Stephens, D. A. (1994). Bayesian retrospective multiple-changepoint identification. Journal of the Royal Statistical Society. Series C (Applied Statistics) 43(1), 159–178.
  • [19] Vose, M. D. (1999). The simple genetic algorithm: foundations and theory, Volume 12. MIT Press.
  • [20] Walker, A. J. (1974). New fast method for generating discrete random numbers with arbitrary frequency distributions. Electronics Letters 10(8), 127–128.
  • [21] Wyse, J. and N. Friel (2010). Simulation-based Bayesian analysis for multiple changepoints. arXiv preprint arXiv:1011.2932.
  • [22] Wyse, J., N. Friel, et al. (2011). Approximate simulation-free Bayesian inference for multiple changepoint models with dependence within segments. Bayesian Analysis 6(4), 501–528.
  • [23] Yellott, J. I. (1977). The relationship between Luce’s choice axiom, Thurstone’s theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology 15(2), 109–144.