The Annals of Applied Statistics

Joint modeling of multiple time series via the beta process with application to motion capture segmentation

Emily B. Fox, Michael C. Hughes, Erik B. Sudderth, and Michael I. Jordan

Full-text: Open access

Abstract

We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our model discovers a latent set of dynamical behaviors shared among the sequences, and segments each time series into regions defined by a subset of these behaviors. Using a beta process prior, the size of the behavior set and the sharing pattern are both inferred from data. We develop Markov chain Monte Carlo (MCMC) methods based on the Indian buffet process representation of the predictive distribution of the beta process. Our MCMC inference algorithm efficiently adds and removes behaviors via novel split-merge moves as well as data-driven birth and death proposals, avoiding the need to consider a truncated model. We demonstrate promising results on unsupervised segmentation of human motion capture data.
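The abstract notes that the MCMC inference is built on the Indian buffet process (IBP) representation of the beta process's predictive distribution. As a rough illustration only (not the paper's actual sampler, and with hypothetical function and parameter names), the following sketch draws a binary sequence-by-behavior assignment matrix from the standard IBP "culinary metaphor" construction: each new customer takes an existing dish k with probability m_k/i and then samples Poisson(alpha/i) new dishes.

```python
import numpy as np

def sample_ibp(n_objects, alpha, rng=None):
    """Draw a binary feature matrix Z from an Indian buffet process prior.

    Row i indicates which latent features (behaviors) object i possesses.
    This is the generic IBP construction, not the paper's inference code.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = []   # m_k: number of earlier objects owning feature k
    rows = []
    for i in range(n_objects):
        row = []
        # Existing features: take feature k with probability m_k / (i + 1).
        for k, m_k in enumerate(counts):
            take = rng.random() < m_k / (i + 1)
            row.append(int(take))
            if take:
                counts[k] += 1
        # New features: Poisson(alpha / (i + 1)) previously unseen dishes.
        k_new = rng.poisson(alpha / (i + 1))
        row.extend([1] * k_new)
        counts.extend([1] * k_new)
        rows.append(row)
    # Pad rows to a common width; later features are absent for earlier objects.
    K = len(counts)
    Z = np.zeros((n_objects, K), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z
```

In the BP-HMM setting of the paper, each row of such a matrix would select the subset of shared dynamical behaviors available to one time series; the total number of columns (behaviors) is unbounded a priori and inferred from data.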

Article information

Source
Ann. Appl. Stat., Volume 8, Number 3 (2014), 1281-1313.

Dates
First available in Project Euclid: 23 October 2014

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1414091214

Digital Object Identifier
doi:10.1214/14-AOAS742

Mathematical Reviews number (MathSciNet)
MR3271333

Zentralblatt MATH identifier
1303.62048

Keywords
Bayesian nonparametrics; beta process; hidden Markov models; motion capture; multiple time series

Citation

Fox, Emily B.; Hughes, Michael C.; Sudderth, Erik B.; Jordan, Michael I. Joint modeling of multiple time series via the beta process with application to motion capture segmentation. Ann. Appl. Stat. 8 (2014), no. 3, 1281--1313. doi:10.1214/14-AOAS742. https://projecteuclid.org/euclid.aoas/1414091214

References

  • Aach, J. and Church, G. (2001). Aligning gene expression time series with time warping algorithms. Bioinformatics 17 495–508.
  • Alon, J., Sclaroff, S., Kollios, G. and Pavlovic, V. (2003). Discovering clusters in motion time-series data. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Madison, WI, USA.
  • Altman, R. M. (2007). Mixed hidden Markov models: An extension of the hidden Markov model to the longitudinal data setting. J. Amer. Statist. Assoc. 102 201–210.
  • Aoki, M. and Havenner, A. (1991). State space modeling of multiple time series. Econometric Rev. 10 1–99.
  • Barbič, J., Safonova, A., Pan, J.-Y., Faloutsos, C., Hodgins, J. K. and Pollard, N. S. (2004). Segmenting motion capture data into distinct behaviors. In Proc. of Graphics Interface. London, Ontario, Canada.
  • Beal, M., Ghahramani, Z. and Rasmussen, C. (2001). The infinite hidden Markov model. In Advances in Neural Information Processing Systems (NIPS) 14. Vancouver, Canada.
  • CMU (2009). Carnegie Mellon University graphics lab motion capture database. Available at http://mocap.cs.cmu.edu/.
  • Dahl, D. B. (2005). Sequentially-allocated merge-split sampler for conjugate and nonconjugate Dirichlet process mixture models. Technical report, Texas A&M Univ., College Station, TX.
  • Duh, K. (2005). Jointly labeling multiple sequences: A factorial HMM approach. In 43rd Annual Meeting of the Assoc. for Computational Linguistics (ACL). Ann Arbor, MI.
  • Dunson, D. B. (2009). Nonparametric Bayes local partition models for random effects. Biometrika 96 249–262.
  • Dunson, D. B. (2010). Multivariate kernel partition process mixtures. Statist. Sinica 20 1395–1422.
  • Fox, E. B., Sudderth, E. B., Jordan, M. I. and Willsky, A. S. (2009). Sharing features among dynamical systems with beta processes. In Advances in Neural Information Processing Systems (NIPS) 22. Vancouver, Canada.
  • Fox, E. B., Sudderth, E. B., Jordan, M. I. and Willsky, A. S. (2010). Bayesian nonparametric methods for learning Markov switching processes. IEEE Signal Process. Mag. 27 43–54.
  • Fox, E. B., Sudderth, E. B., Jordan, M. I. and Willsky, A. S. (2011a). Bayesian nonparametric inference of switching dynamic linear models. IEEE Trans. Signal Process. 59 1569–1585.
  • Fox, E. B., Sudderth, E. B., Jordan, M. I. and Willsky, A. S. (2011b). A sticky HDP–HMM with application to speaker diarization. Ann. Appl. Stat. 5 1020–1056.
  • Fox, E. B., Hughes, M. C., Sudderth, E. B. and Jordan, M. I. (2014). Supplement to “Joint modeling of multiple time series via the beta process with application to motion capture segmentation.” DOI:10.1214/14-AOAS742SUPP.
  • Frigessi, A., di Stefano, P., Hwang, C.-R. and Sheu, S. J. (1993). Convergence rates of the Gibbs sampler, the Metropolis algorithm and other single-site updating dynamics. J. Roy. Statist. Soc. Ser. B 55 205–219.
  • Ghahramani, Z., Griffiths, T. L. and Sollich, P. (2006). Bayesian nonparametric latent feature models. In Proc. of the Eighth Valencia International Meeting on Bayesian Statistics (Bayesian Statistics 8). Alicante, Spain.
  • Ghahramani, Z. and Jordan, M. I. (1997). Factorial hidden Markov models. Machine Learning 29 245–273.
  • Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82 711–732.
  • Griffin, J. E. and Steel, M. F. J. (2006). Order-based dependent Dirichlet processes. J. Amer. Statist. Assoc. 101 179–194.
  • Hjort, N. L. (1990). Nonparametric Bayes estimators based on beta processes in models for life history data. Ann. Statist. 18 1259–1294.
  • Hsu, E., Pulli, K. and Popović, J. (2005). Style translation for human motion. In Proc. of the 32nd International Conference on Computer Graphics and Interactive Technologies (SIGGRAPH). Los Angeles, CA.
  • Hughes, M., Fox, E. B. and Sudderth, E. B. (2012). Effective split-merge Monte Carlo methods for nonparametric models of sequential data. In Advances in Neural Information Processing Systems (NIPS) 25. Lake Tahoe, NV, USA.
  • Jain, S. and Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. J. Comput. Graph. Statist. 13 158–182.
  • Jain, S. and Neal, R. M. (2007). Splitting and merging components of a nonconjugate Dirichlet process mixture model. Bayesian Anal. 2 445–472.
  • Kingman, J. F. C. (1967). Completely random measures. Pacific J. Math. 21 59–78.
  • Kingman, J. F. C. (1993). Poisson Processes. Oxford Univ. Press, New York.
  • Lehrach, W. P. and Husmeier, D. (2009). Segmenting bacterial and viral DNA sequence alignments with a trans-dimensional phylogenetic factorial hidden Markov model. J. R. Stat. Soc. Ser. C. Appl. Stat. 58 307–327.
  • Listgarten, J., Neal, R., Roweis, S., Puckrin, R. and Cutler, S. (2006). Bayesian detection of infrequent differences in sets of time series with shared structure. In Advances in Neural Information Processing Systems (NIPS) 19. Vancouver, Canada.
  • Liu, J. S. (1996). Peskun’s theorem and a modified discrete-state Gibbs sampler. Biometrika 83 681–682.
  • MacEachern, S. N. (1999). Dependent nonparametric processes. In ASA Proc. of the Section on Bayesian Statistical Science. Amer. Statist. Assoc., Alexandria, VA.
  • Meeds, E., Ghahramani, Z., Neal, R. M. and Roweis, S. T. (2006). Modeling dyadic data with binary latent factors. In Advances in Neural Information Processing Systems (NIPS) 19. Vancouver, Canada.
  • Mørup, M., Schmidt, M. N. and Hansen, L. K. (2011). Infinite multiple membership relational modeling for complex networks. In IEEE International Workshop on Machine Learning for Signal Processing. Beijing, China.
  • Murphy, K. P. (1998). Hidden Markov model (HMM) toolbox for MATLAB. Available at http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html.
  • Murphy, K. P. (2002). Dynamic Bayesian networks: Representation, inference and learning. Ph.D. thesis, Univ. California, Berkeley.
  • Pavlović, V., Rehg, J. M. and MacCormick, J. (2000). Learning switching linear models of human motion. In Advances in Neural Information Processing Systems (NIPS) 13. Vancouver, Canada.
  • Pavlović, V., Rehg, J. M., Cham, T. J. and Murphy, K. P. (1999). A dynamic Bayesian network approach to figure tracking using learned dynamic models. In Proc. of the 7th IEEE International Conference on Computer Vision (ICCV). Kerkyra, Greece.
  • Qi, Y., Paisley, J. W. and Carin, L. (2007). Music analysis using hidden Markov mixture models. IEEE Trans. Signal Process. 55 5209–5224.
  • Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77 257–286.
  • Saria, S., Koller, D. and Penn, A. (2010). Discovering shared and individual latent structure in multiple time series. Available at arXiv:1008.2028.
  • Taylor, G. W., Hinton, G. E. and Roweis, S. T. (2006). Modeling human motion using binary latent variables. In Advances in Neural Information Processing Systems (NIPS) 19. Vancouver, Canada.
  • Teh, Y. W., Jordan, M. I., Beal, M. J. and Blei, D. M. (2006). Hierarchical Dirichlet processes. J. Amer. Statist. Assoc. 101 1566–1581.
  • Thibaux, R. and Jordan, M. I. (2007). Hierarchical beta processes and the Indian buffet process. In Proc. of the Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS). San Juan, Puerto Rico.
  • Tierney, L. (1994). Markov chains for exploring posterior distributions. Ann. Statist. 22 1701–1762.
  • Tu, Z. and Zhu, S. C. (2002). Image segmentation by data-driven Markov chain Monte Carlo. IEEE Trans. Pattern Anal. Mach. Intell. 24 657–673.
  • Van Gael, J., Teh, Y. W. and Ghahramani, Z. (2009). The infinite factorial hidden Markov model. In Advances in Neural Information Processing Systems (NIPS) 21. Vancouver, Canada.
  • Wang, C. and Blei, D. (2012). A split-merge MCMC algorithm for the hierarchical Dirichlet process. Available at arXiv:1201.1657.
  • Wang, J. M., Fleet, D. J. and Hertzmann, A. (2008). Gaussian process dynamical models for human motion. IEEE Trans. Pattern Anal. Mach. Intell. 30 283–298.
  • West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models, 2nd ed. Springer, New York.