Open Access
April 2013 Bayesian nonparametric analysis of reversible Markov chains
Sergio Bacallado, Stefano Favaro, Lorenzo Trippa
Ann. Statist. 41(2): 870-896 (April 2013). DOI: 10.1214/13-AOS1102
Abstract

We introduce a three-parameter random walk with reinforcement, called the $(\theta,\alpha,\beta)$ scheme, which generalizes the linearly edge reinforced random walk to uncountable spaces. The parameter $\beta$ smoothly tunes the $(\theta,\alpha,\beta)$ scheme between this edge reinforced random walk and the classical exchangeable two-parameter Hoppe urn scheme, while the parameters $\alpha$ and $\theta$ modulate how many states are typically visited. Resorting to de Finetti’s theorem for Markov chains, we use the $(\theta,\alpha,\beta)$ scheme to define a nonparametric prior for Bayesian analysis of reversible Markov chains. The prior is applied in Bayesian nonparametric inference for species sampling problems with data generated from a reversible Markov chain with an unknown transition kernel. As a real example, we analyze data from molecular dynamics simulations of protein folding.
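The abstract names the two-parameter Hoppe urn scheme as one of the two endpoints between which $\beta$ interpolates. The following is a minimal sketch of that urn alone, not of the $(\theta,\alpha,\beta)$ scheme itself, whose transition rule the abstract does not state. It assumes the standard predictive rule with $0 \le \alpha < 1$ and $\theta > -\alpha$: after $n$ draws with $k$ distinct species, a new species appears with probability $(\theta + k\alpha)/(\theta + n)$, and species $j$ with count $n_j$ reappears with probability $(n_j - \alpha)/(\theta + n)$. The function name `hoppe_urn` and its arguments are illustrative and do not come from the paper.

```python
import random


def hoppe_urn(n, theta, alpha, seed=0):
    """Draw n species labels from the two-parameter Hoppe urn.

    Illustrative sketch only: assumes the standard predictive rule
    with 0 <= alpha < 1 and theta > -alpha.
    """
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = random.Random(seed)
    counts = []  # counts[j] = number of times species j has been drawn
    labels = []  # sequence of drawn species labels
    for i in range(n):  # i = number of draws made so far
        k = len(counts)
        # The first draw is always a new species; afterwards a new species
        # appears with probability (theta + k * alpha) / (theta + i).
        p_new = 1.0 if i == 0 else (theta + k * alpha) / (theta + i)
        if rng.random() < p_new:
            counts.append(1)
            labels.append(k)
        else:
            # Existing species j is chosen with weight counts[j] - alpha.
            weights = [c - alpha for c in counts]
            j = rng.choices(range(k), weights=weights)[0]
            counts[j] += 1
            labels.append(j)
    return labels, counts


if __name__ == "__main__":
    labels, counts = hoppe_urn(1000, theta=1.0, alpha=0.5)
    print("distinct species after 1000 draws:", len(counts))
```

Under this rule, larger values of $\theta$ or $\alpha$ raise the probability of a new species at each step, which is consistent with the abstract's remark that these two parameters modulate how many states are typically visited.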

Copyright © 2013 Institute of Mathematical Statistics
Sergio Bacallado, Stefano Favaro, and Lorenzo Trippa "Bayesian nonparametric analysis of reversible Markov chains," The Annals of Statistics 41(2), 870-896, (April 2013). https://doi.org/10.1214/13-AOS1102