## Brazilian Journal of Probability and Statistics

### A brief tutorial on transformation based Markov Chain Monte Carlo and optimal scaling of the additive transformation

#### Abstract

We consider the recently introduced Transformation-based Markov Chain Monte Carlo (TMCMC) (Stat. Methodol. 16 (2014) 100–116), a methodology designed to update all the parameters simultaneously using a simple deterministic transformation of a one-dimensional random variable drawn from an arbitrary distribution on a relevant support. The additive transformation based TMCMC is similar in spirit to random walk Metropolis, except that, unlike the latter, additive TMCMC uses a single draw from a one-dimensional proposal distribution to update the high-dimensional parameter. In this paper, we first provide a brief tutorial on TMCMC, exploring its connections and contrasts with various available MCMC methods.
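The single-draw update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the transformation, so the sketch assumes each coordinate moves by +ε or −ε with equal probability, that ε is the absolute value of a normal draw, and that the target is a standard normal product density. The names `additive_tmcmc_step`, `log_std_normal` and `scale` are hypothetical.

```python
import math
import random


def additive_tmcmc_step(x, log_target, scale, rng):
    """One additive TMCMC update of the whole vector x (illustrative sketch)."""
    # Assumption: a single positive draw eps is shared by every coordinate.
    eps = abs(rng.gauss(0.0, scale))
    # Assumption: each coordinate moves by +eps or -eps with probability 1/2.
    proposal = [xi + (eps if rng.random() < 0.5 else -eps) for xi in x]
    # An additive move has unit Jacobian, so only the target ratio remains.
    log_ratio = log_target(proposal) - log_target(x)
    accept = rng.random() < math.exp(min(0.0, log_ratio))
    return (proposal, True) if accept else (x, False)


def log_std_normal(x):
    """Log-density (up to a constant) of an i.i.d. standard normal target."""
    return -0.5 * sum(xi * xi for xi in x)


rng = random.Random(0)
x = [0.0] * 10
n_iter = 2000
accepted = 0
for _ in range(n_iter):
    x, ok = additive_tmcmc_step(x, log_std_normal, scale=0.5, rng=rng)
    accepted += ok
acceptance_rate = accepted / n_iter
print("empirical acceptance rate:", round(acceptance_rate, 3))
```

Each iteration consumes one normal draw and one sign per coordinate, whereas a random walk Metropolis step would need one normal draw per coordinate. The value `scale=0.5` is arbitrary and chosen only for illustration.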

We then study the diffusion limits of additive TMCMC under various set-ups, ranging from the product structure of the target density to the case where the target is absolutely continuous with respect to a Gaussian measure; we also consider the additive TMCMC within Gibbs approach for all the above set-ups. These investigations lead to appropriate scaling of the one-dimensional proposal density. We also show that the optimal acceptance rate of additive TMCMC is 0.439 under all the aforementioned set-ups, in contrast with the well-established 0.234 acceptance rate associated with optimal random walk Metropolis algorithms under the same set-ups. Finally, we elucidate the ramifications of our results and the clear advantages of additive TMCMC over random walk Metropolis through ample simulation studies and a Bayesian analysis of a real spatial dataset involving 160 unknowns.
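To make the link between proposal scale and acceptance rate concrete, the following sketch estimates empirical acceptance rates for random walk Metropolis and additive TMCMC on the same target. It is an illustration under assumptions that are not taken from the paper: the target is an i.i.d. standard normal product density, the proposal standard deviation is ℓ/√d, additive TMCMC moves every coordinate by ±ε with equal probability, and the values of ℓ, the dimension and the chain length are arbitrary. The function name `acceptance_rate` is hypothetical.

```python
import math
import random


def acceptance_rate(kind, ell, dim=50, n_iter=3000, seed=1):
    """Empirical acceptance rate of 'rwm' or 'tmcmc' on a N(0, I) target."""
    rng = random.Random(seed)
    scale = ell / math.sqrt(dim)  # assumed scaling: ell / sqrt(dimension)
    # Start from a draw from the target so the chain begins in stationarity.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    log_p = -0.5 * sum(v * v for v in x)
    accepted = 0
    for _ in range(n_iter):
        if kind == "rwm":
            # Random walk Metropolis: one independent normal draw per coordinate.
            y = [v + rng.gauss(0.0, scale) for v in x]
        else:
            # Additive TMCMC (assumed form): one shared draw, random signs.
            eps = abs(rng.gauss(0.0, scale))
            y = [v + (eps if rng.random() < 0.5 else -eps) for v in x]
        log_q = -0.5 * sum(v * v for v in y)
        if rng.random() < math.exp(min(0.0, log_q - log_p)):
            x, log_p = y, log_q
            accepted += 1
    return accepted / n_iter


rates = {
    (kind, ell): acceptance_rate(kind, ell)
    for kind in ("rwm", "tmcmc")
    for ell in (1.0, 2.4, 4.0)
}
for key in sorted(rates):
    print(key, round(rates[key], 3))
```

Scanning ℓ over a finer grid and recording a measure of mixing, such as the mean squared jump distance, alongside the acceptance rate would be one way to explore the trade-off that the optimal scaling results formalise.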

#### Article information

Source
Braz. J. Probab. Stat., Volume 31, Number 3 (2017), 569–617.

Dates
Accepted: June 2016
First available in Project Euclid: 22 August 2017

https://projecteuclid.org/euclid.bjps/1503388830

Digital Object Identifier
doi:10.1214/16-BJPS325

Mathematical Reviews number (MathSciNet)
MR3693982

Zentralblatt MATH identifier
1378.60100

#### Citation

Dey, Kushal Kr.; Bhattacharya, Sourabh. A brief tutorial on transformation based Markov Chain Monte Carlo and optimal scaling of the additive transformation. Braz. J. Probab. Stat. 31 (2017), no. 3, 569–617. doi:10.1214/16-BJPS325. https://projecteuclid.org/euclid.bjps/1503388830

#### References

• Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. The Annals of Applied Probability 17, 1222–1244.
• Bédard, M. (2008a). Efficient sampling using Metropolis algorithms: Applications of optimal scaling results. Journal of Computational and Graphical Statistics 17, 312–332.
• Bédard, M. (2008b). Optimal acceptance rates for Metropolis algorithms: Moving beyond 0.234. Stochastic Processes and their Applications 118, 2198–2222.
• Bédard, M., Douc, R. and Moulines, E. (2012). Scaling analysis of multiple-try MCMC methods. Stochastic Processes and their Applications 122, 758–786.
• Bédard, M. and Rosenthal, J. S. (2008). Optimal scaling of Metropolis algorithms: Heading toward general target distributions. Canadian Journal of Statistics 36, 483–503.
• Bélisle, C. J. P., Romeijn, H. E. and Smith, R. L. (1993). Hit-and-run algorithms for generating multivariate distributions. Mathematics of Operations Research 18, 255–266.
• Berbee, H. C. P., Boender, C. G. E., Rinnooy Kan, A. H. G., Scheffer, C. L., Smith, R. L. and Telgen, J. (1987). Hit-and-run algorithms for the identification of nonredundant linear inequalities. Mathematical Programming 37, 184–207.
• Beskos, A., Roberts, G. O. and Stuart, A. M. (2009). Optimal scalings for local Metropolis–Hastings chains on non-product targets in high dimensions. The Annals of Applied Probability 19, 863–898.
• Beskos, A. and Stuart, A. M. (2009). MCMC methods for sampling function space. In ICIAM07: 6th International Congress on Industrial and Applied Mathematics (R. Jeltsch and G. Wanner, eds.) 337–364. Zürich: European Mathematical Society.
• Christensen, O. F. (2006). Robust Markov chain Monte Carlo methods for spatial generalized linear mixed models. Journal of Computational and Graphical Statistics 15, 1–17.
• Das, M. and Bhattacharya, S. (2016). Transdimensional transformation based Markov chain Monte Carlo. Preprint. Available at https://arxiv.org/abs/1403.5207.
• Dey, K. K. and Bhattacharya, S. (2016). Adaptive transformation based Markov chain Monte Carlo. Manuscript under preparation.
• Dey, K. K. and Bhattacharya, S. (2016a). On geometric ergodicity of additive and multiplicative transformation based Markov chain Monte Carlo in high dimensions. Brazilian Journal of Probability and Statistics. To appear. Available at https://arxiv.org/abs/1312.0915.
• Dey, K. K. and Bhattacharya, S. (2016b). Supplement to “A brief tutorial on transformation based Markov Chain Monte Carlo and optimal scaling of the additive transformation.” DOI:10.1214/16-BJPS325SUPP.
• Diggle, P. J., Tawn, J. A. and Moyeed, R. A. (1998). Model-based geostatistics (with discussion). Applied Statistics 47, 299–350.
• Dutta, S. (2012). Multiplicative random walk Metropolis–Hastings on the real line. Sankhya B 74, 315–342.
• Dutta, S. and Bhattacharya, S. (2014). Markov chain Monte Carlo based on deterministic transformations. Statistical Methodology 16, 100–116. Also available at arXiv:1306.6684. Supplement available at arXiv:1106.5850.
• Geyer, C. J. (2011). Introduction to Markov chain Monte Carlo. In Handbook of Markov Chain Monte Carlo (S. Brooks, A. Gelman, G. L. Jones and X.-L. Meng, eds.) 3–48. New York: Chapman & Hall/CRC.
• Gilks, W. R., Roberts, G. O. and George, E. I. (1994). Adaptive direction sampling. The Statistician 43, 179–189.
• Johnson, L. T. and Geyer, C. J. (2012). Variable transformation to obtain geometric ergodicity in the random-walk Metropolis algorithm. The Annals of Statistics 40, 3050–3076.
• Jourdain, B., Lelièvre, T. and Miasojedow, B. (2013). Optimal scaling for the transient phase of the random walk Metropolis algorithm: The mean-field limit. Preprint. Available at arXiv:1210.7639v2.
• Koralov, L. B. and Sinai, Y. G. (2007). Theory of Probability and Random Processes. New York: Springer.
• Kou, S. C., Xie, X. S. and Liu, J. S. (2005). Bayesian analysis of single-molecule experimental data. Applied Statistics 54, 469–506.
• Liang, F., Liu, C. and Carroll, R. (2010). Advanced Markov chain Monte Carlo methods: Learning from past samples. New York: Wiley.
• Liu, J. S., Liang, F. and Wong, W. H. (2000). The multiple-try method and local optimization in Metropolis sampling. Journal of the American Statistical Association 95, 121–134.
• Liu, J. S. and Sabatti, C. (2000). Generalized Gibbs sampler and multigrid Monte Carlo for Bayesian computation. Biometrika 87, 353–369.
• Liu, J. S. and Wu, Y. N. (1999). Parameter expansion for data augmentation. Journal of the American Statistical Association 94, 1264–1274.
• Martino, L. and Read, J. (2013). On the flexibility of the design of multiple try Metropolis schemes. Computational Statistics 28, 2797–2823.
• Mattingly, J. C., Pillai, N. S. and Stuart, A. M. (2011). Diffusion limits of the random walk Metropolis algorithm in high dimensions. The Annals of Applied Probability 22, 881–930.
• Neal, P. and Roberts, G. O. (2006). Optimal scaling for partially updating MCMC algorithms. The Annals of Applied Probability 16, 475–515.
• Da Prato, G. and Zabczyk, J. (1992). Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications 44. Cambridge: Cambridge University Press.
• Roberts, G., Gelman, A. and Gilks, W. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability 7, 110–120.
• Roberts, G. O. and Gilks, W. R. (1994). Convergence of adaptive direction sampling. Journal of Multivariate Analysis 49, 287–298.
• Roberts, G. O. and Rosenthal, J. S. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statistical Science 16, 351–367.
• Roberts, G. O. and Rosenthal, J. S. (2009). Examples of adaptive MCMC. Journal of Computational and Graphical Statistics 18, 349–367.
• Romeijn, H. E. and Smith, R. L. (1994). Simulated annealing for constrained global optimization. Journal of Global Optimization 5, 101–126.
• Skorohod, A. V. (1956). Limit theorems for stochastic processes. Theory of Probability and its Applications 1, 261–290.
• Smirnov, N. V. (1948). Tables for estimating the goodness of fit of empirical distributions. Annals of Mathematical Statistics 19, 279–281.
• Smith, R. L. (1996). The hit-and-run sampler: A globally reaching Markov sampler for generating arbitrary multivariate distributions. In Proceedings of the 1996 Winter Simulation Conference (J. M. Charnes, D. J. Morrice, D. T. Brunner and J. J. Swain, eds.), 260–264.
• Storvik, G. (2011). On the flexibility of Metropolis–Hastings acceptance probabilities in auxiliary variable proposal generation. Scandinavian Journal of Statistics 38, 342–358.

#### Supplemental materials

• Supplement to “A brief tutorial on transformation based Markov Chain Monte Carlo and optimal scaling of the additive transformation”. Additional details are provided in this supplementary material, whose sections and figures carry the prefix “S-” when referred to in this article. Briefly, in Section S-1, we provide details on the computational efficiency of TMCMC. Specifically, we demonstrate with an experiment the superior computational speed of additive TMCMC in comparison with random walk Metropolis (RWM), particularly in high dimensions. In Section S-2 we discuss, with appropriate experiments, the necessity of optimal scaling in additive TMCMC, while in Sections S-3 and S-4 we examine the robustness issues associated with the scale choices of additive TMCMC and RWM. In Section S-5, we include brief discussions of adaptive versions of RWM and TMCMC. The proofs of all our technical results are provided in Sections S-6 and S-7 of the supplement.