The Annals of Applied Probability

Nonasymptotic convergence analysis for the unadjusted Langevin algorithm

Alain Durmus and Éric Moulines

Abstract

In this paper, we study a method to sample from a target distribution $\pi$ over $\mathbb{R}^{d}$ having a positive density with respect to the Lebesgue measure, known up to a normalisation factor. This method is based on the Euler discretization of the overdamped Langevin stochastic differential equation associated with $\pi$. For both constant and decreasing step sizes in the Euler discretization, we obtain nonasymptotic bounds for the convergence to the target distribution $\pi$ in total variation distance. Particular attention is paid to the dependence on the dimension $d$, to demonstrate the applicability of this method in the high-dimensional setting. These bounds improve and extend the results of Dalalyan [J. R. Stat. Soc. Ser. B. Stat. Methodol. (2017) 79 651–676].
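The algorithm analyzed in the paper, the unadjusted Langevin algorithm (ULA), iterates the Euler discretization $X_{k+1} = X_k + \gamma \nabla \log \pi(X_k) + \sqrt{2\gamma}\, Z_{k+1}$ with $Z_{k+1} \sim \mathcal{N}(0, I_d)$. A minimal sketch follows; it is an illustration rather than the authors' implementation, and the standard Gaussian target, step size, and iteration count are assumptions chosen for demonstration.

```python
import numpy as np

def ula(grad_log_pi, x0, step, n_iters, rng=None):
    """Unadjusted Langevin algorithm: constant-step Euler discretization
    of the overdamped Langevin SDE dX_t = grad log pi(X_t) dt + sqrt(2) dB_t."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_iters, x.size))
    for k in range(n_iters):
        noise = rng.standard_normal(x.size)
        # Drift step along the gradient of the log-density, plus Gaussian noise.
        x = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Illustrative target: standard Gaussian on R^2, so grad log pi(x) = -x.
samples = ula(lambda x: -x, x0=np.zeros(2), step=0.05, n_iters=5000,
              rng=np.random.default_rng(0))
```

Because the chain is not Metropolis-adjusted, its invariant law is a biased approximation of $\pi$ (here, coordinate variance $\approx 1/(1 - \gamma/2)$ rather than $1$), which is exactly the discretization error the paper's nonasymptotic bounds quantify.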

Article information

Source
Ann. Appl. Probab., Volume 27, Number 3 (2017), 1551-1587.

Dates
Received: March 2016
Revised: August 2016
First available in Project Euclid: 19 July 2017

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1500451235

Digital Object Identifier
doi:10.1214/16-AAP1238

Mathematical Reviews number (MathSciNet)
MR3678479

Zentralblatt MATH identifier
1377.65007

Subjects
Primary: 65C05 Monte Carlo methods; 60F05 Central limit and other weak theorems; 62L10 Sequential analysis
Secondary: 65C40 Computational Markov chains; 60J05 Discrete-time Markov processes on general state spaces; 93E35 Stochastic learning and adaptive control

Keywords
Total variation distance; Langevin diffusion; Markov chain Monte Carlo; Metropolis-adjusted Langevin algorithm; rate of convergence

Citation

Durmus, Alain; Moulines, Éric. Nonasymptotic convergence analysis for the unadjusted Langevin algorithm. Ann. Appl. Probab. 27 (2017), no. 3, 1551--1587. doi:10.1214/16-AAP1238. https://projecteuclid.org/euclid.aoap/1500451235



References

  • [1] Andrieu, C., De Freitas, N., Doucet, A. and Jordan, M. I. (2003). An introduction to MCMC for machine learning. Mach. Learn. 50 5–43.
  • [2] Bakry, D., Barthe, F., Cattiaux, P. and Guillin, A. (2008). A simple proof of the Poincaré inequality for a large class of probability measures. Electron. Commun. Probab. 13 60–66.
  • [3] Bakry, D., Cattiaux, P. and Guillin, A. (2008). Rate of convergence for ergodic continuous Markov processes: Lyapunov versus Poincaré. J. Funct. Anal. 254 727–759.
  • [4] Bakry, D., Gentil, I. and Ledoux, M. (2014). Analysis and Geometry of Markov Diffusion Operators. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 348. Springer, Cham.
  • [5] Bobkov, S. G. (1999). Isoperimetric and analytic inequalities for log-concave probability measures. Ann. Probab. 27 1903–1921.
  • [6] Bolley, F., Gentil, I. and Guillin, A. (2012). Convergence to equilibrium in Wasserstein distance for Fokker–Planck equations. J. Funct. Anal. 263 2430–2457.
  • [7] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford Univ. Press, Oxford.
  • [8] Bubley, R., Dyer, M. and Jerrum, M. (1998). An elementary analysis of a procedure for sampling points in a convex body. Random Structures Algorithms 12 213–235.
  • [9] Cattiaux, P. and Guillin, A. (2009). Trends to equilibrium in total variation distance. Ann. Inst. Henri Poincaré Probab. Stat. 45 117–145.
  • [10] Chen, M. F. and Li, S. F. (1989). Coupling methods for multidimensional diffusion processes. Ann. Probab. 17 151–177.
  • [11] Cotter, S. L., Roberts, G. O., Stuart, A. M. and White, D. (2013). MCMC methods for functions: Modifying old algorithms to make them faster. Statist. Sci. 28 424–446.
  • [12] Dalalyan, A. S. (2017). Theoretical guarantees for approximate sampling from smooth and log-concave densities. J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 651–676.
  • [13] Dalalyan, A. S. and Tsybakov, A. B. (2012). Sparse regression learning by aggregation and Langevin Monte-Carlo. J. Comput. System Sci. 78 1423–1443.
  • [14] Durmus, A., Moulines, É. and Pereyra, M. Sampling from convex non continuously differentiable functions, when Moreau meets Langevin. In preparation.
  • [15] Eberle, A. (2015). Reflection couplings and contraction rates for diffusions. Probab. Theory Related Fields 1–36.
  • [16] Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York.
  • [17] Grenander, U. (1983). Tutorial in Pattern Theory. Brown Univ., Providence, RI.
  • [18] Grenander, U. and Miller, M. I. (1994). Representations of knowledge in complex systems. J. Roy. Statist. Soc. Ser. B 56 549–603.
  • [19] Holley, R. and Stroock, D. (1987). Logarithmic Sobolev inequalities and stochastic Ising models. J. Stat. Phys. 46 1159–1194.
  • [20] Ikeda, N. and Watanabe, S. (1989). Stochastic Differential Equations and Diffusion Processes. North-Holland Mathematical Library. Elsevier, Amsterdam.
  • [21] Karatzas, I. and Shreve, S. E. (1991). Brownian Motion and Stochastic Calculus. Springer, New York.
  • [22] Kullback, S. (1997). Information Theory and Statistics. Dover, Mineola, NY.
  • [23] Lamberton, D. and Pagès, G. (2002). Recursive computation of the invariant distribution of a diffusion. Bernoulli 8 367–405.
  • [24] Lamberton, D. and Pagès, G. (2003). Recursive computation of the invariant distribution of a diffusion: The case of a weakly mean reverting drift. Stoch. Dyn. 3 435–451.
  • [25] Lemaire, V. (2005). Estimation de la mesure invariante d’un processus de diffusion. Ph.D. thesis, Univ. Paris-Est.
  • [26] Lemaire, V. and Menozzi, S. (2010). On some non asymptotic bounds for the Euler scheme. Electron. J. Probab. 15 1645–1681.
  • [27] Lindvall, T. and Rogers, L. C. G. (1986). Coupling of multidimensional diffusions by reflection. Ann. Probab. 14 860–872.
  • [28] Lovász, L. and Vempala, S. (2007). The geometry of logconcave functions and sampling algorithms. Random Structures Algorithms 30 307–358.
  • [29] Mattingly, J. C., Stuart, A. M. and Higham, D. J. (2002). Ergodicity for SDEs and approximations: Locally Lipschitz vector fields and degenerate noise. Stochastic Process. Appl. 101 185–232.
  • [30] Meyn, S. and Tweedie, R. (2009). Markov Chains and Stochastic Stability, 2nd ed. Cambridge Univ. Press, New York.
  • [31] Meyn, S. P. and Tweedie, R. L. (1993). Stability of Markovian processes III: Foster-Lyapunov criteria for continuous-time processes. Adv. in Appl. Probab. 25 518–548.
  • [33] Nesterov, Y. (2004). Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Boston, MA.
  • [34] Parisi, G. (1981). Correlation functions and computer simulations. Nuclear Phys. B 180 378–384.
  • [35] Roberts, G. O. and Tweedie, R. L. (1996). Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2 341–363.
  • [36] Roberts, G. O. and Tweedie, R. L. (2000). Rates of convergence of stochastically monotone and continuous time Markov models. J. Appl. Probab. 37 359–373.
  • [37] Talay, D. and Tubaro, L. (1991). Expansion of the global error for numerical schemes solving stochastic differential equations. Stoch. Anal. Appl. 8 483–509.