Annals of Applied Probability

Optimal scaling of MaLa for nonlinear regression

Laird Arnault Breyer, Mauro Piccioni, and Sergio Scarlatti



We address the problem of simulating efficiently from the posterior distribution over the parameters of a particular class of nonlinear regression models, using a Langevin–Metropolis sampler. It is shown that as the number N of parameters increases, the proposal variance must scale as N^{-1/3} in order for the algorithm to converge to a diffusion. This generalizes previous results of Roberts and Rosenthal [J. R. Stat. Soc. Ser. B Stat. Methodol. 60 (1998) 255–268] for the i.i.d. case, showing the robustness of their analysis.
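To fix ideas, the Langevin–Metropolis (MALA) proposal combines a half-step of gradient drift with Gaussian noise, accepted or rejected by a Metropolis–Hastings correction. The sketch below is a minimal illustration on a toy i.i.d. product target, not the paper's nonlinear-regression posterior; the step-size constant `ell` and the standard-normal target are assumptions made for the example. The proposal variance is taken of order N^{-1/3}, the scaling discussed above.

```python
import numpy as np

def mala(grad_log_pi, log_pi, x0, n_steps, ell=1.5, rng=None):
    """MALA with proposal variance h = ell^2 * N^{-1/3} (illustrative scaling)."""
    rng = rng or np.random.default_rng(0)
    N = x0.size
    h = ell**2 * N ** (-1.0 / 3.0)          # proposal variance ~ N^{-1/3}
    x, accepted = x0.copy(), 0
    for _ in range(n_steps):
        mu = x + 0.5 * h * grad_log_pi(x)   # Langevin drift half-step
        y = mu + np.sqrt(h) * rng.standard_normal(N)
        mu_back = y + 0.5 * h * grad_log_pi(y)
        # log q(x | y) - log q(y | x) for the Gaussian proposal kernel
        log_q = (np.sum((y - mu) ** 2) - np.sum((x - mu_back) ** 2)) / (2 * h)
        # Metropolis-Hastings accept/reject step
        if np.log(rng.random()) < log_pi(y) - log_pi(x) + log_q:
            x, accepted = y, accepted + 1
    return x, accepted / n_steps

# Toy target: standard normal in N = 50 dimensions.
log_pi = lambda x: -0.5 * np.dot(x, x)
grad_log_pi = lambda x: -x
x, rate = mala(grad_log_pi, log_pi, np.zeros(50), 2000)
```

Under this scaling the acceptance probability stabilizes away from 0 and 1 as N grows, which is what permits the diffusion limit; with the cruder random-walk Metropolis proposal the variance must shrink faster, as N^{-1} (Roberts, Gelman and Gilks 1997).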

Article information

Ann. Appl. Probab., Volume 14, Number 3 (2004), 1479-1505.

First available in Project Euclid: 13 July 2004


Primary: 60F17: Functional limit theorems; invariance principles
Secondary: 60F05: Central limit and other weak theorems; 60F10: Large deviations

Keywords: Bayesian nonlinear regression; Markov chain Monte Carlo; Hastings–Metropolis; Langevin diffusion; propagation of chaos


Breyer, Laird Arnault; Piccioni, Mauro; Scarlatti, Sergio. Optimal scaling of MaLa for nonlinear regression. Ann. Appl. Probab. 14 (2004), no. 3, 1479--1505. doi:10.1214/105051604000000369.



  • Ben Arous, G. and Brunaud, M. (1990). Méthode de Laplace: étude variationnelle des fluctuations de diffusions de type “champ moyen.” Stochastics 31 79–144.
  • Breyer, L. A. and Roberts, G. O. (2000). From Metropolis to diffusions: Gibbs states and optimal scaling. Stochastic Process. Appl. 90 181–206.
  • Christensen, O. F., Roberts, G. O. and Rosenthal, J. S. (2003). Scaling limits for the transient phase of local Metropolis–Hastings algorithms. Available at
  • Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3 146–158.
  • Csiszár, I. (1984). Sanov property, generalized I-projection and a conditional limit theorem. Ann. Probab. 12 768–793.
  • Dembo, A. and Zeitouni, O. (1998). Large Deviations Techniques and Applications. Springer, New York.
  • Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes. Characterization and Convergence. Wiley, New York.
  • Gamboa, F. (1999). New Bayesian methods for ill-posed problems. Statist. Decisions 17 315–337.
  • Kusuoka, S. and Tamura, Y. (1984). Gibbs measures for mean field potentials. J. Fac. Sci. Univ. Tokyo Sect. IA Math. 31 223–245.
  • Neal, R. (1996). Bayesian Learning for Neural Networks. Springer, New York.
  • Petrov, V. V. (1995). Limit Theorems of Probability Theory. Oxford Univ. Press.
  • Piccioni, M. and Scarlatti, S. (2000). Mean field models and propagation of chaos in feedforward neural networks. In Stochastic Processes, Physics and Geometry. New Interplays I (F. Gesztesy, ed.). Amer. Math. Soc., Providence, RI.
  • Roberts, G. O., Gelman, A. and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110–120.
  • Roberts, G. O. and Rosenthal, J. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255–268.
  • Roberts, G. O. and Rosenthal, J. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statist. Sci. 16 351–367.
  • Roberts, G. O. and Tweedie, R. L. (1996). Exponential convergence of Langevin diffusions and their discrete approximations. Bernoulli 2 341–363.
  • Rogers, L. C. G. and Williams, D. (1987). Diffusions, Markov Processes, and Martingales II. Wiley, Chichester.
  • Sznitman, A. (1989). Topics in propagation of chaos. École d'Été de Probabilités de Saint-Flour XIX. Lecture Notes in Math. 1464 165–251. Springer, Berlin.