The Annals of Statistics

Convergence of the Monte Carlo expectation maximization for curved exponential families

Gersende Fort and Eric Moulines



The Monte Carlo expectation maximization (MCEM) algorithm is a versatile tool for inference in incomplete data models, especially when used in combination with Markov chain Monte Carlo simulation methods. In this contribution, the almost-sure convergence of the MCEM algorithm is established. It is shown, using uniform versions of ergodic theorems for Markov chains, that MCEM converges under weak conditions on the simulation kernel. Practical illustrations are presented, using a hybrid random walk Metropolis-Hastings sampler and an independence sampler. The rate of convergence is studied, showing the impact of the simulation schedule on the fluctuations of the parameter estimate at convergence. A novel averaging procedure is then proposed to reduce the simulation variance and increase the rate of convergence.
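To fix ideas, here is a minimal illustrative sketch of an MCEM iteration on a toy problem of my own choosing (right-censored exponential lifetimes), not the models or MCMC samplers analyzed in the paper. The missing data are drawn exactly rather than by a Markov chain, and the number of simulations per iteration follows an increasing schedule, echoing the role of the simulation schedule discussed in the abstract.

```python
import random

def mcem_censored_exponential(obs, cens, n_iter=30, m0=10, seed=0):
    """Toy MCEM for exponential lifetimes with rate theta.

    obs:  fully observed lifetimes.
    cens: censoring times of right-censored units.
    The complete-data MLE is n / (sum of all lifetimes), so the
    M-step is in closed form once the missing lifetimes are imputed.
    """
    rng = random.Random(seed)
    theta = 1.0  # initial guess
    n = len(obs) + len(cens)
    for k in range(n_iter):
        m = m0 * (k + 1)  # increasing simulation schedule
        total = 0.0
        for _ in range(m):
            s = sum(obs)
            # Monte Carlo E-step: by the memoryless property,
            # a lifetime conditioned on exceeding c is c + Exp(theta).
            for c in cens:
                s += c + rng.expovariate(theta)
            total += s
        # M-step: complete-data MLE at the averaged simulated sum.
        theta = n / (total / m)
    return theta
```

With `obs = [1.0, 2.0, 0.5]` and `cens = [1.5, 2.5]`, the iterates fluctuate around the censored-data MLE (number of events divided by total exposure time, here 3/7.5 = 0.4), with fluctuations shrinking as the schedule grows; averaging the late iterates, in the spirit of the averaging procedure proposed in the paper, would reduce this simulation variance further.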

Article information

Ann. Statist. Volume 31, Number 4 (2003), 1220-1259.

First available: 31 July 2003


Primary: 65C05: Monte Carlo methods 62-04: Explicit machine computation and programs (not the theory of computation or programming)
Secondary: 60J10: Markov chains (discrete-time Markov processes on discrete state spaces)

Keywords: EM algorithm; Monte Carlo EM algorithm; Metropolis-Hastings algorithms; averaging procedure


Fort, Gersende; Moulines, Eric. Convergence of the Monte Carlo expectation maximization for curved exponential families. The Annals of Statistics 31 (2003), no. 4, 1220--1259. doi:10.1214/aos/1059655912.

References


  • BISCARAT, J. (1994). Almost sure convergence of a class of stochastic algorithms. Stochastic Process. Appl. 50 83-94.
  • BOOTH, J. and HOBERT, J. (1999). Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 265-285.
  • BRANDIÈRE, O. (1998). The dynamic system method and the traps. Adv. in Appl. Probab. 30 137-151.
  • BRÖCKER, T. (1975). Differentiable Germs and Catastrophes. Cambridge Univ. Press.
  • CAPPÉ, O., DOUCET, A., LAVIELLE, M. and MOULINES, E. (1999). Simulation-based methods for blind maximum-likelihood filter identification. Signal Processing 73 3-25.
  • CELEUX, G. and DIEBOLT, J. (1992). A stochastic approximation type EM algorithm for the mixture problem. Stochastics Stochastics Rep. 41 119-134.
  • CHAN, J. and KUK, A. (1997). Maximum likelihood estimation for probit-linear mixed models with correlated random effects. Biometrics 53 86-97.
  • CHAN, K. and LEDOLTER, J. (1995). Monte Carlo EM estimation for time series models involving counts. J. Amer. Statist. Assoc. 90 242-252.
  • CHEN, H., GUO, L. and GAO, A. (1988). Convergence and robustness of the Robbins-Monro algorithm truncated at randomly varying bounds. Stochastic Process. Appl. 27 217-231.
  • DALAL, S. and WEERAHANDI, S. (1992). Some approximations for the moments of a process used in diffusion of new products. Statist. Probab. Lett. 15 181-189.
  • DALAL, S. and WEERAHANDI, S. (1995). Estimation of innovation diffusion models with application to a consumer durable. Marketing Letters 6 123-136.
  • DELYON, B., LAVIELLE, M. and MOULINES, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Ann. Statist. 27 94-128.
  • DEMPSTER, A. P., LAIRD, N. M. and RUBIN, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Roy. Statist. Soc. Ser. B 39 1-38.
  • FORT, G., MOULINES, E., ROBERTS, G. O. and ROSENTHAL, J. S. (2003). On the geometric ergodicity of hybrid samplers. J. Appl. Probab. 40 123-146.
  • GUO, S. and THOMPSON, E. (1991). Monte-Carlo estimation of variance component models for large complex pedigrees. IMA Journal of Mathematics Applied in Medicine and Biology 8 171-189.
  • HALL, P. and HEYDE, C. C. (1980). Martingale Limit Theory and Its Application. Academic Press, New York.
  • JAMSHIDIAN, M. and JENNRICH, R. (1997). Acceleration of the EM algorithm by using quasi-Newton methods. J. R. Stat. Soc. Ser. B Stat. Methodol. 59 569-587.
  • MENG, X. and SCHILLING, S. (1996). Fitting full-information item factor models and an empirical investigation of bridge sampling. J. Amer. Statist. Assoc. 91 1254-1267.
  • MEYN, S. P. and TWEEDIE, R. L. (1993). Markov Chains and Stochastic Stability. Springer, London.
  • PIERRE-LOTI-VIAUD, D. (1995). Random perturbations of recursive sequences with an application to an epidemic model. J. Appl. Probab. 32 559-578.
  • PÓLYA, G. and SZEGÖ, G. (1976). Problems and Theorems in Analysis 2. Springer, New York.
  • POLYAK, B. (1990). New stochastic approximation type procedures. Automat. Remote Control 51 937-946.
  • SHAPIRO, A. and WARDI, Y. (1996). Convergence analysis of stochastic algorithms. Math. Oper. Res. 21 615-628.
  • SHERMAN, R., HO, Y. and DALAL, S. (1999). Conditions for convergence of Monte Carlo EM sequences with an application to product diffusion modeling. Econom. J. 2 248-267.
  • TANNER, M. A. (1996). Tools for Statistical Inference, 3rd ed. Springer, New York.
  • WEI, G. and TANNER, M. (1990). A Monte-Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. J. Amer. Statist. Assoc. 85 699-704.
  • WU, C. (1983). On the convergence properties of the EM algorithm. Ann. Statist. 11 95-103.
  • ZEGER, S. L. (1988). A regression model for time series of counts. Biometrika 75 621-629.