Statistical Science

Cross-Fertilizing Strategies for Better EM Mountain Climbing and DA Field Exploration: A Graphical Guide Book

David A. van Dyk and Xiao-Li Meng


Abstract

In recent years, a variety of extensions and refinements have been developed for data-augmentation-based model-fitting routines. These developments aim to extend the application, improve the speed, and/or simplify the implementation of data augmentation methods, such as the deterministic EM algorithm for mode finding and the stochastic Gibbs sampler and other auxiliary-variable-based methods for posterior sampling. In this overview article, we graphically illustrate and compare a number of these extensions, all of which aim to maintain the simplicity and computational stability of their predecessors. We particularly emphasize the usefulness of identifying similarities between the deterministic and stochastic counterparts as we seek more efficient computational strategies. We also demonstrate the applicability of data augmentation methods for handling complex models with highly hierarchical structure, using a high-energy, high-resolution spectral imaging model for data from satellite telescopes such as the Chandra X-ray Observatory.

Article information

Source
Statist. Sci., Volume 25, Number 4 (2010), 429-449.

Dates
First available in Project Euclid: 14 March 2011

Permanent link to this document
https://projecteuclid.org/euclid.ss/1300108229

Digital Object Identifier
doi:10.1214/09-STS309

Mathematical Reviews number (MathSciNet)
MR2807762

Zentralblatt MATH identifier
1329.62040

Keywords
AECM; blocking; collapsing; conditional augmentation; ECM; ECME; efficient augmentation; data augmentation; Gibbs sampling; marginal augmentation; model reduction; NEM; nesting

Citation

van Dyk, David A.; Meng, Xiao-Li. Cross-Fertilizing Strategies for Better EM Mountain Climbing and DA Field Exploration: A Graphical Guide Book. Statist. Sci. 25 (2010), no. 4, 429–449. doi:10.1214/09-STS309. https://projecteuclid.org/euclid.ss/1300108229


