Brazilian Journal of Probability and Statistics

CADEM: A conditional augmented data EM algorithm for fitting one parameter probit models

C. L. N. Azevedo and D. F. Andrade



In this article we develop an estimation method based on the augmented data scheme and the EM/SEM (Stochastic EM) algorithms for fitting one-parameter probit (Rasch) IRT (Item Response Theory) models. Instead of using the S steps of the SEM algorithm, that is, instead of simulating values for the unobserved variables (the augmented data and the latent traits), we take the conditional expectation of one set of unobserved variables given the other set of unobserved variables, the current parameter estimates, and the observed data, based on the full conditional distributions from the Gibbs sampling algorithm. Our method, named the CADEM algorithm (conditional augmented data EM), has straightforward E steps that avoid evaluating the usual integrals. This also simplifies the M steps, which require no numerical optimization. We use the CADEM algorithm to obtain both maximum likelihood estimates and maximum a posteriori estimates of the difficulty parameters of the one-parameter probit (Rasch) model. We also obtain estimates of the latent traits, based on conditional expectations, and show how to calculate the associated standard errors. Some directions are provided for extending our approach to other IRT models. Finally, we perform a simulation study to compare the estimation methods. The results indicate that our approach is comparable to the usual marginal maximum likelihood (MML) and Gibbs sampling (GS) methods in terms of parameter recovery, while being as fast as MML and as flexible as GS.
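The idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the usual augmented-data formulation of the one-parameter probit model, Z_ij ~ N(theta_j - b_i, 1) with Y_ij = 1 exactly when Z_ij > 0, a standard normal prior on the latent traits, and a flat prior on the difficulties. The function names (`expected_z`, `conditional_expectation_fit`), the starting values, and the stopping rule are hypothetical choices made for illustration. Standard errors and the maximum a posteriori variant for the difficulties are not covered.

```python
import math
import numpy as np

# Vectorised error function, so the sketch needs only NumPy and the standard library.
_erf = np.vectorize(math.erf)


def norm_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))


def expected_z(mu, y):
    """Mean of N(mu, 1) truncated to (0, inf) where y == 1 and to (-inf, 0] where y == 0."""
    pdf = norm_pdf(mu)
    cdf = np.clip(norm_cdf(mu), 1e-12, 1.0 - 1e-12)  # guard against division by zero
    return np.where(y == 1, mu + pdf / cdf, mu - pdf / (1.0 - cdf))


def conditional_expectation_fit(y, n_iter=200, tol=1e-6):
    """Iterate conditional expectations for a one-parameter probit model.

    y is an (n_items, n_subjects) array of 0/1 responses.
    Returns (b, theta): difficulty estimates and latent trait estimates.
    """
    n_items, n_subjects = y.shape
    b = np.zeros(n_items)          # assumed starting values
    theta = np.zeros(n_subjects)   # assumed starting values
    for _ in range(n_iter):
        mu = theta[None, :] - b[:, None]
        # Expectation of the augmented data given theta, b and y,
        # used in place of a simulated draw.
        z = expected_z(mu, y)
        # Expectation of theta_j given z and b under the assumed N(0, 1) prior.
        theta = (z + b[:, None]).sum(axis=0) / (n_items + 1.0)
        # Closed-form update for b_i given z and theta (flat prior assumed).
        b_new = (theta[None, :] - z).mean(axis=1)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b, theta
```

Calling `conditional_expectation_fit(y)` on a 0/1 response matrix with items in rows and subjects in columns returns the two estimate vectors. Every update is available in closed form, which reflects the abstract's point that neither numerical integration nor numerical optimization is needed.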

Article information

Braz. J. Probab. Stat. Volume 27, Number 2 (2013), 245-262.

First available in Project Euclid: 21 February 2013


Keywords: Item response models; maximum likelihood; Bayesian estimates; augmented data; EM algorithm


Azevedo, C. L. N.; Andrade, D. F. CADEM: A conditional augmented data EM algorithm for fitting one parameter probit models. Braz. J. Probab. Stat. 27 (2013), no. 2, 245–262. doi:10.1214/11-BJPS172.


