Bayesian Analysis

A Bayesian structural equations model for multilevel data with missing responses and missing covariates

Ming-Hui Chen, Sonali Das, Sungduk Kim, and Nicholas Warren



Motivated by a large multilevel survey conducted by the US Veterans Health Administration (VHA), we propose a structural equations model that involves a set of latent variables to capture dependence between different responses, a set of facility-level random effects to capture facility heterogeneity and dependence between individuals within the same facility, and a set of covariates to account for individual heterogeneity. Identifiability associated with structural equations modeling is addressed, and properties of the proposed model are carefully examined. An effective and practically useful modeling strategy is developed to deal with missing responses and to model missing covariates in the structural equations framework. Markov chain Monte Carlo sampling is used to carry out Bayesian posterior computation. Several variations of the proposed model are considered and compared via the deviance information criterion. A detailed analysis of the VHA all employee survey data is presented to illustrate the proposed methodology.

Article information

Bayesian Anal., Volume 3, Number 1 (2008), 197-224.

First available in Project Euclid: 22 June 2012


Keywords: DIC; Latent variable; Markov chain Monte Carlo; Missing at Random; Random effects; VHA all employee survey data


Das, Sonali; Chen, Ming-Hui; Kim, Sungduk; Warren, Nicholas. A Bayesian structural equations model for multilevel data with missing responses and missing covariates. Bayesian Anal. 3 (2008), no. 1, 197–224. doi:10.1214/08-BA308.


