Bayesian Analysis

Model Criticism in Latent Space

Sohan Seth, Iain Murray, and Christopher K. I. Williams

Full-text: Open access


Model criticism is usually carried out by assessing whether replicated data generated under the fitted model look similar to the observed data; see, e.g., Gelman, Carlin, Stern, and Rubin (2004, p. 165). This paper presents a method for latent variable models that pulls the data back into the space of latent variables and carries out model criticism in that space. Making use of a model's structure enables a more direct assessment of the assumptions made in the prior and likelihood. We demonstrate the method with examples of model criticism in latent space applied to factor analysis, linear dynamical systems and Gaussian processes.
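
As a rough illustration of the idea for factor analysis, the sketch below draws one posterior sample of the latent vector for each observation and compares the pooled samples with the standard normal prior. It is a minimal sketch under stated assumptions, not the authors' implementation: the model parameters (`W`, `mu`, `psi`) are taken as known rather than fitted, the helper names `sample_latents` and `excess_kurtosis` are hypothetical, and excess kurtosis is only one possible discrepancy measure, chosen here for simplicity.

```python
import numpy as np

def sample_latents(X, W, mu, psi, rng):
    """Draw one sample z_n ~ p(z | x_n) per row of X under the factor analysis
    model x = W z + mu + eps, with z ~ N(0, I) and eps ~ N(0, diag(psi))."""
    k = W.shape[1]
    Wt_Pinv = W.T / psi                           # W^T Psi^{-1}, shape (k, d)
    cov = np.linalg.inv(np.eye(k) + Wt_Pinv @ W)  # posterior covariance, same for all n
    mean = (X - mu) @ Wt_Pinv.T @ cov             # posterior means, shape (n, k)
    L = np.linalg.cholesky(cov)
    return mean + rng.standard_normal(mean.shape) @ L.T

def excess_kurtosis(Z):
    """Per-dimension sample excess kurtosis (close to 0 for Gaussian samples)."""
    Zc = Z - Z.mean(axis=0)
    return (Zc**4).mean(axis=0) / (Zc**2).mean(axis=0) ** 2 - 3.0

rng = np.random.default_rng(0)
n, d, k = 20000, 5, 2
# Assumed (hypothetical) model parameters, treated as already fitted.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.5, 0.5]])
mu = np.zeros(d)
psi = np.full(d, 0.1)

def simulate(Z):
    """Generate observations from the given latents under the assumed likelihood."""
    noise = rng.standard_normal((len(Z), d)) * np.sqrt(psi)
    return Z @ W.T + mu + noise

# Case 1: latents really follow the N(0, I) prior.
Z_good = rng.standard_normal((n, k))
# Case 2: latents have unit variance but heavy (Laplace) tails, violating the prior.
Z_bad = rng.laplace(scale=1.0 / np.sqrt(2.0), size=(n, k))

kurt_good = excess_kurtosis(sample_latents(simulate(Z_good), W, mu, psi, rng))
kurt_bad = excess_kurtosis(sample_latents(simulate(Z_bad), W, mu, psi, rng))
print("excess kurtosis, prior holds:   ", kurt_good)
print("excess kurtosis, prior violated:", kurt_bad)
```

When the data are generated from the assumed model, the pooled posterior samples are distributed according to the prior, so the statistic should be near zero. A clear departure, as in the heavy-tailed case, points at the prior assumption specifically rather than at the fit as a whole.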

Article information

Bayesian Anal., Volume 14, Number 3 (2019), 703-725.

First available in Project Euclid: 11 June 2019

Digital Object Identifier: doi:10.1214/18-BA1124

Keywords: model criticism; latent variable models; factor analysis; linear dynamical systems; Gaussian processes

Creative Commons Attribution 4.0 International License.


Seth, Sohan; Murray, Iain; Williams, Christopher K. I. Model Criticism in Latent Space. Bayesian Anal. 14 (2019), no. 3, 703--725. doi:10.1214/18-BA1124.


  • Bayarri, M. J. and Berger, J. O. (2000). “p-values for Composite Null Models.” Journal of the American Statistical Association, 95(452): 1127–1142.
  • Belin, T. R. and Rubin, D. B. (1995). “The Analysis of Repeated-Measures Data on Schizophrenic Reaction Times using Mixture Models.” Statistics in Medicine, 14(8): 747–768.
  • Box, G. E. (1980). “Sampling and Bayes’ Inference in Scientific Modelling and Robustness.” Journal of the Royal Statistical Society, 143(4): 383–430.
  • Box, G. E. P. and Draper, N. R. (1987). Empirical Model-Building and Response Surfaces. Wiley.
  • Breusch, T. S. and Pagan, A. R. (1979). “A Simple Test for Heteroscedasticity and Random Coefficient Variation.” Econometrica, 47(5): 1287–1294.
  • Buccigrossi, R. P. and Simoncelli, E. P. (1999). “Image Compression via Joint Statistical Characterization in the Wavelet Domain.” IEEE Transactions on Signal Processing, 8(12): 1688–1701.
  • Candy, J. V. (1986). Signal Processing: The Model Based Approach. McGraw-Hill.
  • Cook, S. R., Gelman, A., and Rubin, D. B. (2006). “Validation of Software for Bayesian Models Using Posterior Quantiles.” Journal of Computational and Graphical Statistics, 15(3): 675–692.
  • Durbin, J. and Watson, G. S. (1950). “Testing for Serial Correlation in Least Squares Regression: I.” Biometrika, 37(3/4): 409–428.
  • Fox, E., Sudderth, E. B., Jordan, M. I., and Willsky, A. S. (2009). “Nonparametric Bayesian Learning of Switching Linear Dynamical Systems.” In Advances in Neural Information Processing Systems 21, 457–464.
  • Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2004). Bayesian Data Analysis. London: Chapman and Hall. Second edition.
  • Gelman, A., Meng, X., and Stern, H. (1996). “Posterior Predictive Assessment of Model Fitness Via Realized Discrepancies.” Statistica Sinica, 733–807.
  • Gopalan, P., Hofman, J. M., and Blei, D. M. (2015). “Scalable Recommendation with Hierarchical Poisson Factorization.” In Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, UAI, 326–335.
  • Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). “A Fast Learning Algorithm for Deep Belief Nets.” Neural Computation, 18: 1527–1554.
  • Johnson, V. E. (2007). “Bayesian Model Assessment Using Pivotal Quantities.” Bayesian Analysis, 2(4): 719–733.
  • Lloyd, J. R. and Ghahramani, Z. (2015). “Statistical Model Criticism Using Kernel Two Sample Tests.” In Advances in Neural Information Processing Systems.
  • Martin, D., Fowlkes, C., Tal, D., and Malik, J. (2001). “A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics.” In Proceedings of 8th International Conference on Computer Vision, volume 2, 416–423.
  • Meulders, M., Gelman, A., Van Mechelen, I., and De Boeck, P. (1998). “Generalizing the Probability Matrix Decomposition Model: an Example of Bayesian Model Checking and Model Expansion.” In Hox, J. J. and de Leeuw, E. D. (eds.), Assumptions, Robustness and Estimation Methods in Multivariate Modeling. Amsterdam: TT-Publikaties.
  • Oh, S. M., Rehg, J. M., Balch, T., and Dellaert, F. (2008). “Learning and Inferring Motion Patterns using Parametric Segmental Switching Linear Dynamic Systems.” International Journal of Computer Vision, 77(1): 103–124.
  • O’Hagan, A. (2003). “HSSS Model Criticism.” In Green, P. J., Hjort, N. L., and Richardson, S. (eds.), Highly Structured Stochastic Systems, 422–444. Oxford University Press.
  • Plummer, M. (2003). “JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling.” In Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003).
  • Rahman, N. A. (1968). A Course in Theoretical Statistics. Charles Griffin and Company.
  • Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. The MIT Press.
  • Ratmann, O., Andrieu, C., Wiuf, C., and Richardson, S. (2009). “Model Criticism Based on Likelihood-Free Inference, with an Application to Protein Network Evolution.” Proceedings of the National Academy of Sciences, 106(26): 10576–10581.
  • Rubin, D. B. (1984). “Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician.” Annals of Statistics, 12: 1151–1172.
  • Salakhutdinov, R. and Mnih, A. (2008). “Bayesian Probabilistic Matrix Factorization using Markov chain Monte Carlo.” In Proceedings of the International Conference on Machine Learning, volume 25.
  • Tang, Y., Salakhutdinov, R., and Hinton, G. E. (2012). “Deep Mixtures of Factor Analysers.” In Proceedings of the 29th International Conference on Machine Learning.
  • Vanhatalo, J., Riihimäki, J., Hartikainen, J., Jylänki, P., Tolvanen, V., and Vehtari, A. (2013). “GPstuff: Bayesian Modeling with Gaussian Processes.” Journal of Machine Learning Research, 14(1): 1175–1179.
  • Wainwright, M. J. and Simoncelli, E. P. (2000). “Scale Mixtures of Gaussians and the Statistics of Natural Images.” In Advances in Neural Information Processing Systems, volume 12, 855–861.
  • White, H. (1980). “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity.” Econometrica, 48(4): 817–838.
  • Wilk, M. B. and Gnanadesikan, R. (1968). “Probability Plotting Methods for the Analysis of Data.” Biometrika, 55(1): 1–17.
  • Yuan, Y. and Johnson, V. E. (2012). “Goodness-of-Fit Diagnostics for Bayesian Hierarchical Models.” Biometrics, 68(1): 156–164.
  • Zoran, D. and Weiss, Y. (2012). “Natural Images, Gaussian Mixtures and Dead Leaves.” In Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems 25, 1736–1744. Curran Associates, Inc.