Statistical Science

On Model Expansion, Model Contraction, Identifiability and Prior Information: Two Illustrative Scenarios Involving Mismeasured Variables

Paul Gustafson

Full-text: Open access


When a candidate model for data is nonidentifiable, conventional wisdom dictates that the model must be simplified somehow so as to gain identifiability. We explore two scenarios involving mismeasured variables where, in fact, model expansion, as opposed to model contraction, might be used to obtain identifiability. We compare the merits of model contraction and model expansion. We also investigate whether it is necessarily a good idea to alter the model for the sake of identifiability. In particular, estimators obtained from identifiable models are compared to those obtained from nonidentifiable models in tandem with crude prior distributions. Both asymptotic theory and simulations with Markov chain Monte Carlo-based estimators are used to draw comparisons. A technical point which arises is that the asymptotic behavior of a posterior mean from a nonidentifiable model can be investigated using standard asymptotic theory, once the posterior mean is described in terms of the identifiable part of the model only.
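
As a generic illustration of what nonidentifiability means when a variable is mismeasured, consider a minimal sketch that is not taken from either of the paper's two scenarios. It assumes a single imperfect binary test with unknown prevalence, sensitivity and specificity. The function name and the numerical settings are hypothetical, chosen only to show that two distinct parameter settings can imply exactly the same distribution for the observed data.

```python
# Hypothetical illustration: one imperfect binary test, no gold standard.
# The observed data depend on the parameters only through the probability
# of a positive test result, so distinct parameter settings can be
# indistinguishable no matter how much data are collected.

def apparent_prevalence(prevalence, sensitivity, specificity):
    """Probability that the imperfect test reads positive."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives + false_positives

# Two different (assumed) parameter settings.
setting_a = dict(prevalence=0.20, sensitivity=0.90, specificity=0.90)
setting_b = dict(prevalence=0.40, sensitivity=0.65, specificity=1.00)

p_a = apparent_prevalence(**setting_a)
p_b = apparent_prevalence(**setting_b)

# Both settings give the same distribution for the observed test result.
print(f"setting A: P(test positive) = {p_a:.4f}")
print(f"setting B: P(test positive) = {p_b:.4f}")
```

In this toy setting only the apparent prevalence is identifiable. Anything further that is learned about the three underlying parameters has to come from prior information or from changing the model, which is the kind of trade-off the abstract describes.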

Article information

Statist. Sci. Volume 20, Number 2 (2005), 111-140.

First available in Project Euclid: 14 July 2005

Keywords: Bayes analysis; identifiability; measurement error; misclassification; nested models; prior information


Gustafson, Paul. On Model Expansion, Model Contraction, Identifiability and Prior Information: Two Illustrative Scenarios Involving Mismeasured Variables. Statist. Sci. 20 (2005), no. 2, 111-140. doi:10.1214/088342305000000098.


