Statistical Science

On Model Expansion, Model Contraction, Identifiability and Prior Information: Two Illustrative Scenarios Involving Mismeasured Variables

Paul Gustafson

Source: Statist. Sci. Volume 20, Number 2 (2005), 111-140.

Abstract

When a candidate model for data is nonidentifiable, conventional wisdom dictates that the model must be simplified somehow so as to gain identifiability. We explore two scenarios involving mismeasured variables where, in fact, model expansion, as opposed to model contraction, might be used to obtain identifiability. We compare the merits of model contraction and model expansion. We also investigate whether it is necessarily a good idea to alter the model for the sake of identifiability. In particular, estimators obtained from identifiable models are compared to those obtained from nonidentifiable models in tandem with crude prior distributions. Both asymptotic theory and simulations with Markov chain Monte Carlo-based estimators are used to draw comparisons. A technical point which arises is that the asymptotic behavior of a posterior mean from a nonidentifiable model can be investigated using standard asymptotic theory, once the posterior mean is described in terms of the identifiable part of the model only.
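The closing technical point can be illustrated with a toy example. This sketch is not one of the paper's two scenarios; it assumes a single imperfect diagnostic test with prevalence `r`, sensitivity `sn` and specificity `sp`, so that the data inform only the probability of a positive result, `p = r*sn + (1-r)*(1-sp)`. The uniform priors, the function name `posterior_summary` and the importance-weighting scheme (used here in place of Markov chain Monte Carlo) are all assumptions made for illustration.

```python
import numpy as np

def posterior_summary(y, n, num_draws=200_000, seed=0):
    """Posterior mean and sd of prevalence r in a toy nonidentifiable model.

    Hypothetical setup (not taken from the paper): y positives out of n tests,
    with y ~ Binomial(n, p) and p = r*sn + (1 - r)*(1 - sp).
    Crude priors: r ~ U(0, 1), sn ~ U(0.5, 1), sp ~ U(0.5, 1), independent.
    """
    rng = np.random.default_rng(seed)
    # Draw from the prior over the full (nonidentifiable) parameter vector.
    r = rng.uniform(0.0, 1.0, num_draws)
    sn = rng.uniform(0.5, 1.0, num_draws)
    sp = rng.uniform(0.5, 1.0, num_draws)
    # The likelihood depends on (r, sn, sp) only through the identifiable part p.
    p = r * sn + (1.0 - r) * (1.0 - sp)
    log_w = y * np.log(p) + (n - y) * np.log1p(-p)
    # Self-normalized importance weights, stabilized on the log scale.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    mean = np.sum(w * r)
    sd = np.sqrt(np.sum(w * (r - mean) ** 2))
    return mean, sd

# Same observed proportion (0.30) at two sample sizes.
mean_small, sd_small = posterior_summary(y=30, n=100)
mean_large, sd_large = posterior_summary(y=3000, n=10_000)
print(f"n=100:    mean={mean_small:.3f}, sd={sd_small:.3f}")
print(f"n=10000:  mean={mean_large:.3f}, sd={sd_large:.3f}")
```

Under these assumptions the posterior for `r` does not collapse to a point as `n` grows. The data pin down `p`, and the posterior mean of `r` approaches the prior conditional mean of `r` given `p`, which is a function of the identifiable part of the model only.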

Keywords: Bayes analysis; identifiability; measurement error; misclassification; nested models; prior information

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ss/1121347636
Digital Object Identifier: doi:10.1214/088342305000000098
Mathematical Reviews number (MathSciNet): MR2183445
Zentralblatt MATH identifier: 1087.62037

2009 © Institute of Mathematical Statistics