Bayesian Analysis

Bayesian diagnostic techniques for detecting hierarchical structure

Abstract

Motivated by an increasing number of Bayesian hierarchical model applications, the objective of this paper is to evaluate properties of several diagnostic techniques when the fitted model includes some hierarchical structure, but the data are from a model with additional, unknown hierarchical structure. Because there appears to have been no evaluation of Bayesian diagnostics used for this purpose, we start by studying the simple situation where the data come from a normal model with two-stage hierarchical structure while the fitted model does not have any hierarchical structure, and then extend this to the case where the fitted model has two-stage normal hierarchical structure while the data come from a model with three-stage normal structure. We use exact derivations, large sample approximations and numerical examples to evaluate the quality of the diagnostic techniques. Our investigation suggests two promising techniques: the distribution of individual posterior predictive $p$ values, and the conventional posterior predictive $p$ value with the $F$ statistic as a checking function. We show that, at least for large sample sizes, these $p$ values are uniformly distributed under the null model and are effective in detecting hierarchical structure not included in the null model. Finally, we apply these two techniques to examine the fit of a model for data from the Patterns of Care Study, a two-stage cluster sample of cancer patients undergoing radiation therapy.
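
As a rough illustration of the second technique in its simplest setting (a fitted normal model with no hierarchical structure), the following Python sketch estimates a posterior predictive $p$ value using a one-way ANOVA $F$ statistic as the checking function. It is a minimal sketch under stated assumptions, not the paper's implementation: the noninformative prior $p(\mu, \sigma^2) \propto 1/\sigma^2$, the balanced design, the Monte Carlo estimate, and the names `f_statistic` and `posterior_predictive_p` are all illustrative choices that do not come from the abstract.

```python
import numpy as np


def f_statistic(y):
    """One-way ANOVA F statistic for a balanced (groups x units) array."""
    n_groups, n_per = y.shape
    group_means = y.mean(axis=1)
    grand_mean = y.mean()
    ms_between = n_per * np.sum((group_means - grand_mean) ** 2) / (n_groups - 1)
    ms_within = np.sum((y - group_means[:, None]) ** 2) / (n_groups * (n_per - 1))
    return ms_between / ms_within


def posterior_predictive_p(y, n_draws=2000, seed=0):
    """Monte Carlo estimate of P(F(y_rep) >= F(y) | y) under the null model
    y_ij ~ N(mu, sigma^2), with the assumed prior p(mu, sigma^2) ∝ 1/sigma^2."""
    rng = np.random.default_rng(seed)
    n_total = y.size
    f_obs = f_statistic(y)
    s2 = y.var(ddof=1)
    exceed = 0
    for _ in range(n_draws):
        # Posterior draw: sigma^2 | y is scaled inverse chi-square,
        # mu | sigma^2, y is normal around the grand mean.
        sigma2 = (n_total - 1) * s2 / rng.chisquare(n_total - 1)
        mu = rng.normal(y.mean(), np.sqrt(sigma2 / n_total))
        # Replicated data set from the null (non-hierarchical) model.
        y_rep = rng.normal(mu, np.sqrt(sigma2), size=y.shape)
        exceed += f_statistic(y_rep) >= f_obs
    return exceed / n_draws


# Simulated data with two-stage structure that the null model ignores:
# group means theta_i ~ N(0, 1), observations y_ij ~ N(theta_i, 1).
rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, size=20)
y_hier = rng.normal(theta[:, None], 1.0, size=(20, 10))
p_hier = posterior_predictive_p(y_hier)
print(p_hier)
```

A small value of `p_hier` indicates more between-group variation than the fitted model can reproduce. Because the $F$ statistic is unchanged by shifting or rescaling the data, its replicated distribution in this sketch does not depend on the posterior draw of $(\mu, \sigma^2)$, which is one way to see why such a $p$ value can be uniformly distributed under the null model.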

Article information

Source
Bayesian Anal., Volume 2, Number 4 (2007), 735-760.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340370713

Digital Object Identifier
doi:10.1214/07-BA230

Mathematical Reviews number (MathSciNet)
MR2361973

Zentralblatt MATH identifier
1331.62067

Citation

Yan, Guofen; Sedransk, J. Bayesian diagnostic techniques for detecting hierarchical structure. Bayesian Anal. 2 (2007), no. 4, 735-760. doi:10.1214/07-BA230. https://projecteuclid.org/euclid.ba/1340370713
