Source: Ann. Statist. Volume 38, Number 1 (2010), 512–525.
P-values have been the focus of considerable criticism based on various considerations. Still, the P-value represents one of the most commonly used statistical tools. When assessing the suitability of a single hypothesized distribution, it is not clear that there is a better choice for a measure of surprise. This paper is concerned with the definition of appropriate model-based P-values for model checking.