The Annals of Statistics

Invariant P-values for model checking

Michael Evans and Gun Ho Jang

Full-text: Open access


P-values have been the focus of considerable criticism based on various considerations. Still, the P-value represents one of the most commonly used statistical tools. When assessing the suitability of a single hypothesized distribution, it is not clear that there is a better choice for a measure of surprise. This paper is concerned with the definition of appropriate model-based P-values for model checking.

Article information

Ann. Statist., Volume 38, Number 1 (2010), 512-525.

First available in Project Euclid: 31 December 2009

Primary: 62F99: None of the above, but in this section

P-values; invariance under transformations; discrepancy measures for model checking


Evans, Michael; Jang, Gun Ho. Invariant P-values for model checking. Ann. Statist. 38 (2010), no. 1, 512–525. doi:10.1214/09-AOS727.

