Statistical Science

Test Martingales, Bayes Factors and p-Values

Glenn Shafer, Alexander Shen, Nikolai Vereshchagin, and Vladimir Vovk



A nonnegative martingale with initial value equal to one measures evidence against a probabilistic hypothesis. The inverse of its value at some stopping time can be interpreted as a Bayes factor. If we exaggerate the evidence by considering the largest value attained so far by such a martingale, the exaggeration will be limited, and there are systematic ways to eliminate it. The inverse of the exaggerated value at some stopping time can be interpreted as a p-value. We give a simple characterization of all increasing functions that eliminate the exaggeration.

Article information

Statist. Sci., Volume 26, Number 1 (2011), 84-101.

First available in Project Euclid: 9 June 2011

Keywords: Bayes factors; evidence; hypothesis testing; martingales; p-values


Shafer, Glenn; Shen, Alexander; Vereshchagin, Nikolai; Vovk, Vladimir. Test Martingales, Bayes Factors and p-Values. Statist. Sci. 26 (2011), no. 1, 84–101. doi:10.1214/10-STS347.


