## Probability Surveys

### Time-uniform Chernoff bounds via nonnegative supermartingales

#### Abstract

We develop a class of exponential bounds for the probability that a martingale sequence crosses a time-dependent linear threshold. Our key insight is that it is both natural and fruitful to formulate exponential concentration inequalities in this way. We illustrate this point by presenting a single assumption and theorem that together unify and strengthen many tail bounds for martingales, including classical inequalities (1960–80) by Bernstein, Bennett, Hoeffding, and Freedman; contemporary inequalities (1980–2000) by Shorack and Wellner, Pinelis, Blackwell, van de Geer, and de la Peña; and several modern inequalities (post-2000) by Khan, Tropp, Bercu and Touati, Delyon, and others. In each of these cases, we give the strongest and most general statements to date, quantifying the time-uniform concentration of scalar, matrix, and Banach-space-valued martingales, under a variety of nonparametric assumptions in discrete and continuous time. In doing so, we bridge the gap between existing line-crossing inequalities, the sequential probability ratio test, the Cramér-Chernoff method, self-normalized processes, and other parts of the literature.

#### Article information

Source
Probab. Surveys, Volume 17 (2020), 257-317.

Dates
First available in Project Euclid: 20 May 2020

https://projecteuclid.org/euclid.ps/1589940218

Digital Object Identifier
doi:10.1214/18-PS321

#### Citation

Howard, Steven R.; Ramdas, Aaditya; McAuliffe, Jon; Sekhon, Jasjeet. Time-uniform Chernoff bounds via nonnegative supermartingales. Probab. Surveys 17 (2020), 257--317. doi:10.1214/18-PS321. https://projecteuclid.org/euclid.ps/1589940218

#### References

• Ahlswede, R. and Winter, A. (2002), ‘Strong converse for identification via quantum channels’, IEEE Trans. Inf. Theor. 48(3), 569–579.
• Azuma, K. (1967), ‘Weighted sums of certain dependent random variables.’, Tohoku Mathematical Journal 19(3), 357–367.
• Bacry, E., Gaïffas, S. and Muzy, J.-F. (2018), ‘Concentration inequalities for matrix martingales in continuous time’, Probability Theory and Related Fields 170(1-2), 525–553.
• Ball, K., Carlen, E. A. and Lieb, E. H. (1994), ‘Sharp uniform convexity and smoothness inequalities for trace norms’, Inventiones Mathematicae 115(1), 463–482.
• Barlow, M. T., Jacka, S. D. and Yor, M. (1986), ‘Inequalities for a pair of processes stopped at a random time’, Proceedings of the London Mathematical Society s3-52(1), 142–172.
• Bennett, G. (1962), ‘Probability inequalities for the sum of independent random variables’, Journal of the American Statistical Association 57(297), 33–45.
• Bercu, B., Delyon, B. and Rio, E. (2015), Concentration inequalities for sums and martingales, Springer International Publishing, Cham.
• Bercu, B. and Touati, A. (2008), ‘Exponential inequalities for self-normalized martingales with applications’, The Annals of Applied Probability 18(5), 1848–1869.
• Bernstein, S. (1927), Theory of probability, Gastehizdat Publishing House, Moscow.
• Blackwell, D. (1997), Large deviations for martingales, in ‘Festschrift for Lucien Le Cam’, Springer, New York, NY, pp. 89–91.
• Blackwell, D. and Freedman, D. A. (1973), ‘On the amount of variance needed to escape from a strip’, The Annals of Probability 1(5), 772–787.
• Boucheron, S., Lugosi, G. and Massart, P. (2013), Concentration inequalities: a nonasymptotic theory of independence, 1st edn, Oxford University Press, Oxford.
• Chen, X. (2012a), A statistical approach for performance analysis of uncertain systems, in R. E. Karlsen, D. W. Gage, C. M. Shoemaker and G. R. Gerhart, eds, ‘Unmanned Systems Technology XIV’, Vol. 8387, International Society for Optics and Photonics, SPIE, pp. 583–594.
• Chen, X. (2012b), ‘New optional stopping theorems and maximal inequalities on stochastic processes’, 1207.3733 [math, stat].
• Chernoff, H. (1952), ‘A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations’, The Annals of Mathematical Statistics 23(4), 493–507.
• Christofides, D. and Markström, K. (2007), ‘Expansion properties of random Cayley graphs and vertex transitive graphs via matrix martingales’, Random Structures & Algorithms 32(1), 88–100.
• Chung, F. and Lu, L. (2006), ‘Concentration inequalities and martingale inequalities: a survey’, Internet Mathematics 3(1), 79–127.
• Craig, C. C. (1933), ‘On the Tchebychef inequality of Bernstein’, The Annals of Mathematical Statistics 4(2), 94–102.
• Cramér, H. (1938), ‘Sur un nouveau théorème-limite de la théorie des probabilités’, Actualités Scientifiques 736.
• Darling, D. A. and Robbins, H. (1967), ‘Iterated logarithm inequalities’, Proceedings of the National Academy of Sciences 57(5), 1188–1192.
• de la Peña, V. H. (1999), ‘A general class of exponential inequalities for martingales and ratios’, The Annals of Probability 27(1), 537–564.
• de la Peña, V. H. and Giné, E. (1999), Decoupling, Probability and its applications, Springer New York, New York, NY.
• de la Peña, V. H., Klass, M. J. and Lai, T. L. (2000), Moment bounds for self-normalized martingales, in ‘High Dimensional Probability II’, Birkhäuser, Boston, MA, pp. 3–11.
• de la Peña, V. H., Klass, M. J. and Lai, T. L. (2001), Self-normalized processes: exponential inequalities, moments, and limit theorems. Stanford University Technical Report No. 2001-6.
• de la Peña, V. H., Klass, M. J. and Lai, T. L. (2004), ‘Self-normalized processes: exponential inequalities, moment bounds and iterated logarithm laws’, The Annals of Probability 32(3), 1902–1933.
• de la Peña, V. H., Klass, M. J. and Lai, T. L. (2007), ‘Pseudo-maximization and self-normalized processes’, Probability Surveys 4, 172–192.
• de la Peña, V. H., Klass, M. J. and Lai, T. L. (2009), ‘Theory and applications of multivariate self-normalized processes’, Stochastic Processes and their Applications 119(12), 4210–4227.
• de la Peña, V. H., Lai, T. L. and Shao, Q.-M. (2009), Self-normalized processes: limit theory and statistical applications, Springer, Berlin.
• Delyon, B. (2009), ‘Exponential inequalities for sums of weakly dependent variables’, Electronic Journal of Probability 14, 752–779.
• Delyon, B. (2015), Exponential inequalities for dependent processes, Technical report.
• Dembo, A. and Zeitouni, O. (2010), Large deviations techniques and applications, Springer, Berlin, Heidelberg.
• Doob, J. L. (1940), ‘Regularity properties of certain families of chance variables’, 47(3), 455–486.
• Dubins, L. E. and Savage, L. J. (1965), ‘A Tchebycheff-like inequality for stochastic processes’, Proceedings of the National Academy of Sciences 53(2), 274–275.
• Durrett, R. (2017), Probability: theory and examples, 5th edn.
• Efron, B. (1969), ‘Student’s $t$-test under symmetry conditions’, Journal of the American Statistical Association 64(328), 1278–1302.
• Fan, X., Grama, I. and Liu, Q. (2012), ‘Hoeffding’s inequality for supermartingales’, Stochastic Processes and their Applications 122(10), 3545–3559.
• Fan, X., Grama, I. and Liu, Q. (2015), ‘Exponential inequalities for martingales with applications’, Electronic Journal of Probability 20(1), 1–22.
• Freedman, D. A. (1975), ‘On tail probabilities for martingales’, The Annals of Probability 3(1), 100–118.
• Galambos, J. (1978), The asymptotic theory of extreme order statistics, Wiley. Google-Books-ID: O06aAAAAIAAJ.
• Gittens, A. and Tropp, J. A. (2011), ‘Tail bounds for all eigenvalues of a sum of random matrices’, ACM Report 2014-02, Caltech.
• Godwin, H. J. (1955), ‘On generalizations of Tchebychef’s inequality’, Journal of the American Statistical Association 50(271), 923–945.
• Hoeffding, W. (1963), ‘Probability inequalities for sums of bounded random variables’, Journal of the American Statistical Association 58(301), 13–30.
• Jorgensen, B. (1997), The theory of dispersion models, CRC Press.
• Kearns, M. and Saul, L. (1998), Large deviation methods for approximate probabilistic inference, in ‘Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence’, UAI’98, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 311–319.
• Khan, R. A. (2009), ‘${L}_{p}$-version of the Dubins–Savage inequality and some exponential inequalities’, Journal of Theoretical Probability 22(2), 348.
• Koltchinskii, V. and Lounici, K. (2017), ‘Concentration inequalities and moment bounds for sample covariance operators’, Bernoulli 23(1), 110–133.
• Lepingle, D. (1978), Sur le comportement asymptotique des martingales locales, in A. Dold, B. Eckmann, C. Dellacherie, P. A. Meyer and M. Weil, eds, ‘Séminaire de Probabilités XII’, Vol. 649, Springer, Berlin, Heidelberg, pp. 148–161.
• Lieb, E. H. (1973), ‘Convex trace functions and the Wigner–Yanase–Dyson conjecture’, Advances in Mathematics 11, 267–288.
• Logan, B. F., Mallows, C. L., Rice, S. O. and Shepp, L. A. (1973), ‘Limit distributions of self-normalized sums’, The Annals of Probability 1(5), 788–809.
• Mackey, L., Jordan, M. I., Chen, R. Y., Farrell, B. and Tropp, J. A. (2014), ‘Matrix concentration inequalities via the method of exchangeable pairs’, The Annals of Probability 42(3), 906–945.
• Mazliak, L. and Shafer, G., eds (2009), The splendors and miseries of martingales, Electronic Journal for History of Probability and Statistics 5(1). [Special issue]. http://www.jehps.net/juin2009.html.
• McDiarmid, C. (1998), Concentration, in M. Habib, C. McDiarmid, J. Ramirez-Alfonsin and B. Reed, eds, ‘Probabilistic Methods for Algorithmic Discrete Mathematics’, Springer, New York, pp. 195–248.
• Minsker, S. (2017), ‘On some extensions of Bernstein’s inequality for self-adjoint operators’, Statistics and Probability Letters 127, 111–119.
• Oliveira, R. (2010a), ‘The spectrum of random $k$-lifts of large graphs (with possibly large $k$)’, Journal of Combinatorics 1(3), 285–306.
• Oliveira, R. (2010b), ‘Sums of random Hermitian matrices and an inequality by Rudelson’, Electronic Communications in Probability 15, 203–212.
• Papapantoleon, A. (2008), ‘An introduction to Lévy processes with applications in finance’, arXiv:0804.0482.
• Pinelis, I. (1992), An approach to inequalities for the distributions of infinite-dimensional martingales, in ‘Probability in Banach Spaces, 8: Proceedings of the Eighth International Conference’, Birkhäuser, Boston, MA, pp. 128–134.
• Pinelis, I. (1994), ‘Optimum bounds for the distributions of martingales in Banach spaces’, The Annals of Probability 22(4), 1679–1706.
• Prokhorov, A. V. (1995), Bernstein inequality, in M. Hazewinkel, ed., ‘Encyclopaedia of Mathematics’, Vol. 1 of Encyclopaedia of Mathematics, Springer, Boston, MA, p. 365.
• Prokhorov, Y. V. (1959), ‘An extremal problem in probability theory’, Theory of Probability & Its Applications 4(2), 201–203.
• Protter, P. E. (2005), Stochastic integration and differential equations, Springer Science & Business Media.
• Raginsky, M. and Sason, I. (2012), ‘Concentration of measure inequalities in information theory, communications and coding (second edition)’, arXiv:1212.4663 [cs, math].
• Robbins, H. (1970), ‘Statistical methods related to the law of the iterated logarithm’, The Annals of Mathematical Statistics 41(5), 1397–1409.
• Robbins, H. and Siegmund, D. (1970), ‘Boundary crossing probabilities for the Wiener process and sample sums’, The Annals of Mathematical Statistics 41(5), 1410–1429.
• Rockafellar, R. T. (1970), Convex analysis, Princeton mathematical series, Princeton University Press, Princeton, N.J.
• Rudelson, M. (1999), ‘Random vectors in the isotropic position’, Journal of Functional Analysis 164(1), 60–72.
• Shao, Q.-M. (1997), ‘Self-normalized large deviations’, The Annals of Probability 25(1), 285–328.
• Shorack, G. R. and Wellner, J. A. (1986), Empirical processes with applications to statistics, Wiley, New York.
• Siegmund, D. (1985), Sequential analysis, Springer New York, New York, NY.
• Tropp, J. A. (2011), ‘Freedman’s inequality for matrix martingales’, Electronic Communications in Probability 16, 262–270.
• Tropp, J. A. (2012), ‘User-friendly tail bounds for sums of random matrices’, Foundations of Computational Mathematics 12(4), 389–434.
• Tropp, J. A. (2015), ‘An introduction to matrix concentration inequalities’, Foundations and Trends in Machine Learning 8(1-2), 1–230.
• Uspensky, J. V. (1937), Introduction to mathematical probability, 1st edn., McGraw-Hill Book Company, Inc, New York, London.
• van de Geer, S. (1995), ‘Exponential inequalities for martingales, with application to maximum likelihood estimation for counting processes’, The Annals of Statistics 23(5), 1779–1801.
• Vershynin, R. (2012), Introduction to the non-asymptotic analysis of random matrices, in Y. C. Eldar and G. Kutyniok, eds, ‘Compressed Sensing: Theory and Applications’, Cambridge University Press, pp. 210–268.
• Ville, J. (1939), Étude Critique de la Notion de Collectif, Gauthier-Villars, Paris.
• Wainwright, M. J. (2017), High-dimensional statistics: a non-asymptotic viewpoint, Cambridge University Press.
• Wald, A. (1945), ‘Sequential tests of statistical hypotheses’, Annals of Mathematical Statistics 16(2), 117–186.