The Annals of Statistics

On prediction of individual sequences

Nicolò Cesa-Bianchi and Gábor Lugosi
Source: Ann. Statist. Volume 27, Number 6 (1999), 1865-1895.

Abstract

Sequential randomized prediction of an arbitrary binary sequence is investigated. No assumption is made on the mechanism of generating the bit sequence. The goal of the predictor is to minimize its relative loss (or regret), that is, to make almost as few mistakes as the best “expert” in a fixed, possibly infinite, set of experts. We point out a surprising connection between this prediction problem and empirical process theory. First, in the special case of static (memoryless) experts, we completely characterize the minimax regret in terms of the maximum of an associated Rademacher process. Then we show general upper and lower bounds on the minimax regret in terms of the geometry of the class of experts. As main examples, we determine the exact order of magnitude of the minimax regret for the class of autoregressive linear predictors and for the class of Markov experts.
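The notion of regret in the abstract can be made concrete with a small sketch. The code below is an illustration only and is not taken from the paper: it uses the standard exponentially weighted forecaster over a finite set of static experts, whereas the paper treats possibly infinite classes and minimax-optimal strategies. The function name `exp_weights_regret`, the learning rate `eta`, and the three example experts are assumptions introduced here.

```python
import math
import random


def exp_weights_regret(bits, experts, eta):
    """Expected regret of an exponentially weighted randomized predictor.

    bits    : list of 0/1 outcomes (the arbitrary binary sequence)
    experts : list of static experts, each a list of predictions in [0, 1]
    eta     : learning rate, eta > 0 (an illustrative tuning parameter)
    """
    cum_loss = [0.0] * len(experts)  # cumulative absolute loss of each expert
    forecaster_loss = 0.0            # expected number of mistakes of the predictor
    for t, y in enumerate(bits):
        weights = [math.exp(-eta * loss) for loss in cum_loss]
        total = sum(weights)
        # Predict 1 with probability p, the weighted average of expert predictions.
        p = sum(w * e[t] for w, e in zip(weights, experts)) / total
        # For a randomized 0/1 guess, the expected mistake at time t is |p - y|.
        forecaster_loss += abs(p - y)
        for i, e in enumerate(experts):
            cum_loss[i] += abs(e[t] - y)
    # Regret: predictor's expected loss minus the loss of the best expert.
    return forecaster_loss - min(cum_loss)


if __name__ == "__main__":
    random.seed(0)
    n = 200
    bits = [random.randint(0, 1) for _ in range(n)]
    # Hypothetical experts: always 0, always 1, and an alternating pattern.
    experts = [[0.0] * n, [1.0] * n, [float(t % 2) for t in range(n)]]
    eta = math.sqrt(8 * math.log(len(experts)) / n)
    print(exp_weights_regret(bits, experts, eta))
```

With `eta = sqrt(8 ln N / n)` for `N` experts and `n` rounds, the well-known bound for this forecaster gives regret at most `sqrt(n ln N / 2)` on every bit sequence, which is the kind of worst-case guarantee the abstract refers to.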

Primary Subjects: 62C20
Secondary Subjects: 60G20
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1017939242
Mathematical Reviews number (MathSciNet): MR1765620
Digital Object Identifier: doi:10.1214/aos/1017939242
Zentralblatt MATH identifier: 0961.62081


2013 © Institute of Mathematical Statistics
