Abstract
This paper is concerned with repetitive sequential play in finite statistical games (decision problems) from the statistician's point of view. We shall assume that the statistician's move at stage $k$ may depend on the previous $k - 1$ moves of Nature as well as on the random variable $\mathbf{X}_k = (X_1, \cdots, X_k)$, where the $X_i$ are independent observations (random variables, possibly vector-valued) from the sequence of statistical games, $k = 1, 2, \cdots$. The play is repetitive in the sense that each component game is identical in structure, with only the moves of the statistician and Nature changing. Furthermore, we impose no assumptions regarding the behavior of the parameter sequence of Nature's moves. The statistician does have the added disadvantage that the finite class of distributions in the component game is not fully specified. However, he does know that the class in question has either (i) all members with discrete distributions or (ii) all members with $q$-dimensional a.e. continuous Lebesgue densities. The same problem with fully known distributions has been treated in [6] for statistical as well as more general games in which Nature's space is finite. For the case where the distributions are completely specified but the history of the past moves is unknown to the statistician, see [20], [22], [27], and [28]. The development in this paper is closely connected to and motivated by these results, particularly those of the preceding paper [27]. If, for fixed $N$, the empirical distribution $p_N$ of Nature's moves were known, then the statistician could use in each of the $N$ component games a strategy Bayes against $p_N$, with risk $\phi(p_N)$.
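To make the Bayes risk $\phi(p_N)$ concrete, the following is a minimal sketch for the fully specified discrete case. It is not taken from the paper: it assumes a finite observation space, and the names `bayes_envelope`, `P` (one row per move of Nature, giving the distribution of the observation) and `L` (loss matrix indexed by Nature's move and the statistician's action) are hypothetical, chosen only for illustration.

```python
import numpy as np

def bayes_envelope(p, P, L):
    """Bayes risk phi(p) = sum_x min_a sum_theta p[theta] * P[theta, x] * L[theta, a].

    p : prior over Nature's moves, shape (m,)
    P : observation distributions, shape (m, n_x) (assumed fully known here)
    L : loss matrix, shape (m, n_a)
    """
    p = np.asarray(p, dtype=float)
    joint = p[:, None] * P      # joint[theta, x] = p[theta] * P[theta, x]
    post_loss = joint.T @ L     # post_loss[x, a] = sum_theta joint[theta, x] * L[theta, a]
    return post_loss.min(axis=1).sum()

# Illustrative two-state game with 0-1 loss (all numbers are assumptions).
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
L = np.array([[0.0, 1.0],
              [1.0, 0.0]])

phi_half = bayes_envelope([0.5, 0.5], P, L)   # approximately 0.25
phi_point = bayes_envelope([1.0, 0.0], P, L)  # 0.0: Nature's move is known
print(phi_half, phi_point)
```

With `p` set to the empirical distribution of Nature's $N$ moves, the returned value plays the role of the benchmark $\phi(p_N)$ against which the average loss is compared.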
In all the papers cited above, the aim was to construct for the statistician, when $p_N$ is unknown and $N$ is not specified, a sequence of randomized decision functions whose $N$th average loss minus $\phi(p_N)$ approaches zero (or has an upper bound approaching zero) in a suitable sense as the number of repetitions of play, $N$, increases. However, in the case of statistical games, all of the above results require that the finite class of distributions be fully specified. In this paper we remove that assumption by estimating the distributions sequentially, based on past moves and observations. Then, in the present play of the component game, the statistician substitutes these estimators into a procedure which is Bayes against the empirical distribution of Nature's previous moves. The resulting sequence of procedures is shown to be "asymptotically good" in the sense that the average loss $W_N$ over the $N$ games minus the Bayes risk $\phi(p_N)$ approaches zero (in an appropriate sense) as $N$, the number of games played, increases. In Section 2 we introduce notation and preliminaries. Section 3 discusses play in repetitive games and defines the proposed sequential procedures $\mathbf{t} = \{\mathbf{t}_k\}$. In Section 4 we prove preliminary results upon which all proofs are founded. Section 5 considers the discrete case, giving convergence theorems (as $N \rightarrow \infty$), uniform in sequences of Nature's moves, for the quantity $W_N - \phi(p_N)$. Theorem 5.1 gives uniform convergence to zero, at rate $O(N^{-\frac{1}{2}})$, of the expected value of $W_N - \phi(p_N)$ for finite discrete classes, each member of which is non-degenerate and satisfies a certain tail probability condition. Under the same conditions, Theorem 5.2 gives uniform convergence to zero in probability for the quantity $N^{\frac{1}{2}} (\log N)^{-1} \{W_N - \phi(p_N)\}$ as $N \rightarrow \infty$.
Uniform convergence of $W_N - \phi(p_N)$ to zero in probability for general non-degenerate finite discrete classes is presented in Theorem 5.3. Section 6 treats the estimation problem for densities needed to form the randomized strategy sequences $\mathbf{t}$ in the continuous case. The results stated are based on a paper by Cacoullos [3] generalizing the univariate results of Parzen [15]. In Section 7, we present results for the continuous case. Theorem 7.1 and its corollary give uniform convergence of $W_N - \phi(p_N)$ to zero in probability and of its expectation to zero, respectively. The finite continuous classes of Theorem 7.1 are very general in the sense that each member is only required to be an a.e. continuous density. Finally, in Section 8 we draw certain conclusions and relate our results to similar results obtained elsewhere. The novelty of the paper rests in the fact that, through the past history of Nature's moves and the observations connected with past play, one can construct a sequential strategy, $\mathbf{t} = \{\mathbf{t}_k\}$, with very little knowledge about the finite class of distributions, which asymptotically approaches "optimal" play. The lack of knowledge about the finite class of distributions distinguishes this work from the related "repetitive type" problems in games and/or decision theory treated in [1], [2], [4], [6], [7], [8], [9], [10], [12], [17], [18], [19], [20], [21], [22], [24], [25], [26], [27], [28], and [29]. For possible applications of this work see Neyman [14], especially his Example 3 and his discussion relating to the work of Blackwell [2].
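The idea of substituting sequential estimates into a rule that is Bayes against the empirical distribution of Nature's previous moves can be illustrated in the discrete case. The sketch below is an assumption-laden toy, not the paper's procedure $\mathbf{t}$: it uses a non-randomized rule, smoothed empirical frequencies (add-one counts) as the estimators, a finite observation space, and hypothetical names (`plug_in_play`, `bayes_envelope`, `P`, `L`). The true distributions `P` are used only to simulate observations and to compute the benchmark $\phi(p_N)$.

```python
import numpy as np

def bayes_envelope(p, P, L):
    """Benchmark phi(p) = sum_x min_a sum_theta p[theta] * P[theta, x] * L[theta, a]."""
    joint = np.asarray(p, dtype=float)[:, None] * P
    return (joint.T @ L).min(axis=1).sum()

def plug_in_play(thetas, xs, L, n_x):
    """Average loss of a plug-in rule that never sees the true distributions.

    At stage k the rule sees x_k and the past pairs (theta_i, x_i), i < k.
    """
    m = L.shape[0]
    state_counts = np.zeros(m)       # counts of Nature's previous moves
    obs_counts = np.ones((m, n_x))   # add-one smoothing: an assumption of this sketch
    total_loss = 0.0
    for theta, x in zip(thetas, xs):
        # Estimated distributions from past (move, observation) pairs.
        P_hat = obs_counts / obs_counts.sum(axis=1, keepdims=True)
        # Unnormalized posterior weights against the empirical distribution of past moves.
        weights = state_counts * P_hat[:, x]
        action = int(np.argmin(weights @ L))   # ties resolved toward the first action
        total_loss += L[theta, action]
        # Nature's move is revealed after play; update both sets of counts.
        state_counts[theta] += 1
        obs_counts[theta, x] += 1
    return total_loss / len(thetas)

# Simulation with illustrative numbers (all assumptions).
rng = np.random.default_rng(0)
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
L = np.array([[0.0, 1.0],
              [1.0, 0.0]])
N = 4000
thetas = np.arange(N) % 2   # an arbitrary (here alternating) sequence of Nature's moves
xs = np.array([rng.choice(2, p=P[t]) for t in thetas])

W_N = plug_in_play(thetas, xs, L, n_x=2)
p_N = np.bincount(thetas, minlength=2) / N
regret = W_N - bayes_envelope(p_N, P, L)
print(f"W_N = {W_N:.4f}, W_N - phi(p_N) = {regret:.4f}")
```

In runs of this kind the printed difference $W_N - \phi(p_N)$ is expected to be small for large $N$, which is the behavior the theorems of Section 5 make precise; the simulation itself proves nothing and only illustrates the quantities involved.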
Citation
J. Van Ryzin. "Repetitive Play in Finite Statistical Games with Unknown Distributions." Ann. Math. Statist. 37 (4) 976 - 994, August, 1966. https://doi.org/10.1214/aoms/1177699377
Information