On Pseudo-Games

Alfredo Banos

doi:10.1214/aoms/1177698023

Ann. Math. Statist. 39(6): 1932-1945 (December, 1968). DOI: 10.1214/aoms/1177698023

Abstract

In the definition of a two-person zero-sum game given by von Neumann and Morgenstern it is assumed that both players know the rules of the game (e.g., the game tree, the information sets, and the distributions of the ensuing payoffs for given strategy choices). We use the term pseudo-game to denote the case where at least one player does not have complete information. In this paper we restrict our attention to those pseudo-games in which player I, say, is aware only of his set of pure strategy choices (assumed to contain $m$ elements: $2 \leqq m < \infty$) and not of player II's strategy choices or of the payoff distributions (the latter assumed to have uniformly bounded second moments). Player II is assumed to have complete information.

More precisely, we shall study pseudo-games $G$ that have the following format. Let $A = \{a_1, \cdots, a_m\}$ denote the pure strategy choices of player I. Denote by $A^\ast$ the set of probability distributions $p$ over $A$ (player I's mixed strategy choices). We sometimes write $p$ in the form $(p(1), \cdots, p(m))$, where $\sum^m_{j=1} p(j) = 1$ and $p(j) \geqq 0$, with the interpretation that when player I uses $p$ he will play $a_j$ with probability $p(j)$. Any element of $A^\ast$ that assigns mass 1 to some $a \varepsilon A$ will be denoted simply by $a$. Let $B$ denote the set (not necessarily finite) of pure strategies for player II. Let $\mathscr{B}$ be a fixed $\sigma$-field of subsets of $B$ and denote by $B^\ast$ the set of all probability distributions $q$ over $\mathscr{B}$ (player II's mixed strategies). We assume that $\mathscr{B}$ contains all single point sets of $B$, so that $B^\ast$ contains all finite probability distributions over $B$.
We postulate that we are given for each pair $(a, b)$ in the product space $A \times B$ a distribution $P_{(a, b)}$ on the real line which represents the distribution of the loss incurred by player I (or gain by player II) if $a \varepsilon A$ is the strategy choice of I and $b \varepsilon B$ is the strategy choice of II. Contrary to the usual practice, the payoff for given pure strategy choices is thus allowed to be random. We do this in order that our main results may be proved in greater generality. An example of a pseudo-game with random payoffs is given in Section 2. The distributions $P_{(a,b)}$ are assumed to have uniformly bounded second moments. For each $a \varepsilon A$ and each fixed Borel set $C$, $P_{(a,\cdot)}(C)$ is assumed to be $\mathscr{B}$-measurable.

For each pair $(a, b) \varepsilon A \times B$, let $X_{(a,b)}$ be a random variable having $P_{(a,b)}$ as its distribution. Suppose that players I and II are using strategies $p$ and $q$, respectively. They can determine the payoff of the pseudo-game by first selecting an $a \varepsilon A$ and a $b \varepsilon B$ according to the distributions $p$ and $q$, respectively, and then treating an observed value of $X_{(a,b)}$ as the payoff. For every pair of strategies $(p, q)$ that players I and II may use we define the expected value of the payoff $R(p, q)$ by means of the equation:

\begin{equation*}\tag{1.1} R(p, q) = \sum^m_{j=1} p(j) \int \left\lbrack \int x \, dP_{(a_j,b)}(x) \right\rbrack dq(b).\end{equation*}

We are assuming that player I is only aware of the set $A$, while player II has complete information. However, by assuming instead that player I is also aware of the set $B$ as well as the distributions $P_{(a,b)}$, $(a, b) \varepsilon A \times B$, we can associate with every such pseudo-game $G$ a game with complete information $G'$. Such concepts as "value" and "minimax strategy" do not carry over to pseudo-games.
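As a minimal sketch of (1.1), assume that player II's mixed strategy $q$ has finite support and that the mean loss $\int x \, dP_{(a_j,b)}(x)$ is already known for each pair. The function name `expected_payoff` and the table `mean_loss` are illustrative assumptions, not notation from the paper. Under those assumptions the integral with respect to $q$ reduces to a finite sum:

```python
def expected_payoff(p, q, mean_loss):
    """Return R(p, q) from (1.1) when q has finite support.

    p         -- player I's mixed strategy, p[j] = probability of a_j
    q         -- player II's mixed strategy over finitely many b_k
    mean_loss -- mean_loss[j][k] = mean of P_(a_j, b_k), assumed known here
    """
    if abs(sum(p) - 1.0) > 1e-9 or abs(sum(q) - 1.0) > 1e-9:
        raise ValueError("p and q must each sum to 1")
    if min(p) < 0 or min(q) < 0:
        raise ValueError("probabilities must be non-negative")
    # The inner sum plays the role of the integral with respect to q.
    return sum(
        p[j] * sum(q[k] * mean_loss[j][k] for k in range(len(q)))
        for j in range(len(p))
    )


# Hypothetical 2 x 2 loss table in the style of matching pennies.
mean_loss = [[1.0, -1.0], [-1.0, 1.0]]
r = expected_payoff([0.5, 0.5], [0.3, 0.7], mean_loss)
```

In the pseudo-game itself player I could not evaluate this sum, since he knows neither $B$ nor the distributions $P_{(a,b)}$; the computation belongs to the associated complete-information game $G'$.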
However, by the minimax theorem, since $A$ is assumed finite, every such game $G'$ will have a value $v_G$ and player I will have a minimax strategy $p'$:

\begin{equation*}\tag{1.2} v_G = \sup_{q \varepsilon B^\ast} R(p', q) = \inf_{p \varepsilon A^\ast} \sup_{q \varepsilon B^\ast} R(p, q) = \sup_{q \varepsilon B^\ast} \inf_{p \varepsilon A^\ast} R(p, q).\end{equation*}

Suppose now that players I and II are playing a sequence of identical pseudo-games of the type we have been describing; i.e., they play one game, observe their losses and play the same game again (with possibly different strategy choices), continuing in this manner ad infinitum. We shall refer to the individual games that make up the sequence as the subgames of the sequence. When playing such a sequence of pseudo-games, a strategy for player I would be a rule $P$ that tells him for every $j$, as a function of his past plays (mixed strategy choices) and losses, what mixed strategy to play during the $j$th subgame; a strategy for player II would be a rule $Q$ that tells him for every $j$, as a function of his own and his opponent's past plays and losses, what mixed strategy $q$ to play during the $j$th subgame of the sequence. We are thus allowing player II to know what plays player I has made, but we are not granting I the same favor.

Among the rules $P$ available to player I we define a special class of rules to be called rules constant on intervals. If $x$ is any real number, let $\lbrack x\rbrack$ denote the largest integer that is less than or equal to $x$.
For every $\alpha > 1$, let $\Pi(\alpha) = (I_1(\alpha), I_2(\alpha), \cdots, I_n(\alpha), \cdots)$ denote the partition of the set I of positive integers defined by the equations:

\begin{equation*}\tag{1.3} I_n(\alpha) = \left\{\left(\sum^{n-1}_{k=1} \lbrack k^\alpha\rbrack\right) + 1, \left(\sum^{n-1}_{k=1} \lbrack k^\alpha\rbrack\right) + 2, \cdots, \sum^n_{k=1} \lbrack k^\alpha\rbrack\right\}; \quad n = 1, 2, 3, \cdots.\end{equation*}

For example, $I_1(2) = \{1\}$, $I_2(2) = \{2, 3, 4, 5\}$, $I_3(2) = \{6, 7, \cdots, 14\}$, etc. We shall refer to $I_n(\alpha)$ as the $n$th interval of the partition $\Pi(\alpha)$. Note that the cardinality of $I_n(\alpha)$ is $\lbrack n^\alpha\rbrack$. Let us suppose that player I is using some rule $P$ that assigns, with probability 1, the same mixed strategy to the $i$th subgame as it does to the $j$th subgame whenever $i$ and $j$ belong to the same interval $I_n(\alpha)$, $n = 1, 2, 3, \cdots$. In this case we say that $P$ is constant on intervals. Thus if we say that player I is to play a certain strategy $p$ during the $n$th interval of a partition $\Pi(\alpha)$, we mean that he is to play $p$ during every subgame whose index belongs to $I_n(\alpha)$. The particular strategy that player I uses in the $n$th interval (a random variable depending on plays and losses occurring prior to the $n$th interval) will be denoted by $p_n$.

For $j = 1, 2, 3, \cdots, N, \cdots$ let $X_j$ represent the loss incurred by player I during the $j$th subgame. Note that the sequence $\{X_n\}$ is a discrete stochastic process whose index set is the set I of positive integers and whose law of evolution is determined by the distributions $P_{(a,b)}$ and by the rules $P$ and $Q$ that the players use. The first objective of this paper is to prove:

THEOREM. Suppose players I and II are playing a sequence of identical pseudo-games $G$ satisfying (i) and (ii):
(i) Player I has $m \geqq 2$ pure strategy choices.
(ii) The distributions $P_{(a,b)}$ have uniformly bounded second moments, and for each $a \varepsilon A$ and every Borel set $C$, $P_{(a,\cdot)}(C)$ is $\mathscr{B}$-measurable.
Then there exists a class of rules $\{P\}_m$ for player I such that for all rules $Q$ that player II may use we have:

\begin{equation*} P \varepsilon \{P\}_m \Rightarrow \operatorname{Pr}\left(\limsup_{N\rightarrow\infty} N^{-1} \sum^N_{j=1} X_j \leqq v_G \,\Big|\, P, Q\right) = 1.\end{equation*}

That is, we will show that the player with incomplete information can do as well asymptotically as he could if he had complete information. The members of $\{P\}_m$ will all be constant on intervals. Our second objective will be to seek a strong convergence rate for $N^{-1} \sum^N_{j=1} X_j$. In the course of achieving this goal we will show that a good partition is obtained by setting $\alpha$ equal to $(m + 2)/m$.
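The interval partition $\Pi(\alpha)$ of (1.3) can be sketched in a few lines. This is only an illustration: the helper names `interval_bounds` and `interval_index` are assumptions introduced here, and for non-integer $\alpha$ the floating-point power may be off by one when $k^\alpha$ lies extremely close to an integer.

```python
import math


def interval_bounds(n, alpha):
    """Return (first, last) subgame index of I_n(alpha) as defined in (1.3)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Total number of subgames contained in the first n - 1 intervals.
    before = sum(math.floor(k ** alpha) for k in range(1, n))
    return before + 1, before + math.floor(n ** alpha)


def interval_index(j, alpha):
    """Return the n for which subgame j belongs to I_n(alpha)."""
    if j < 1:
        raise ValueError("j must be a positive integer")
    n, last = 1, math.floor(1 ** alpha)
    # Walk forward one interval at a time until j is covered.
    while j > last:
        n += 1
        last += math.floor(n ** alpha)
    return n


# Reproduces the worked example with alpha = 2.
bounds = [interval_bounds(n, 2) for n in (1, 2, 3)]
```

Here `bounds` is `[(1, 1), (2, 5), (6, 14)]`, matching $I_1(2)$, $I_2(2)$ and $I_3(2)$ above. A rule that is constant on intervals would only need `interval_index(j, alpha)` to decide when player I may change his mixed strategy. With $m = 2$ the suggested choice $\alpha = (m + 2)/m$ equals 2, the exponent used in the example.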