On Bayes and Nash experimental designs for hypothesis testing problems

In this communication we examine the relationship between maxi-min, Bayes and Nash designs for some hypothesis testing problems. In particular, we consider the problem of sample allocation in the standard analysis of variance framework and show that the maxi-min design is also a Bayes solution with respect to the least favourable prior, as well as a solution to a game theoretic problem, which we refer to as a Nash design. In addition, an extension to tests for order is provided.


Introduction
The main focus of the literature on optimal design is the theory of optimal estimation, usually in the context of the classical linear model. Within this framework an optimal design is a design which minimizes (or maximizes) some functional of the variance matrix of the estimated parameter, cf. [17] and [1]. For example, an A-optimal design is a design which minimizes the sum of the eigenvalues of the variance matrix whereas a D-optimal design minimizes their product, which, of course, equals the determinant of the variance matrix. There are many other optimality criteria, such as E- and MV-optimality. Collectively these are known as alphabetic optimality criteria.
Optimal estimation, however, may not be the objective of every scientific study. In fact, there are many situations in which researchers are interested in designing an experiment tailored for optimal testing of hypotheses rather than optimal estimation. In this context [20] recently proposed a maxi-min design (MM-design) for optimal testing under order restrictions. In this communication we study the relationship between their MM-design and Bayesian designs for unordered as well as ordered hypothesis testing problems. In addition we show that the MM-design can be viewed as a Nash equilibrium when the experimental design problem is viewed as a game. For a broad perspective on the Nash equilibrium see [11].

The maxi-min design for the ANOVA problem
We focus on the one-way layout

Y_ij = μ_i + ε_ij, (2.1)

where Y_ij is the response of the j-th subject in the i-th treatment group, i = 1, …, K and j = 1, …, n_i. We further assume that the errors ε_ij are independent N(0, σ²) random variables (RVs) and, without any loss of generality, we fix σ² = 1. A brief discussion of the situation where σ² is unknown is deferred to Remark 2.2 near the end of this section. Consider testing the standard ANOVA hypotheses

H_0 : μ ∈ M_0 versus H_1 : μ ∉ M_0, (2.2)

where M_0 = {μ ∈ R^K : μ_1 = μ_2 = ⋯ = μ_K}, i.e., under the null all means are equal, whereas under the alternative at least one pair of means differs, i.e., μ_i ≠ μ_j for some i ≠ j. The standard test for (2.2) is based on the likelihood ratio test (LRT) statistic

T_N = Σ_{i=1}^K n_i (Ȳ_i − Ȳ)², (2.3)

where Ȳ_i = n_i^{−1} Σ_{j=1}^{n_i} Y_ij is the average in the i-th group when n_i > 0 and 0 otherwise, N = Σ_{i=1}^K n_i, and Ȳ = N^{−1} Σ_{i=1}^K n_i Ȳ_i is the grand mean. It is well known that (2.3) follows a non-central chi-square distribution, denoted χ²_ν((N/2)λ), with ν = Σ_{i=1}^K I_{n_i>0} − 1 degrees of freedom and non-centrality parameter (NCP) (N/2)λ. Further note that

λ = λ(μ; ξ) = Σ_{i=1}^K ξ_i (μ_i − μ̄_ξ)², where μ̄_ξ = Σ_{i=1}^K ξ_i μ_i and ξ_i = n_i/N. (2.4)

Clearly under the null λ = 0 so the null distribution is the usual central chi-square. As a shorthand, and for convenience, we will often refer to λ as the NCP.
The power of the LRT (2.3), as a function of (μ, ξ), is

π(μ; ξ) = P_{μ;ξ}(T_N ≥ c_{ν,α}), (2.5)

where the critical value c_{ν,α}, which solves sup_{μ∈M_0} P_{μ;ξ}(T_N ≥ c_{ν,α}) = α, is design dependent since ν = ν(ξ). In this communication we focus on what are known in the literature (cf. [1]) as approximate designs. An approximate design is a vector of weights ξ = (ξ_1, ξ_2, …, ξ_K)^T ∈ Ξ, i.e., ξ_i ≥ 0 and Σ_{i=1}^K ξ_i = 1, in which each ξ_i represents the proportion of observations assigned to the i-th treatment. Naturally the design space Ξ is the unit simplex. It is obvious that once an optimal approximate design is found, an exact design, i.e., a vector n = (n_1, n_2, …, n_K)^T such that Σ_{i=1}^K n_i = N, where N is fixed, can be obtained by efficient rounding, see [12].
A design ξ which maximizes the power function (2.5) at a given value of μ is said to be locally optimal. A pair of treatments, i and j say, is maximally separated if |μ_i − μ_j| = max{|μ_s − μ_t| : 1 ≤ s, t ≤ K}. Theorem 1 in [20] states that if (i, j) is a maximally separated pair then the optimal design is ξ_ij = (e_i + e_j)/2, where e_l is the l-th standard basis vector of R^K. In other words, given a maximally separated pair (i, j), the power of (2.3) for testing (2.2) is maximized when half of the observations are assigned to group i and the other half to group j. We refer to such a design as a two-point design. Incidentally, this design maximizes the NCP while simultaneously minimizing the degrees of freedom of the test statistic. Unfortunately, the value of μ or the identity of the maximally separated pair are rarely known in advance. Moreover, as illustrated below, a two-point design which is optimal for some vectors of means may be grossly deficient for others.

Example 2.1. Let K = 4 and μ_1^T = (−1, 1, 0, 0). By Theorem 1 of [20] the design ξ_1 = (1/2, 1/2, 0, 0) maximizes the power when μ = μ_1. If, however, the true vector of population means is μ_2^T = (0, 0, −1, 1), then the power maximizing design is ξ_2 = (0, 0, 1/2, 1/2). Note that π(μ_1; ξ_1) = π(μ_2; ξ_2) > α for all positive N and, as expected, in both cases the power increases to unity when N → ∞. However, π(μ_1; ξ_2) = π(μ_2; ξ_1) = α regardless of the total sample size N. This means that design ξ_2 has no power to detect a departure from the null if μ = μ_1 and likewise the design ξ_1 has no power to detect a departure from the null if μ = μ_2. We conclude that locally optimal designs may perform poorly for values of μ for which they were not designed.
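The deficiency illustrated in Example 2.1 is easy to verify numerically. The following sketch is ours, not the paper's; it uses scipy's convention for the noncentrality parameter of the chi-square distribution, namely Nλ rather than (N/2)λ:

```python
import numpy as np
from scipy.stats import chi2, ncx2

def power(mu, xi, N, alpha=0.05):
    """Power of the chi-square test under an approximate design xi (sigma^2 = 1)."""
    mu, xi = np.asarray(mu, float), np.asarray(xi, float)
    nu = int((xi > 0).sum()) - 1                 # degrees of freedom, nu(xi)
    mubar = xi @ mu                              # design-weighted grand mean
    lam = float(xi @ (mu - mubar) ** 2)          # lambda(mu; xi)
    crit = chi2.ppf(1 - alpha, nu)
    nc = N * lam                                 # scipy's noncentrality convention
    return float(chi2.sf(crit, nu)) if nc == 0 else float(ncx2.sf(crit, nu, nc))

mu1, mu2 = [-1, 1, 0, 0], [0, 0, -1, 1]
xi1, xi2 = [0.5, 0.5, 0, 0], [0, 0, 0.5, 0.5]
print(power(mu1, xi1, N=20))   # near one: the design matches the separated pair
print(power(mu1, xi2, N=20))   # exactly alpha = 0.05: no power at a mismatched design
```

Running it reproduces the two extremes of the example: high power for the matched design, the nominal level α for the mismatched one.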
It is clear from Example 2.1 that there does not exist a design ξ which is globally optimal for all μ in the alternative. Moreover, locally optimal designs may perform poorly globally. Thus, it is reasonable to seek a design which is guaranteed to perform well for all values of μ; it is obvious that such a design will trade local optimality for fair overall performance. In other words we seek a robust design which guarantees a minimal power or, equivalently, a design which maximizes the power in the worst-case scenario.
Formally, we define ξ_MM and μ_LFC as the pair of values satisfying

π(μ_LFC; ξ_MM) = max_{ξ∈Ξ} min_{μ∈M_δ} π(μ; ξ). (2.6)

The maximization in (2.6) is over all designs ξ ∈ Ξ and the minimization is over the set

M_δ = {μ ∈ R^K : max_{1≤s,t≤K} |μ_s − μ_t| ≥ δ} (2.7)

for some δ > 0. The restriction that μ ∈ M_δ implies that the distance within the maximally separated pair is at least δ. Heuristically, one may view δ as measuring the distance from the null; for a precise explanation see Remark 2.1 below. Further note that the restriction that μ ∈ M_δ is required for the existence of min_{μ∉M_0} π(μ; ξ). For more details, as well as for the form of M_δ in other testing problems, the reader is referred to [20]. The quantities ξ_MM and μ_LFC are referred to as the maxi-min design (MM-design) and the least favourable configuration (LFC), respectively. Theorem 2 in [20] states that:

Theorem 2.1. The balanced design is the MM-design in the standard ANOVA setting.
By Theorem 2.1 the MM-design is the balanced design, i.e., ξ_MM = (1/K, …, 1/K). Thus the MM-design allocates an equal proportion of observations to each of the treatment groups. It is well known that a balanced design is an A-optimal design for pairwise multiple comparisons and also a maxi-min design for estimating treatment differences under squared error loss ([26]). Further note that ξ_MM is independent of δ. In the proof of Theorem 2.1 it is shown that any permutation of the vector (−δ/2, δ/2, 0, …, 0) is an LFC with respect to the balanced design; hence the LFCs are functions of δ.
Remark 2.1 below provides a deeper understanding of the role of δ as a distance measure.
Remark 2.1.
The NCPs associated with the MM-design and the means μ_1 and μ_2 can be computed directly from (2.4). A little algebra shows that λ(μ_1; ξ_MM) > λ(μ_2; ξ_MM) when 0 < ε < δ(2 − √3). Since the power function is monotonically increasing in the NCP when the degrees of freedom are fixed it also follows that

π(μ_1; ξ_MM) > π(μ_2; ξ_MM). (2.8)

Equation (2.8) shows that it is not necessarily true that for any given ξ the power at μ ∈ M_{δ_1} is smaller than at μ ∈ M_{δ_2} even though δ_1 < δ_2. However, it is not difficult to see that

min_{μ∈M_δ} π(μ; ξ_MM) = π(μ_LFC(δ); ξ_MM), (2.9)

where μ_LFC(δ) is any permutation of (−δ/2, δ/2, 0, …, 0)^T. We conclude that π(μ_LFC(δ); ξ_MM) is monotonically increasing in δ. Armed with this perspective we can interpret δ as a "distance" from the null when the design is balanced and μ = μ_LFC(δ).
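The symmetry underlying this interpretation can be checked directly. In the illustration below (our code and notation), every LFC permutation yields the same NCP under the balanced design, namely δ²/(2K), which is increasing in δ:

```python
# Illustration (our code): under the balanced design every permutation of the
# LFC (-d/2, d/2, 0, ..., 0) has the same NCP, namely d^2/(2K).
import numpy as np
from itertools import permutations

def ncp(mu, xi):
    """lambda(mu; xi) = sum_i xi_i * (mu_i - design-weighted mean)^2."""
    mu, xi = np.asarray(mu, float), np.asarray(xi, float)
    mubar = xi @ mu
    return float(xi @ (mu - mubar) ** 2)

K, d = 4, 2.0
xi_mm = np.full(K, 1 / K)                      # the balanced MM-design
vals = {round(ncp(p, xi_mm), 12) for p in permutations((-d / 2, d / 2, 0.0, 0.0))}
print(vals)   # a single value: d^2/(2K) = 0.5
```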

Remark 2.2.
Finally, we note that we have assumed that (2.1) holds and σ² is known, and for convenience we set its value to unity. If, however, σ² is unknown then instead of using the statistic T_N given in (2.3) the hypotheses (2.2) will be tested using the statistic

S_N = (T_N/(K − 1)) / (Σ_{i=1}^K Σ_{j=1}^{n_i} (Y_ij − Ȳ_i)²/(N − K)), (2.10)

whose denominator is an unbiased estimator of σ². The statistic S_N follows an F_{K−1,N−K}(φ) distribution where K − 1 and N − K are the degrees of freedom associated with the numerator and denominator respectively and φ is the noncentrality parameter. It is well known that φ = (N/2)(λ/σ²) where λ is as in (2.4). It immediately follows that the design which maximizes φ for any fixed but possibly unknown value of σ² will also maximize λ and vice versa. Hence our results apply in full also to situations where σ² is unknown and the proposed solution will also maximize the power of the F-test.
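The last point can be illustrated numerically with hypothetical values. In the sketch below (ours, not the paper's) we use scipy's noncentral F, whose noncentrality we take as Nλ/σ²; power is monotone in λ for any fixed σ², so the λ-maximizing design also maximizes F-test power:

```python
# Hypothetical values; scipy's noncentral F takes nc = N * lambda / sigma^2.
from scipy.stats import f, ncf

def f_power(lam, sigma2, K, N, alpha=0.05):
    """Power of the F-test S_N for given lambda and sigma^2."""
    dfn, dfd = K - 1, N - K
    crit = f.ppf(1 - alpha, dfn, dfd)
    return float(ncf.sf(crit, dfn, dfd, N * lam / sigma2))

# Monotone in lambda for any fixed sigma^2:
print(f_power(0.25, 2.0, K=4, N=40) < f_power(0.50, 2.0, K=4, N=40))  # True
```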
Next, in Sections 3 and 4 we examine the relationship of MM-designs with: (i) Bayes-designs; and (ii) a game-theoretic formulation of the design problem whose solutions we refer to as Nash-designs.

Bayes designs
In this section we discuss the connection between MM-designs and Bayes designs in the context of the ANOVA testing problem (2.2). As noted earlier, when the design criterion depends on an unknown parameter, e.g., μ in (2.5), it is generally impossible to find a design that is globally optimal, i.e., a design which is optimal for all values of μ. One way of addressing this issue is by using MM-designs, as discussed in Section 2, while another is to adopt a Bayesian approach. In the fully Bayesian framework a design which optimizes a functional of the posterior distribution is chosen, cf. [6]. Our approach, which is not fully Bayesian, is rooted in decision theory and referred to in the literature as the pseudo-Bayesian approach ([25]). It is also known as average optimal design, e.g., [10]. The idea is simple: choose the design which optimizes the expected value of a design criterion with respect to some prior distribution on the unknown parameter. Formally, we define the Bayes design as

ξ_Q = arg max_{ξ∈Ξ} ∫_M Ψ(μ; ξ) dQ(μ), (3.1)

where μ ∈ M ⊂ R^K, Ψ(μ; ξ) is the design criterion, and Q is a given prior for μ such that ∫_M dQ(μ) = 1. For more examples, in the spirit of (3.1), see [16], [25], [21], and [22]. As shown by [20], maximizing the power of the unconstrained LRT is equivalent to maximizing the NCP. However, it is not clear that the Bayes design with respect to the NCP will coincide with the Bayes design with respect to the power. Therefore we shall explore both. Let ξ^λ_Q and ξ^π_Q denote the Bayes Q-optimal designs when Ψ(μ; ξ) in (3.1) is replaced by λ(μ; ξ) and π(μ; ξ) respectively, i.e.,

ξ^λ_Q = arg max_{ξ∈Ξ} Λ(Q; ξ) = arg max_{ξ∈Ξ} ∫ λ(μ; ξ) dQ(μ) (3.2)

and

ξ^π_Q = arg max_{ξ∈Ξ} Π(Q; ξ) = arg max_{ξ∈Ξ} ∫ π(μ; ξ) dQ(μ). (3.3)

We start with the most natural prior: Theorem 3.1 shows that the MM-design coincides with the NCP and power based Bayes designs when the prior Q is the distribution function of a multivariate normal exchangeable random vector. This family of priors, which, when γ = 0, includes the independent components prior, is often used in applications.
As shown in the proof of Theorem 3.1 the value of μ 0 is inconsequential so henceforth we set it equal to 0.
The assumption of normality can be relaxed: if Q is any distribution of an exchangeable random vector with a finite second moment, then ξ^λ_Q = ξ^π_Q = ξ_MM for every prior Q in this family. In addition, Theorem 3.3 identifies a least favourable prior (LFP), supported on the set (2.7), for which the Bayes design and the MM-design coincide.
Thus the MM-design is also the NCP and power based Bayes design with respect to the LFP. Note that the LFP in Theorem 3.3 is the prior which assigns the same probability to each one of the least favourable configurations as discussed after the statement of Theorem 2.1.
As requested by a referee we briefly explore Bayes designs for two nonexchangeable priors; Example 3.1 deals with unequal variances and Example 3.2 with unequal means. In both cases we consider K = 3 treatment groups. We focus on NCP-based designs, i.e., ξ λ Q defined in (3.2), as these are much easier to calculate.
Under such a prior the average NCP Λ(Q; ξ) has an explicit form which we can maximize with respect to ξ. Designs for some specific values of the mean parameter η are reported in Table 1. Note that the optimal Bayes design reduces to a balanced two-point design when the maximally separated pair is unique. When η = (−1, 1, 1) there are two maximally separated pairs, (η_1, η_2) and (η_1, η_3), and the optimal Bayes design is not unique.
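A sketch of this type of computation follows. It is our own reconstruction, not the paper's code: the values of η and β are illustrative, and the closed form for Λ(Q; ξ) under a N(η, βI) prior follows from E(μ^T Aμ) = η^T Aη + trace(A·βI):

```python
# Average NCP under a N(eta, beta * I) prior (our reconstruction):
#   Lambda(Q; xi) = sum_i xi_i eta_i^2 - (sum_i xi_i eta_i)^2 + beta * (1 - sum_i xi_i^2),
# maximized over the unit simplex.
import numpy as np
from scipy.optimize import minimize

def avg_ncp(xi, eta, beta):
    xi, eta = np.asarray(xi, float), np.asarray(eta, float)
    return float(xi @ eta**2 - (xi @ eta) ** 2 + beta * (1 - xi @ xi))

def bayes_design(eta, beta):
    K = len(eta)
    res = minimize(lambda x: -avg_ncp(x, eta, beta), np.full(K, 1 / K),
                   bounds=[(0, 1)] * K,
                   constraints=({'type': 'eq', 'fun': lambda x: x.sum() - 1},),
                   method='SLSQP')
    return res.x

xi_star = bayes_design(eta=(-1.0, 1.0, 0.0), beta=0.1)
print(np.round(xi_star, 3))   # concentrates on the maximally separated pair (1, 2)
```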

Nash designs
As noted by [11] game theoretic ideas and, specifically, the notion of the Nash equilibrium are widely and insightfully used in numerous disciplines. Applications of game theory in statistics have a long, but far from voluminous, history. One exception is statistical decision theory, which is strongly rooted in the theory of zero-sum games, cf. [23], or [3] for a more modern treatment. An additional important reference is the classic book by [4] which discusses a wide array of statistical problems from a game theoretic perspective. We also mention the paper by [18] which touches on the relationship between game theory and Bayesian statistics. Unfortunately, a modern and vigorous application of the ideas and tools developed in game theory to current statistical problems is lacking. We believe that game theoretic ideas are well suited to address statistical design problems. This section provides the first steps in that direction. In particular, we explore the game theoretic formulation of the testing problem posed in (2.2). Later on we also consider the hypothesis testing problem (4.5), comparing a control with multiple treatments.

Standard ANOVA: The game theoretic view
The main result of this Section is Theorem 4.3 which is the game-theoretic equivalent of Theorem 2.1. The new formulation and proof are both mathematically simpler and more illuminating than the original ones, and moreover, may serve as a method for reasoning about complex design problems and as a blueprint for establishing future, even more demanding, results. For completeness, as well as the integrity of the exposition, we will first introduce the relevant terminology and notation with which we establish a sequence of results leading to Theorem 4.3.
We begin with some game theoretic terminology ([14]). Consider a two person game in which Player I is the statistician and Player II is nature. Player I chooses a design ξ while Player II chooses a value μ. In the parlance of game theory the choices available to the players are called strategies. Initially we shall assume that both Players I and II have at their disposal only a finite set of distinct choices, called pure strategies, denoted by X and M, respectively. We start by setting X = {ξ_ij = (e_i + e_j)/2 : 1 ≤ i < j ≤ K} and M = {μ_ij : 1 ≤ i < j ≤ K}, where μ_ij is the permutation of (−δ/2, δ/2, 0, …, 0)^T whose i-th element is −δ/2 and whose j-th element is δ/2. The set X is the collection of all balanced two-point designs as discussed in Section 2 whereas the set M is the set of all permutations of the vector (−δ/2, δ/2, 0, …, 0), where we restrict the first non-zero element to be negative and the second to be positive. Note that M is a subset of the LFCs on M_δ, i.e., if μ ∈ M then −μ ∉ M. We need not consider all LFCs since λ(μ; ξ) = λ(−μ; ξ). See Example 4.1 for further clarification.
Let I_0 denote the set of pairs of indices {(i, j), (k, l)} with no matches, i.e., i ∉ {k, l} and j ∉ {k, l}; let I_1 be the set of pairs with exactly one match; and let I_2 be the set of pairs that are fully matched, i.e., (i, j) = (k, l). When Player I chooses ξ_kl and Player II chooses μ_ij we say that (μ_ij, ξ_kl) is played. Suppose now that when (μ_ij, ξ_kl) is played the payoff to Player I is the value of the NCP, i.e.,

λ(μ_ij; ξ_kl) = δ²/4 if {(i, j), (k, l)} ∈ I_2, δ²/16 if {(i, j), (k, l)} ∈ I_1, and 0 if {(i, j), (k, l)} ∈ I_0, (4.1)

and to Player II it is −λ(μ_ij; ξ_kl). Formally, we denote this game by G_A(λ, X, M), where the subscript A indicates that we are playing the ANOVA game, λ is the payoff function and X, M describe the strategies available to Players I and II, respectively. Since the payoffs to Players I and II sum to zero the game is referred to as a zero-sum game. Similarly, G_A(π, X, M) is the corresponding game associated with the power function. Since the power function π(μ; ξ) depends on (μ, ξ) only through the NCP we can write π(μ; ξ) ≡ π(N, λ(μ; ξ)) (4.2), and following (4.1) the payoffs of G_A(π, X, M) are obtained by evaluating π(N, ·) at the values given there. Note that the games G_A(λ, X, M) and G_A(π, X, M) are motivated by our earlier observations on maximally separated means and their relationship with two-point designs.
Note that the payoff matrix contains zeros when K > 3, whereas when K ≤ 3 it contains no zeros. For example, when K = 3 the payoff matrix, or game matrix, is given in Table 2; we denote this game by G_3.

Table 2
The NCP-based game matrix for ANOVA with K = 3.

        μ_12     μ_13     μ_23
ξ_12    δ²/4    δ²/16    δ²/16
ξ_13   δ²/16     δ²/4    δ²/16
ξ_23   δ²/16    δ²/16     δ²/4
The objective of Player I is to maximize his gain, or payoff. Similarly, Player II would like to minimize his loss. Observe that

max_{ξ∈X} min_{μ∈M} λ(μ; ξ) < min_{μ∈M} max_{ξ∈X} λ(μ; ξ),

i.e., the max-min and min-max values do not agree. This means that the game G_A(λ, X, M) does not admit an equilibrium in pure strategies. Hence strategies μ*_ij and ξ*_kl satisfying λ(μ*_ij; ξ_kl) ≤ λ(μ*_ij; ξ*_kl) ≤ λ(μ_ij; ξ*_kl) for all μ_ij ∈ M and ξ_kl ∈ X do not exist; see the proof of Theorem 4.1 for more details. A comprehensive discussion on the existence of equilibria can be found in Chapters 4 and 5 of [14]. Suppose now that Player I may randomly select a pure strategy from X using a probability law p = (p_12, …, p_{K−1,K}) where p_ij is the probability of selecting the pure strategy ξ_ij. Similarly, suppose that Player II may independently select a pure strategy from M using a probability distribution q. It is further assumed that Player I may choose p ∈ P, where P = P(X) is the set of all discrete distributions supported on X, and Player II may choose q ∈ Q, where Q = Q(M) is the set of all discrete distributions supported on M.
The expected payoff when Players I and II choose the so-called mixed strategies p and q is

Γ(p, Λ, q) = p^T Λ q, (4.3)

where Λ is the game matrix. The game in mixed strategies will be denoted by G_A(λ, P(X), Q(M)). It is well known that finite zero-sum games have an equilibrium in mixed strategies. Thus, there are strategies p_0 and q_0 for Players I and II such that Γ(p_0, Λ, q_0) = max_p min_q Γ(p, Λ, q); this quantity is called the value of the game. Further note that an optimal mixed strategy for Player I is a probability distribution p which guarantees (i.e., maximizes) the minimal gain, whereas an optimal mixed strategy for Player II is a probability distribution q which guarantees (i.e., minimizes) the maximal loss. The strategies p_0 and q_0 are said to be a Nash equilibrium if deviating from them is detrimental, i.e., if Γ(p_0, Λ, q_0) ≤ Γ(p_0, Λ, q) and Γ(p_0, Λ, q_0) ≥ Γ(p, Λ, q_0) for all p ∈ P and q ∈ Q. The max-min and Nash equilibrium coincide in zero-sum games. The value of the game G_A(π, P(X), Q(M)) and its optimal mixed strategies are similarly defined by simply replacing λ by π in (4.3).

Example 4.1. Consider augmenting M with the strategy μ′_23 = (0, δ/2, −δ/2); the game matrix of the augmented game is given in Table 3 below. It is easy to see that the strategies μ_23 and μ′_23 satisfy μ′_23 = −μ_23 and are associated with the same payoffs. Thus they are equivalent from a game-theoretic perspective. It is not hard to show that the pair p = (1/3, 1/3, 1/3) together with any q which assigns probability 1/3 to each of μ_12 and μ_13 and splits the remaining 1/3 between the equivalent strategies μ_23 and μ′_23 is a Nash equilibrium of the augmented game.

Example 4.2. Consider augmenting M with the strategy μ_123 = (−δ/2, δ/2, δ/2); the game matrix of the augmented game is given in Table 4 below.
The right hand side of (4.4) is minimized when q 123 = 0. Thus, the strategy μ 123 can be eliminated and the game reduces to G 3 .
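These finite games can be checked mechanically. In the sketch below (our code and notation) we rebuild the K = 3 game matrix of Table 2 from the NCP and solve the zero-sum game by linear programming, recovering the uniform equilibrium and the value δ²/8:

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def ncp(mu, xi):
    mu, xi = np.asarray(mu, float), np.asarray(xi, float)
    m = xi @ mu
    return float(xi @ (mu - m) ** 2)

def game_matrix(K, d):
    """Rows: Player I's two-point designs; columns: Player II's LFC permutations."""
    pairs = list(combinations(range(K), 2))
    A = np.zeros((len(pairs), len(pairs)))
    for r, (k, l) in enumerate(pairs):
        xi = np.zeros(K); xi[k] = xi[l] = 0.5
        for c, (i, j) in enumerate(pairs):
            mu = np.zeros(K); mu[i], mu[j] = -d / 2, d / 2
            A[r, c] = ncp(mu, xi)
    return A

def solve_game(A):
    """Optimal p and value via the standard LP for zero-sum matrix games."""
    n, m = A.shape
    c = np.r_[np.zeros(n), -1.0]                       # maximize v
    A_ub = np.c_[-A.T, np.ones(m)]                     # enforce A^T p >= v
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m),
                  A_eq=np.r_[np.ones(n), 0.0].reshape(1, -1), b_eq=[1.0],
                  bounds=[(0, 1)] * n + [(None, None)])
    return res.x[:n], res.x[-1]

d = 1.0
p, v = solve_game(game_matrix(3, d))
print(np.round(p, 6), v)   # uniform (1/3, 1/3, 1/3); value d^2/8
```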
Examples 4.1 and 4.2 show that adding to M any strategy ν which satisfies max_{i,j} |ν_i − ν_j| = δ, i.e., any ν which lies on the boundary of the set M_δ, results in a game that is equivalent to G_3. Example 4.3 shows that strategies in the interior of M_δ are dominated and therefore will not be chosen by Player II.

Example 4.4. Consider removing the strategy μ_23 from M; the game matrix of the resulting game is given in Table 6. It is easily observed that ξ_23 is dominated by a balanced mixture of the strategies ξ_12 and ξ_13. Thus, by Theorem 5.20 in [14], ξ_23 can be eliminated. The Nash equilibrium of the resulting game (whose game matrix consists of the first two rows of Table 6) is (p, q) = ((1/2, 1/2), (1/2, 1/2)) and the value of the game is 5δ²/32, which is larger than the value of the game G_3 (δ²/8).
Example 4.4 shows that removing any strategy from M results in a game with a higher value than the original game. To summarize, the preceding examples establish Theorem 4.2: the set of strategies M is complete with respect to M_δ when Player I's strategies are X. By this we mean that Player II cannot find any strategy in M_δ which will reduce the value of the game and, in addition, the omission of any strategy from M will increase the value of the game. Now, after fixing the strategies of Player II, we examine the strategies of Player I. First note that in expectation, or under repeated play, Player I allocates 1/K of the experimental subjects to each treatment group. However, in any specific game Player I allocates 1/2 of the observations to group i and 1/2 to group j for some pair (i, j). Thus, when both players play their uniform equilibrium strategies, the payoff is 0 with probability (K − 2)(K − 3)/(K(K − 1)), which is positive whenever K > 3. Therefore, we next consider the situation where Player I can choose any ξ ∈ Ξ. The strategy space for Player I is no longer finite and games of this type are called infinite games. As noted by a referee the action space of Player I, the set Ξ, is convex and therefore Player I need not consider any mixed strategies. We shall denote such games by G_A(·, Ξ, Q(M)). We have:

Theorem 4.3. The pair (ξ_MM, q_0), where q_0 is the uniform distribution on M, is the Nash equilibrium of the games G_A(λ, Ξ, Q(M)) and G_A(π, Ξ, Q(M)).

It is also clear that Theorem 4.3 applies to the infinite games G_A(λ, Ξ, M_δ) and G_A(π, Ξ, M_δ) where the optimal strategy for Player II is to randomly select strategies from M using the probability mass function q_0.
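A numerical check of Theorem 4.3 in our own notation: optimizing Player I's expected payoff against the uniform mixture over M, over the whole simplex Ξ, returns the balanced design, with payoff δ²/(2K):

```python
import numpy as np
from itertools import combinations
from scipy.optimize import minimize

def expected_payoff(xi, d):
    """Mean NCP over the LFC permutations mu_ij, for a design xi in the simplex."""
    xi = np.asarray(xi, float)
    K = len(xi)
    vals = []
    for i, j in combinations(range(K), 2):
        mu = np.zeros(K); mu[i], mu[j] = -d / 2, d / 2
        m = xi @ mu
        vals.append(float(xi @ (mu - m) ** 2))
    return float(np.mean(vals))

K, d = 4, 1.0
res = minimize(lambda x: -expected_payoff(x, d), np.r_[0.7, 0.1, 0.1, 0.1],
               bounds=[(0, 1)] * K,
               constraints=({'type': 'eq', 'fun': lambda x: x.sum() - 1},),
               method='SLSQP')
print(np.round(res.x, 3))   # approximately (1/4, 1/4, 1/4, 1/4)
print(round(-res.fun, 6))   # approximately d^2/(2K) = 0.125
```

The deliberately unbalanced starting point shows that the optimizer is pulled back to the balanced design.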

Comparison of multiple treatments with a control: The game theoretic view
There are many experiments in which an inherent ordering among the experimental groups exists. For example, in dose-response studies a large response is often expected with a high dose. This ordering is referred to as the simple order. When a control is compared to multiple treatments, each of which is expected to outperform it, the resulting order is known as a tree order. There are many other orders and a rich literature describing efficient statistical analysis under order restriction. For more details see the books by [2], and [19].
In this section, we find the Nash-design when testing under a tree order. Formally, we would like to compare a control group to K − 1 treatment groups. The hypothesis of interest in this case is

H_0 : μ ∈ M_0 versus H_1 : μ ∈ M_T \ M_0, (4.5)

where M_0 was defined immediately after (2.2) and M_T = {μ ∈ R^K : μ_1 ≤ μ_i, i = 2, …, K} is the tree order cone. In this context it is natural to define

M_δ = {μ ∈ M_T : max_{2≤i≤K} (μ_i − μ_1) ≥ δ}.

Note that this M_δ is different from the one considered in (2.7). As earlier, δ can be viewed, in the sense described in Remark 2.1, as a distance from the null. Since this testing problem is location invariant we may, without any loss of generality, assume that μ_1 = 0. The pure strategy spaces are now X = {ξ_1i = (e_1 + e_i)/2 : i = 2, …, K}, i.e., the two-point strategies comparing the control to treatment i, and M = {μ_i = δe_i : i = 2, …, K}. The payoff to Player I is assumed to be the NCP, λ, or the power, π, of the test (2.3). As noted by a referee the test (2.3) is not the classic constrained test for the tree order [19]. However, as shown by [20], designs for constrained and unconstrained testing problems coincide asymptotically. Therefore in this manuscript we shall consider only payoffs derived from the unconstrained test. Let p and q be the probability laws by which Players I and II choose their strategies from X and M respectively. It can be shown, by repeating the arguments made in the proof of Theorem 4.1, that when K ≥ 2 the Nash equilibrium for the games G_T(λ, P(X), Q(M)) and G_T(π, P(X), Q(M)), where the subscript T on G_T(·, ·, ·) indicates that these games are associated with the tree order, is p = q = (K − 1)^{−1} 1. The value of the game G_T(λ, P(X), Q(M)) is δ²/(4(K − 1)). We now extend the strategy space of Player II by adding the strategy μ_central = (0, δ/(K − 1), …, δ/(K − 1)) to M. Note that μ_central is the equal-weights convex combination of the strategies in M. Further assume that Player I chooses strategies from X using the probability law p = (K − 1)^{−1} 1 whereas Player II chooses from the strategies (μ_2, …, μ_K, μ_central) with probability law q = (q_2, …, q_K, q_c). The game matrix for this augmented game is presented in Table 7.
Table 7 Game matrix for the augmented tree order.
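The entries of Table 7 can be computed directly. In the sketch below (our notation, with μ_i = δe_i and μ_central as above), the first K − 1 columns form (δ²/4)I and the μ_central column is constant at δ²/(4(K − 1)²):

```python
import numpy as np

def ncp(mu, xi):
    m = xi @ mu
    return float(xi @ (mu - m) ** 2)

def augmented_tree_game(K, delta):
    """Rows: designs xi_1i; columns: mu_2, ..., mu_K and mu_central."""
    cols = [delta * np.eye(K)[i] for i in range(1, K)]
    cols.append(np.r_[0.0, np.full(K - 1, delta / (K - 1))])   # mu_central
    A = np.zeros((K - 1, len(cols)))
    for r in range(K - 1):
        xi = np.zeros(K); xi[0] = xi[r + 1] = 0.5              # control and treatment r+2
        for c, mu in enumerate(cols):
            A[r, c] = ncp(mu, xi)
    return A

K, delta = 4, 1.0
A = augmented_tree_game(K, delta)
print(A)
# Before augmentation the matrix is (delta^2/4) I, so uniform mixing over the
# K-1 strategies gives the value delta^2 / (4 (K - 1)).
```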

Summary and discussion
In this article we considered optimal designs for two simple, but widely used, hypothesis testing problems, i.e., (2.2) and (4.5). Initially MM-designs for the one-way ANOVA problem were discussed. Then the corresponding Bayes and Nash designs were introduced and explored. A pseudo-Bayesian approach, in which an integrated version of the design criterion is optimized, was proposed in Section 3. It was shown that the Bayes design coincides with the maxi-min design for exchangeable priors with a finite second moment, which is a large class of reasonable priors. We also showed that if the prior is not exchangeable then a balanced design is not obtained. The Bayesian approach can be extended to deal with many other testing problems. For example, consider the experiment in which multiple treatments are compared to a control, as discussed in Section 4.2. As noted, this type of comparison is referred to as the simple tree order and the associated hypothesis testing problem is defined in (4.5). In this setting it is reasonable to consider a prior Q for μ which is equally supported on {e_j − δe_1} for j = 2, …, K. A simple calculation shows that the average NCP Λ(Q; ξ) has an explicit form (5.1), and maximizing (5.1) with respect to ξ yields the Bayes optimal design. Further observe that its control allocation satisfies ξ_1 → 1/2 as δ → ∞. Thus, there is a sequence of priors, and Bayes optimal designs, which converge to the MM-design for the tree order. Such Bayes designs are nothing but Q-weighted locally optimal designs; for obvious reasons we shall refer to them as weighted designs. It is not hard to see that if q_ij = P_Q(E_ij), where E_ij is the event that the means μ_i and μ_j are maximally separated, then any exchangeable prior would lead in the ANOVA setting to ξ^π_Q = Σ_{i<j} q_ij ξ_ij,

S. P. Singh and O. Davidov
with a similar result for ξ^λ_Q. Thus the weighted design coincides with the Bayes design under exchangeable priors. A similar equivalence can be shown to hold also under the tree order and in even more general settings. Finally, as noted by a referee, under a large class of priors Bayes factors in ANOVA settings are either exactly or asymptotically functions of the F-test statistic ([8] and [5]). Therefore, large values of the F-test are associated with large probabilities for the alternative. Hence, maximizing the power of the frequentist F-test maximizes the power of these Bayesian tests.
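The limit ξ_1 → 1/2 claimed above for the tree-order Bayes design can be checked numerically. The snippet below reflects our reading of the prior, with equal mass on the points e_j − δe_1, and maximizes the average NCP over the simplex for growing δ:

```python
import numpy as np
from scipy.optimize import minimize

def avg_ncp(xi, delta, K):
    """Average NCP over the prior support, for a design xi on the simplex."""
    xi = np.asarray(xi, float)
    total = 0.0
    for j in range(1, K):
        mu = np.zeros(K); mu[0] = -delta; mu[j] = 1.0   # e_j - delta * e_1
        m = xi @ mu
        total += float(xi @ (mu - m) ** 2)
    return total / (K - 1)

K = 3
for delta in (1.0, 10.0, 100.0):
    res = minimize(lambda x: -avg_ncp(x, delta, K), np.full(K, 1 / K),
                   bounds=[(0, 1)] * K,
                   constraints=({'type': 'eq', 'fun': lambda x: x.sum() - 1},),
                   method='SLSQP')
    print(delta, np.round(res.x, 3))   # the control allocation xi_1 approaches 1/2
```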
In Section 4 the experimental design problem was formulated, and solved, as a game theoretic problem in which the Nash equilibrium was shown to coincide with the MM-design as well as the Bayes and weighted designs. Apart from the simplicity and elegance of the solution, the proposed approach suggests that game theoretic ideas and methods should play a more prominent role in statistics, especially within the broad field of experimental design. We are not familiar with any other papers in the statistical literature which apply game theoretic ideas to experimental design problems. There are, however, some papers in the engineering literature, cf. [9] and [7], dealing with design allocation problems which are addressed by finding the Nash equilibrium of a "design game". The game theoretic view provides a set of tools and methods for reasoning in complex situations which, we believe, will prove useful in solving a variety of other problems in statistics.
It is well known that the Nash equilibrium coincides with the maxi-min solution in zero-sum games. In some situations it may be difficult to compute the maxi-min solution but relatively easy, using symmetry say, to show that a particular set of strategies is a Nash equilibrium. Moreover, it is possible to imagine design games which are not zero-sum. For example, these could occur if sampling costs vary among units receiving different treatments. In such situations Nash designs are relevant whereas MM-designs are not.
Although the focus of this paper has been on the classical one-way ANOVA, we believe that the proposed approach is applicable to two-way designs, as well as to problems involving covariates, heteroscedastic errors and so forth. In each case the relevant payoff function needs to be defined and then a game pitting the statistician against nature set up; see [4] for a collection of problems viewed through this prism. We have already begun exploring various problems using this approach and believe that it can be extended to other areas of statistics. We realize that viewing nature as a strategizing agent may seem odd at first. However, it is common practice in various disciplines including statistics ([4]) and in particular statistical decision theory ([3]). One benefit of the game theoretic approach is that it transforms a complicated optimization problem into a game, in which reasoning about outcomes is often easier and more natural.

Remark 1.
In the proofs of Theorem 3.1 and Theorem 3.3 we make use of an elegant result of [24] known as the Purkiss Principle. The principle, adapted to our setting, states that if f is a function which is symmetric in its arguments x_1, …, x_n then max{f(x_1, …, x_n) : Σ_{i=1}^n x_i = 1} is attained when x_i = x_j for all i and j, provided that the Hessian of f does not vanish at the optimum.

Proof of Theorem 3.1
The following lemma is required to prove Theorem 3.1.

Lemma 1. Let A = M(ξ) − M(ξ)JM(ξ), where M(ξ) = diag(ξ_1, …, ξ_K) and J = 11^T. Then, for every positive integer m, trace(A^m) is a permutation symmetric function of (ξ_1, …, ξ_K).

Proof. Let S be any permutation matrix. Then M(Sξ) = SM(ξ)S^T and SJS^T = J, so A(Sξ) = SA(ξ)S^T and therefore trace(A(Sξ)^m) = trace(SA(ξ)^m S^T) = trace(A(ξ)^m). Since the latter holds for all permutation matrices S the function trace(A^m) is permutation symmetric in (ξ_1, …, ξ_K) as required.
We now continue with the proof of Theorem 3.1.
Proof. Observe that the NCP given in (2.4) can be written in matrix form as

λ(μ; ξ) = μ^T A μ, (A.1)

where A is defined in the statement of Lemma 1. It follows from (A.1) and the definition of Λ(Q; ξ) that

Λ(Q; ξ) = E_Q(μ)^T A E_Q(μ) + trace(A V_Q(μ)), (A.2)

where E_Q(μ) and V_Q(μ) are the mean and variance of the random vector μ with respect to the distribution function Q. By assumption μ is distributed as N(μ_0 1, Σ) where Σ has a compound symmetry structure as described in the statement of the Theorem. Since A1 = 0, it follows from (3.4) that (A.2) reduces to

Λ(Q; ξ) = (β − γ) trace(A) = (β − γ)(1 − Σ_{i=1}^K ξ_i²). (A.3)

A simple calculation, using the method of Lagrange multipliers, shows that the maximizer of (A.3) under the constraint Σ_{i=1}^K ξ_i = 1 is the balanced design. The average power with respect to the prior Q can be expressed as a series in the moments of the quadratic form μ^T Aμ. Under the assumption of normality, and following [13], the moments E(μ^T Aμ)^k are linear combinations, with integer coefficients ν_i, of products of the form Π_j (trace((AΣ)^{r_j}))^{s_j}, where the product is over integers s_j and r_j such that Σ_j s_j r_j = k. It is easy to verify that if Σ is defined as in the statement of the Theorem then AΣ = (β − γ)A. Furthermore, by Lemma 1, the quantity trace(A^s) is symmetric in (ξ_1, …, ξ_K) for any positive integer s. Consequently so are the moments E(μ^T Aμ)^k and Π(Q; ξ). Thus, by Remark 1, max{Π(Q; ξ) : Σ_i ξ_i = 1} is attained when ξ_i = ξ_j for all i, j, which gives the balanced design. This concludes the proof.
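The closed form E[μ^T Aμ] = (β − γ)(1 − Σ_i ξ_i²) used above (our reconstruction) is easy to validate by Monte Carlo:

```python
# Monte Carlo check: for mu ~ N(mu0 * 1, Sigma) with Sigma = (beta - gamma) I + gamma J
# and A = diag(xi) - xi xi^T, we expect E[mu^T A mu] = (beta - gamma) * (1 - sum xi_i^2).
import numpy as np

rng = np.random.default_rng(0)
K, mu0, beta, gamma = 4, 2.0, 1.5, 0.5
xi = np.array([0.4, 0.3, 0.2, 0.1])
A = np.diag(xi) - np.outer(xi, xi)                       # M(xi) - M(xi) J M(xi)
Sigma = (beta - gamma) * np.eye(K) + gamma * np.ones((K, K))

mu = rng.multivariate_normal(mu0 * np.ones(K), Sigma, size=200_000)
mc = float(np.mean(np.einsum('ni,ij,nj->n', mu, A, mu)))
closed = (beta - gamma) * (1 - float(xi @ xi))
print(round(mc, 3), round(closed, 3))   # the two agree (closed form is 0.7)
```

Note that the value of μ_0 indeed does not matter, since A1 = 0.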

Proof of Theorem 4.1
Proof. First consider the game G_A(λ, P(X), Q(M)). Suppose that μ_ij is Player II's strategy in a Nash equilibrium. Clearly Player I's best response to μ_ij is ξ_ij. However, Player II's best response to ξ_ij is any μ_lk with (l, k) ≠ (i, j), and it follows that μ_ij cannot be a strategy in a Nash equilibrium. Since the latter argument applies to all indices (i, j), it follows that Player II does not have a Nash equilibrium strategy in pure strategies. A similar argument shows that neither does Player I. It is straightforward to check that the pair (p_0, q_0) is also the unique Nash equilibrium in G_A(π, P(X), Q(M)). This completes the proof.

Proof of Theorem 4.2
Proof. We start by showing that if M_l ⊂ M then v(λ, X, M_l) > v(λ, X, M). Recall first that by Theorem 4.1 the pair of uniform distributions (p_0, q_0) is the Nash equilibrium of the game G_A(λ, P(X), Q(M)), whose value is

v(λ, X, M) = (δ²/4) · 1/(K − 1).
Let M_l ⊂ M be the strategies available to Player II after removing the T pure strategies μ_{i_1 j_1}, …, μ_{i_T j_T} from M. Since the value of any zero-sum game cannot decrease when the strategy space of Player II is restricted we have

v(λ, X, M_l) ≥ v(λ, X, M). (A.14)

It follows that if (A.14) holds with a strict inequality for T = 1, i.e., when only one strategy is removed, then it will hold with a strict inequality for all T ∈ {2, …, K − 1}. Symmetry considerations imply that the identity of the removed strategy is immaterial. Hence, without any loss of generality, we may restrict our attention to the case where the strategy μ_{K−1,K} is removed from M, for which the strict inequality can be verified directly.

Next, we show that if M ⊂ M_b ⊂ M_δ then v(λ, X, M) = v(λ, X, M_b). The reasoning here is quite simple. Suppose that Player I uses the uniform mixed strategy p = (K(K − 1)/2)^{−1} 1. Then for any strategy μ of Player II the payoff to Player I is

(2/(K(K − 1))) Σ_{i<j} λ(μ; ξ_ij) = (1/(2K(K − 1))) Σ_{i<j} (μ_i − μ_j)². (A.24)

Now, it is not hard to see that min{Σ_{i<j} (μ_i − μ_j)² : μ ∈ M_δ} is attained at any permutation of the LFC (−δ/2, δ/2, 0, …, 0), in which case the value of Σ_{i<j} (μ_i − μ_j)² is δ²K/2. It follows that the right hand side of (A.24) is larger than or equal to δ²/(4(K − 1)), which is nothing but v(λ, X, M). Moreover, if μ ∈ M_δ \ M then the right hand side of (A.24) is strictly larger than v(λ, X, M). However, when (A.24) holds with a strict inequality then Player II will assign probability 0 to this strategy, since otherwise his expected loss would increase.
Thus v(λ, X, M) = v(λ, X, M_b). If (A.24) holds with an equality then assigning a positive probability to the corresponding strategy will not change the value of the game, so again v(λ, X, M) = v(λ, X, M_b). This completes the proof.