The maximal difference among experts' opinions

We prove a sharp bound for the expected spread of the opinions of n ≥ 2 experts who have access to different sources of information, represented by different sub-σ-fields. Using a symmetrization argument and direct combinatorial optimization, we derive an explicit optimizer. Our results may prove useful not only to probabilists, but also to statisticians and economists.


Introduction
The purpose of this paper is to establish a certain sharp maximal estimate for dependent random variables, which stems from applications in statistics and information theory. We start with the motivation and the necessary background. Imagine a group of n ≥ 2 experts with access to different knowledge, who are asked to evaluate the odds of some future event. The question is: how radically different can their opinions be? This natural and interesting problem can be formalized using the notion of conditional probability. First, our experts must agree upon a basic model of reality, which can be understood as accepting a common probability space (Ω, F, P). Inconsistent sources of information shall then be identified with different sub-σ-fields G_1, G_2, …, G_n ⊂ F. Consequently, opinions on an event A ∈ F will be expressed as the random variables X_1, X_2, …, X_n defined by

X_i = P(A | G_i),   i = 1, 2, …, n.

See [6,7,8,12] for applications to statistics, [1] for applications to economics, [3,4] for purely probabilistic considerations and [2] for philosophical implications. A vector (X_1, X_2, …, X_n) arising in this way is called coherent. For notational convenience, hereinafter we write (X_1, X_2, …, X_n) ∈ C whenever we want to indicate that the vector (X_1, X_2, …, X_n) is coherent. Sometimes, in the literature, the joint distribution of (X_1, X_2, …, X_n) ∈ C is also referred to as coherent.

Now, for a fixed α ∈ R_+ and n ≥ 2, consider the number

sup_{(X_1, X_2, …, X_n) ∈ C} E max_{1≤i<j≤n} |X_i − X_j|^α,   (1.1)

where the supremum is taken over all probability spaces (Ω, F, P), all events A ∈ F and all sub-σ-fields G_1, G_2, …, G_n ⊂ F. The question about the explicit formula for this supremum can be regarded as a precise, mathematical reformulation of the initial problem concerning the maximal spread of coherent opinions. The primary goal of this paper is to answer this question for α = 1.
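The definition can be made concrete on a tiny finite model. The following sketch is our own illustration, not taken from the paper: the four-point space, the event A and the helper cond_prob are assumptions chosen for clarity. It computes a coherent pair of opinions and its expected spread.

```python
# Hypothetical illustration: coherent opinions on a four-point space.
# Omega = {0,1,2,3} with uniform P; the event of interest is A = {0,1}.
omega = [0, 1, 2, 3]
p = {w: 0.25 for w in omega}
A = {0, 1}

def cond_prob(event, partition):
    """X(w) = P(event | block of the partition containing w)."""
    x = {}
    for block in partition:
        pb = sum(p[w] for w in block)
        pa = sum(p[w] for w in block if w in event)
        for w in block:
            x[w] = pa / pb
    return x

G1 = [{0, 1}, {2, 3}]   # expert 1 observes whether w is in {0,1}
G2 = [{0, 2}, {1, 3}]   # expert 2 observes an independent bit
X1, X2 = cond_prob(A, G1), cond_prob(A, G2)

spread = sum(p[w] * abs(X1[w] - X2[w]) for w in omega)
print(spread)  # 0.5
```

Here expert 1 knows A exactly (X_1 ∈ {0, 1}), while expert 2's information is independent of A (X_2 ≡ 1/2), so the two opinions differ by 1/2 everywhere.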
Thus, the supremum behaves in a quite surprising manner: there are a few "irregular" terms corresponding to small values of n, and then, for n ≥ 5, it is given by a nice and compact expression. We would like to point out that the result in the special case n = 2 is already known in the literature. As Burdzy and Pitman showed in [4], it is a consequence of the identity |X − Y| = 2 · max(X, Y) − X − Y and the following sharp estimate, established in [10].

Theorem 1.3. For all n ∈ Z_+ and any (X_1, X_2, …, X_n) ∈ C satisfying EX_i = p for all i,
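Since the reduction for n = 2 relies entirely on this identity, it may help to record its one-line verification (elementary; our own rewriting, not part of the original argument):

```latex
% Pointwise, max(x, y) = (x + y + |x - y|)/2 for all real x, y, hence
\[
|X - Y| \;=\; 2\max(X, Y) - X - Y,
\qquad
\mathbb{E}\,|X - Y| \;=\; 2\,\mathbb{E}\max(X, Y) - \mathbb{E}X - \mathbb{E}Y .
\]
% Thus for n = 2 and alpha = 1, computing (1.1) amounts to maximizing
% E max(X, Y) over coherent pairs, which is the content of Theorem 1.3.
```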
A more general result, concerning the calculation of (1.1), still in the special case n = 2 but for a nontrivial range of the parameter α, was established independently in [1,5] with the use of geometric techniques in Hilbert spaces.
Finally, we would like to mention that our result can also be regarded as a generalization of the martingale diameter problem, see e.g. [9,11].
In the next section we apply an appropriate symmetrization and reduce the problem of calculating the left-hand side of (1.2) to the analysis of the simpler expression sup E max_{1≤i≤n} X_i, where the supremum is taken over all coherent vectors satisfying certain symmetry constraints. Then, in Section 3, using various technical combinatorial arguments, we gradually simplify the context: we show that the extremal coherent vectors (i.e., those for which the simpler supremum is attained) can be assumed to satisfy more and more structural properties. After several steps, this allows us to express the supremum as the extremal value of a certain function of one variable, which in turn can be computed explicitly. Our approach was inspired by the paper [3] by Burdzy and Pal: in that article, a related problem for coherent vectors was also studied with the use of a certain discretization and subsequent combinatorial reductions.

EJP 26 (2021), paper 105.

Basic reductions and symmetrizations
Our starting point is the following discretization, which enables us to restrict our argument to random variables taking values in a finite set. From now on, we will often use the shorter notation X instead of (X_1, X_2, …, X_n):

sup_{X ∈ C} E max_{1≤i<j≤n} |X_i − X_j| = sup_{m ≥ 1} sup_{X ∈ C(n,m)} E max_{1≤i<j≤n} |X_i − X_j|,

where C(n, m) is the set of all X = (X_1, X_2, …, X_n) ∈ C such that X_i takes at most m different values for every 1 ≤ i ≤ n.
Next, we have the following simple, yet very useful observation.
where C′(n, m) is the subset of all X ∈ C(n, m) which satisfy

P({max_{1≤i≤n} X_i = 1} ∪ {min_{1≤j≤n} X_j = 0}) = 1.

Proof. Fix X ∈ C(n, m). Then there is a probability space (Ω, F, P), an event A ∈ F and σ-algebras G_1, G_2, …, G_n such that X_i = P(A | G_i) for all 1 ≤ i ≤ n. With no loss of generality, we may assume that the probability space is non-atomic. Now we will perform a sequence of transformations of the variables X_i (or rather of the corresponding σ-algebras G_i), after which:

· the maximum max_{1≤i≤n} X_i will increase to 1 on A;
· the minimum min_{1≤i≤n} X_i will decrease to 0 on A^c;
· the expectation E max_{1≤i<j≤n} |X_i − X_j| will increase or stay unchanged.
This will clearly yield the claim. For a given i ∈ {1, 2, …, n}, the transformation of X_i can be described as follows. Split the set {max_{1≤j≤n} X_j = X_i} ∩ A into the events A_{i,x} = {X_i = x} ∩ {max_{1≤j≤n} X_j = X_i} ∩ A, x ∈ (0, 1), and modify X_i on each of them; the passage X → X̃ then does not decrease the maximized expectation. Furthermore, note that max_{1≤j≤n} X̃_j has increased to 1 on A_{i,x}. The desired transformation of X_i is obtained by applying the above modification for all x ∈ (0, 1) with P(A_{i,x}) > 0. Now, performing the above transformations of X_1, X_2, …, X_n, we obtain a new vector X̃ for which P(max_{1≤i≤n} X̃_i = 1 | A) = 1. Furthermore, applying the above procedure to the coherent sequence (1 − X_1, 1 − X_2, …, 1 − X_n) (corresponding to the event A^c), we may also guarantee that P(min_{1≤i≤n} X̃_i = 0 | A^c) = 1, which completes the proof.
Remark 2.4. It follows easily from the above argument that if X ∈ C′(n, m), then the corresponding event A satisfies A = {max_{1≤i≤n} X_i = 1} and A^c = {min_{1≤i≤n} X_i = 0}, up to sets of probability zero.
We may assume the existence of such a random variable, taking the larger, product probability space if necessary. Thus we can rewrite the right-hand side of (2.2) in terms of X̃, a mixture of X and 1 − X. Of course, X̃ takes values in a finite set. Furthermore, we have EX̃_i = 1/2 for all i and, as the set of coherent distributions on [0, 1]^n is convex (see [4]), we conclude that X̃ ∈ C; see Figure 1. As a direct consequence of (2.4) and the above discussion, we obtain the following.

Proposition 2.5. For any n ≥ 2, we have the equality

where C′′(n, m) is the subset of all those X ∈ C(n, m) that satisfy (2.3) and (2.5).
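The effect of the mixture can be seen on a toy example (our own construction; the four-point space and the particular X_1, X_2 are assumptions, not the paper's). Replacing X by 1 − X with an independent fair coin forces every expectation to 1/2 while leaving pairwise differences intact:

```python
# Toy illustration (assumed four-point model) of the symmetrization step:
# X~ equals X on tails and 1 - X on heads, the coin being independent of X.
p = {w: 0.25 for w in range(4)}          # uniform probability on Omega
X1 = {0: 0.5, 1: 0.5, 2: 0.0, 3: 0.0}    # X1 = P(A | G1) for A = {0}
X2 = {0: 0.5, 1: 0.0, 2: 0.5, 3: 0.0}    # X2 = P(A | G2), G2 = {{0,2},{1,3}}

def mean(X):
    return sum(p[w] * X[w] for w in p)

# expectation of the mixture, computed on the product space Omega x {T, H}
tilde_mean1 = 0.5 * mean(X1) + 0.5 * (1 - mean(X1))
print(tilde_mean1)  # 0.5, regardless of the value of mean(X1)

# pairwise differences are unchanged: |(1 - X1) - (1 - X2)| = |X1 - X2|
same = all(abs(X1[w] - X2[w]) == abs((1 - X1[w]) - (1 - X2[w])) for w in p)
print(same)  # True
```

This is exactly why the maximized expectation E max_{i<j} |X_i − X_j| is insensitive to the symmetrization, while the constraint EX̃_i = 1/2 comes for free.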
Remark 2.6. By (2.5), all the coordinates of an arbitrary vector X ∈ C′′(n, m) have expectation 1/2; this in particular implies that the event A which "generates" X satisfies P(A) = 1/2. Actually, (2.5) yields a stronger symmetry property of X around 1/2: we have the equality of distributions of (X_1, X_2, …, X_n) and (1 − X_1, 1 − X_2, …, 1 − X_n). The above remark enables the following further reduction.
which completes the proof.
By the above reductions, we see that it is enough to handle the expression sup_{m ∈ {1,2,…}}. Observe that for a fixed m and an X ∈ C′′(n, m), we have 1_A · max_{1≤i≤n} X_i = 1_A almost surely, due to Remark 2.4. In addition, we have the identity for 1 ≤ i ≤ n and x ∈ [0, 1], which follows from (2.5). In other words, the distributions of 1_{A^c} max_{1≤i≤n} X_i and 1_A − 1_A min_{1≤i≤n} X_i coincide, and we get the following.
The main advantage of this proposition is that it allows us to restrict the analysis of the sequences X to the set A.
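The distributional identity above can be sanity-checked on a small example of our own (not from the paper): take two experts, one who knows A exactly and one with trivial information, so that the symmetry and boundary conditions hold.

```python
# Toy check (our own example) that 1_{A^c} * max_i X_i and
# 1_A - 1_A * min_i X_i have the same distribution.
from collections import Counter

omega = [0, 1, 2, 3]                     # uniform four-point space
A = {0, 1}
X1 = {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.0}    # X1 = P(A | G1), G1 generated by A
X2 = {w: 0.5 for w in omega}             # X2 = P(A | trivial sigma-field)

# max_i X_i = 1 on A and min_i X_i = 0 on A^c, as required by the reduction
lhs = [max(X1[w], X2[w]) if w not in A else 0.0 for w in omega]
rhs = [1.0 - min(X1[w], X2[w]) if w in A else 0.0 for w in omega]

print(Counter(lhs) == Counter(rhs))  # True: the two laws coincide
```

Both random variables take the value 1/2 with probability 1/2 and the value 0 otherwise, which is the point of the proposition: the behaviour on A^c is mirrored on A.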

Combinatorial optimization
Now we are going to present a combinatorial analysis of (2.8), which will be carried out via some geometric considerations in a slightly different setup. Let us start by introducing some auxiliary notation. Throughout, we assume that n ≥ 2 is a fixed integer.
Let (Ω, F, P) be a non-atomic probability space and let A be a fixed event satisfying P(A) = 1/2. For an event S ⊂ A and any x ∈ (0, 1], we set In our considerations below, it will be convenient to interpret the sets S^x as horizontal line segments at level x, or finite unions of sets of this type. We assign to each S^x the corresponding function S^x : A → [0, 1], given by For k ∈ {1, 2, …}, we denote by Λ(n, k) the family of all sequences (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) satisfying the appropriate conditions for every x ∈ (0, 1). In this setup, we consider the following new optimization problem:

Proof. Fix m ∈ {1, 2, …}, X ∈ C′′(n, m) and the corresponding "generating" event A.
As we have seen in Remark 2.6, we have P(A) = 1/2. For a given 1 ≤ i ≤ n, let {x_1^i, x_2^i, …, x_m^i} denote the set of values attained by X_i (if X_i takes fewer than m different values, we add some extra, superfluous elements to the set). Introduce the events T_{i,j} = {ω ∈ A : X_i(ω) = x_j^i} for 1 ≤ i ≤ n and 1 ≤ j ≤ m. Since P(A) = 1/2, we have the straightforward equality

∑_{1≤i≤n, 1≤j≤m} P(T_{i,j}) = n · 1/2.
Next, for a fixed 1 ≤ i ≤ n and x ∈ (0, 1), we may also write where the first equality is due to Lemma 2.2 and the second one is a consequence of (2.5). Adding up the above equalities for 1 ≤ i ≤ n yields the condition (3.2). Now let U_1, U_2, …, U_n be a partition of A such that U_i ∈ F and U_i ⊂ {X_i = 1} for all 1 ≤ i ≤ n, up to a set of measure zero. The existence of such a partition is an obvious consequence of Remark 2.4. Define a modification of the sequence accordingly. Since P(A) = 1/2, we get that the quantity corresponding to the modified sequence is not smaller than the quantity in (3.3). Since m and X were arbitrary, the proof is complete.
Our plan is to solve the problem (3.3) by performing a sequence of combinatorial and geometric reductions. We start with some simple observations. First, note that by (3.2), a sequence (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ(n, k) enjoys a sort of skew-symmetry around 1/2: in particular, if a level x belongs to {x_1, x_2, …, x_k}, then so does 1 − x. Second, obviously, the integral in (3.3) does not depend on the order of the sets S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}, so we may permute them arbitrarily. Furthermore, note that if we split S_k^{x_k} into two sets S̃_k^{x_k} and S̃_{k+1}^{x_k} (of course, the level x_k needs to be preserved) and replace (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) with (S_1^{x_1}, S_2^{x_2}, …, S_{k−1}^{x_{k−1}}, S̃_k^{x_k}, S̃_{k+1}^{x_k}), then the integral will not change either; a similar phenomenon occurs if we splice two disjoint sets lying at the same level. In other words, given a sequence (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}), we may cut some of the sets into pieces or merge some of them, with no effect on the optimized expression (3.3). The next step is the following.
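The cut-and-merge invariance can be checked numerically on a discretized model of our own (an assumption: following the segment picture, we take the function attached to S^x to equal x on S and 1 off S; the invariance holds for any pointwise rule depending only on membership and level).

```python
# Discretized sketch (our own model) of the cut/merge invariance:
# splitting a segment at the same level leaves min_i S_i^{x_i} unchanged.
points = [j / 100 for j in range(50)]          # grid approximating the set A

def min_profile(segments):
    """Pointwise minimum of the functions attached to (set, level) pairs;
    each function equals the level on its set and 1 elsewhere (assumed)."""
    return [min([x for s, x in segments if t in s] + [1.0]) for t in points]

S = set(points[10:30])
before = min_profile([(S, 0.3), (set(points[:20]), 0.6)])

# split (S, 0.3) into two pieces, both kept at the same level 0.3
after = min_profile([(set(points[10:20]), 0.3),
                     (set(points[20:30]), 0.3),
                     (set(points[:20]), 0.6)])
print(before == after)  # True
```

Since the pointwise minimum only depends on which levels cover each point, the integral in (3.3) is likewise unchanged, which is what licenses the cutting and splicing used repeatedly below.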
where Λ′(n, k) is the subset of all those (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ(n, k) which satisfy (3.5).

Proof. Fix k ∈ {1, 2, …}, a vector (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ(n, k) and assume that the condition (3.5) is not satisfied for some 0 < x_i ≤ x_j ≤ 1/2. Of course, the claim will follow if we construct a modification (S̃_1^{x_1}, S̃_2^{x_2}, …, S̃_k^{x_k}) ∈ Λ′(n, k) such that

∫_A min(S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) dP ≥ ∫_A min(S̃_1^{x_1}, S̃_2^{x_2}, …, S̃_k^{x_k}) dP.

There are two possible scenarios: we either have (3.7) or (3.8). In the case of (3.7), we can simply cut off a part of S_j^{x_j} which is already covered by the smaller value x_i and transfer as much of it as possible; see Figure 2. Observe that the modification obtained in this way satisfies (3.6). After a finite number of such transformations, we obtain a sequence (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) for which either (3.5) holds, or we have (3.8).
By (3.2), such sequences (T_{i,m})_m, (T_{j,m})_m exist (since we assumed that the probability space is non-atomic). Assume further that x_0 is a number satisfying the equation below and observe that x_0 < x_i: indeed, we have a = (1 − x_i)c/x_i < a + b + c. We shall now perform the following transformation; see Figure 3. It is straightforward to check that the new, modified sequence satisfies (3.1), (3.2) and (3.6). After a finite number of such transformations, we guarantee the validity of (3.5).

Figure 3: The transformation in the case (3.8). The integral ∫_A min_i S_i^{x_i} dP can only decrease. Indeed, the new line segment of length c on the right picture lies below x_i; furthermore, though the line segment of length a + b + c might lie on a higher level, the condition (3.8) guarantees that there must be a "layer" of line segments lying below it.
Therefore, from now on, we may restrict our analysis of (3.3) to the class ∪_{k≥1} Λ′(n, k).
To proceed, consider a vector (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ′(n, k) and introduce a partition of A into four basic components A_1, A_2, A_3, A_4. Let y_1, y_2, …, y_m be the collection of values taken by min_{1≤i≤k} S_i^{x_i} on the set A_2. For 1 ≤ j ≤ m, we set

∫_A min(S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) dP,   (3.9)

where Λ′′(n, k) is the subset of all (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ′(n, k) which satisfy (3.10).

Proof. Fix k ∈ {1, 2, …}, a sequence (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ′(n, k) and assume that the condition (3.10) is not satisfied. This in particular implies P(A_2) > 0, and hence we must also have P(A_3 ∪ A_4) > 0 (see the skew-symmetry of (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) mentioned above Proposition 3.2). Now, let us fix any i ∈ {1, 2, …, k} with x_i ≥ 1/2. Then, for any admissible event, we perform the following rearrangement; see Figure 4b). The existence of (T_m^{x_i})_{m=1}^N follows trivially from the condition P(A_3 ∪ A_4) > 0: we allow the sets to overlap. The obtained modified sequence belongs to Λ′(n, ℓ) for some ℓ, and the minimum of the corresponding functions is unchanged almost surely on A, in comparison to the initial minimum. It remains to observe that we may guarantee the validity of (3.10) by performing sufficiently many such transformations.

Proposition 3.4.
In the problem (3.9), we are allowed to restrict ourselves to those vectors (S_1^{x_1}, …, S_k^{x_k}) ∈ Λ′′(n, k) which additionally satisfy (3.11) and (3.12).

Proof. The argument follows the above pattern. Namely, we fix k ∈ {1, 2, …}, a sequence (S_1^{x_1}, …, S_k^{x_k}) ∈ Λ′′(n, k) and assume that the condition (3.11) is not satisfied. Let x > 1/2 be such that P(M_x) > 0, where M_x is defined as before. Moreover, recall that N_{1−x} is an event satisfying the analogous condition. After a finite number of such operations, we remove A_2 and thus enforce the validity of (3.11). To guarantee the second condition, we proceed as previously. We start with an arbitrary sequence for which (3.11) is satisfied, but (3.12) is not. We may assume, performing a permutation of the indices if necessary, that x_1, x_2 > 1/2 and s_1 := P(S_1) > 0, s_2 := P(S_2) > 0. By (3.2), we can find T^{1−x_1}, T^{1−x_2} ∈ F such that the corresponding conditions hold. Consider the auxiliary equation (3.13), that is, equivalently, (3.14) (if x_1 or x_2 is equal to 1, we understand this equation as x_0 = 1). We will check that (3.15) holds. This is obvious for x_0 = 1; for the remaining values we substitute the previous identity and rewrite the estimate in an equivalent form which is evident. Having that in mind, let us consider the transformation. By (3.14), the obtained new sequence still belongs to ∪_{ℓ≥1} Λ′′(n, ℓ) and enjoys (3.11). Furthermore, by (3.15), the appropriate minimized integral over A does not increase. It remains to note that after a finite number of the above transformations, the condition (3.12) will become true.

Theorem 3.5. For any n, the number s(n) is equal to the right-hand side of (1.2).
Proof. By the above reductions, in the definition of s(n) we may restrict ourselves to those (S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) ∈ Λ′′(n, k) which additionally satisfy (3.11) and (3.12). This is a very simple context: there are at most three different levels of the sets S_j^{x_j}; see Figure 6. Put p := P(A_1) and suppose that the maximal level is equal to x_1. Note that the random variable min(S_1^{x_1}, S_2^{x_2}, …, S_k^{x_k}) 1_A is equal to 1/2 on A_1 and to 1 − x_1 on A_4, so we have

s(n) = 1 − 2 · inf_{p ∈ [0, 1/2]} [ p · 1/2 + (1/2 − p) · (1 − x_1) ].

Figure 6: There are at most three different values in the set {x_1, x_2, …, x_k}; we may assume that x_1 is the largest of them.
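For the record, the objective in the displayed formula for s(n) simplifies algebraically (our own elementary rewriting; here x_1 = x_1(p) through the constraints (3.1) and (3.2)):

```latex
\[
p\cdot\tfrac12+\Bigl(\tfrac12-p\Bigr)(1-x_1)
  \;=\;\frac{1-x_1}{2}+p\Bigl(x_1-\tfrac12\Bigr),
\qquad\text{so}\qquad
s(n)\;=\;\sup_{p\in[0,1/2]}\bigl[\,x_1-p\,(2x_1-1)\,\bigr].
\]
```

This makes the final step transparent: once x_1 is expressed as a function of p, the problem becomes an explicit one-variable optimization over p ∈ [0, 1/2].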
To express x_1 in terms of p, we apply (3.1) and (3.2) with x = x_1 to obtain