A note on the distributions of the maximum of linear Bernoulli processes

We give a characterization of the family of all probability measures on the extended real line $(-\infty,+\infty]$ that may be obtained as the distribution of the maximum of some linear Bernoulli process.

Consider a linear random process

$$X(t) = a_0(t) + \sum_{n=1}^{\infty} a_n(t)\,\xi_n, \qquad t \in T, \eqno(1)$$

generated by independent, identically distributed random variables $\xi_n$ with $E\xi_n = 0$, $E\xi_n^2 = 1$. The coefficients $a_n(t)$ are assumed to be arbitrary functions on the parameter set $T$, satisfying $\sum_{n=1}^{\infty} a_n(t)^2 < +\infty$ for any $t \in T$, so that the series (1) is convergent a.s. Define

$$M = \sup_{t \in T} X(t) \eqno(2)$$

in the usual way as the essential supremum in the space of all random variables with values in the extended real line (identifying random variables that coincide almost surely; cf. Remark 4 below). We consider the question of characterizing the family $\mathcal{F}(L)$ of all possible distribution functions $F(x) = P\{M \le x\}$ of $M$, assuming that the common law $L$ of the $\xi_n$ is given. In general, $M$ may take the value $+\infty$ with positive probability, so its distribution is supported on $(-\infty, +\infty]$. Introduce also the collection $\mathcal{F}_0(L)$ of all possible distribution functions of $M$ in (2) such that, in the series (1), for all $t \in T$,

$$a_n(t) = 0 \quad \text{for all sufficiently large } n. \eqno(3)$$

When the $\xi_n$ are standard normal, i.e., $L = N(0,1)$, we deal in (1) with an arbitrary Gaussian random process. As is well known, for the distribution function $F$ of $M$, the point $x_0 = \inf\{x \in \mathbb{R} : F(x) > 0\}$ may be finite, in which case it is sometimes called a take-off point of the maximum of the Gaussian process. Moreover, $F$ may have an atom at it. Nevertheless, $F$ is absolutely continuous and strictly increasing on $(x_0, +\infty)$, which follows from the log-concavity of Gaussian measures (cf. also [C], [HJ-S-D]). A complete characterization of all possible distributions $F$ in the Gaussian case may be derived from the Brunn-Minkowski-type inequality for the standard Gaussian measure $\gamma_n$ on $\mathbb{R}^n$ due to A. Ehrhard [E]. It states that, for all convex (and in fact, for all Borel measurable, cf. [Bo2]) sets $A$ and $B$ in $\mathbb{R}^n$ of positive measure and for all $\lambda \in (0,1)$,

$$\Phi^{-1}\big(\gamma_n(\lambda A + (1-\lambda) B)\big) \,\ge\, \lambda\,\Phi^{-1}(\gamma_n(A)) + (1-\lambda)\,\Phi^{-1}(\gamma_n(B)),$$

where $\Phi^{-1}$ denotes the inverse of the standard normal distribution function on the line.
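As a small numerical illustration (not part of the paper's argument), Ehrhard's inequality can be checked in dimension $n = 1$ for intervals, where the Minkowski combination $\lambda A + (1-\lambda)B$ is again an interval. The intervals $A$, $B$ and the weight below are arbitrary choices; $\Phi$ and $\Phi^{-1}$ come from Python's standard library.

```python
# Numerical illustration (not from the paper): Ehrhard's inequality in
# dimension n = 1 for intervals, using Phi = NormalDist().cdf.
# The particular intervals A, B and the weight lam are arbitrary choices.
from statistics import NormalDist

N = NormalDist()  # standard normal law: N.cdf = Phi, N.inv_cdf = Phi^{-1}

def gauss_measure(interval):
    # gamma_1 of an interval [a, b]
    a, b = interval
    return N.cdf(b) - N.cdf(a)

def minkowski(lam, A, B):
    # Minkowski combination lam*A + (1 - lam)*B of two intervals
    return (lam * A[0] + (1 - lam) * B[0], lam * A[1] + (1 - lam) * B[1])

A, B, lam = (-1.0, 0.5), (0.0, 2.0), 0.3
lhs = N.inv_cdf(gauss_measure(minkowski(lam, A, B)))
rhs = lam * N.inv_cdf(gauss_measure(A)) + (1 - lam) * N.inv_cdf(gauss_measure(B))
assert lhs >= rhs  # Ehrhard: Phi^{-1} of the measure of the combination dominates
```

The same check passes for any choice of intervals and weight, since the inequality holds for all Borel sets.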
This inequality immediately implies that, if $F$ is non-degenerate, the function $U = \Phi^{-1}(F)$ must be concave on $\mathbb{R}$ in the generalized sense, as a function with values in $[-\infty, +\infty)$. But the converse is true as well. Indeed, suppose $U = \Phi^{-1}(F)$ is concave on $\mathbb{R}$, and for simplicity let $F$ be non-degenerate and assign no positive mass to the point $+\infty$. Then $F$ is strictly increasing on $(x_0, +\infty)$, and so is its inverse $F^{-1} : (0,1) \to (x_0, +\infty)$. Put $M(x) = F^{-1}(\Phi(x)) = U^{-1}(x)$. Then $M$ is convex and finite on the whole real line, and therefore admits a representation

$$M(x) = \sup_{t \in T}\,\big(a_0(t) + a_1(t)\,x\big)$$

for some coefficients $a_0(t), a_1(t)$. By construction, $M$ has the distribution function $F$ under the measure $\gamma_1$, as required. Thus, a given non-degenerate distribution function $F$ belongs to $\mathcal{F}(N(0,1))$ if and only if the function $\Phi^{-1}(F)$ is concave.

A similar characterization holds when the $\xi_n$ have a shifted one-sided exponential distribution with mean zero. Then $F$ represents the distribution function of $M$ for some coefficients $a_n(t)$ if and only if the function $\log F$ is concave. This follows from the log-concavity of the multidimensional exponential distribution (which is a particular case of Prékopa's theorem [P]; cf. also [Bo1] for a general theory of log-concave measures). In both examples above, for the "if" part it suffices to consider simple linear processes $X(t) = a_0(t) + a_1(t)\,\xi_1$. Hence, $\mathcal{F}_0(L) = \mathcal{F}(L)$.

The situation is completely different when the $\xi_n$ have a symmetric Bernoulli distribution $L$, i.e., take the values $\pm 1$ with probability $\frac{1}{2}$. This may be seen from:

Theorem 1. Any distribution function $F$ such that $F(x) = 0$ for some $x \in \mathbb{R}$ may be obtained as the distribution function of the supremum $M$ of some linear Bernoulli process $X$ in (1) with coefficients satisfying the property (3).
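Before turning to the Bernoulli case, the one-dimensional Gaussian construction above admits a simple numerical sketch. The target law and the tangent-line grid below are arbitrary choices for illustration: we take $F(u) = \Phi(\log u)$, so that $U = \Phi^{-1}(F) = \log$ is concave and $M = U^{-1} = \exp$ is convex, and we represent $\exp$ by the supremum of its tangent lines $a_0(t) + a_1(t)x$ over a finite grid of points $t$.

```python
# Sketch (illustration only; the target law F(u) = Phi(log u) and the grid
# of tangent points are arbitrary choices).  M = U^{-1} = exp is convex, so
# it is the supremum of its tangent lines a_0(t) + a_1(t) x, and the random
# variable M = sup_t (a_0(t) + a_1(t) xi_1) has distribution function F.
import math, random
from statistics import NormalDist

Phi = NormalDist().cdf
ts = [i / 10 for i in range(-60, 61)]           # grid of parameters t
a0 = {t: math.exp(t) * (1.0 - t) for t in ts}   # tangent intercepts of exp at t
a1 = {t: math.exp(t) for t in ts}               # tangent slopes of exp at t

def M(x):
    # supremum of finitely many affine functions; approximates exp(x)
    return max(a0[t] + a1[t] * x for t in ts)

random.seed(1)
sample = [M(random.gauss(0.0, 1.0)) for _ in range(20000)]
u = 1.5
empirical = sum(m <= u for m in sample) / len(sample)
target = Phi(math.log(u))  # F(u) = Phi(log u)
assert abs(empirical - target) < 0.02
```

The grid makes the supremum finite, but refining it recovers $\exp$ exactly, in line with the fact that a single Gaussian variable $\xi_1$ suffices for the "if" part of the characterization.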
In turn, the condition (3) ensures that all random variables $X(t)$ in (1) are bounded from below, and so is the random variable $M$ in (2). Therefore, the distribution $F$ of $M$ must be one-sided. Thus, we have a full description of the family $\mathcal{F}_0(L)$ in the Bernoulli case. Removing the condition (3), we obtain a larger family $\mathcal{F}(L)$; however, it is not at all clear how to characterize it.
One should also mention that in the homogeneous case $a_0(t) = 0$, much is known about various properties of $M$ in terms of $L$, but the characterization problem is more delicate, and it seems that no description, or even conjecture, is known in any of the above cases. For the proof of Theorem 1 one may assume that $\Omega = \{-1,1\}^\infty$ is the infinite-dimensional discrete cube, equipped with the product Bernoulli measure $P$. An important property of $\Omega$, which plays the crucial role here, is that it represents the collection of all extreme points of the cube $K = [-1,1]^\infty$. More precisely, we apply the following statement.
Lemma 2. Any lower semi-continuous function $f : \Omega \to (-\infty, +\infty]$ admits a representation

$$f(x) = \sup_{t \in T}\Big(a_0(t) + \sum_{n=1}^{\infty} a_n(t)\,x_n\Big), \qquad x = (x_1, x_2, \dots) \in \Omega, \eqno(4)$$

for some family of coefficient functions $a_n(t)$, defined on a countable set $T$ and satisfying the property (3).
Note that any function of the form (4) is lower semi-continuous.
Proof. First, more generally, let $K$ be a non-empty, compact convex set in a locally convex space $E$, and denote by $\Omega$ the collection of all extreme points of $K$. A function $f : \Omega \to (-\infty, +\infty]$ is representable as

$$f(x) = \sup_{t \in T} f_t(x), \qquad x \in \Omega, \eqno(5)$$

for some family $(f_t)_{t \in T}$ of continuous, affine functions on $E$, if and only if a) $f$ is lower semi-continuous on $\Omega$, and b) $f$ is bounded from below. This characterization follows from a theorem usually attributed to Hervé [H]; see E. M. Alfsen [A], Proposition 1.4.1, and the historical remarks there. Namely, a point $x$ is an extreme point of $K$ if and only if $\bar g(x) = g(x)$ for any lower semi-continuous function $g$ on $K$, where $\bar g$ denotes the lower envelope of $g$ (i.e., the maximal convex, lower semi-continuous function on $K$ majorized by $g$). Clearly, the equality (5) defines a function with the properties a)-b). For the opposite direction one may use an argument contained in the proof of Corollary 1.4.2 of [A]. If $f$ is bounded and lower semi-continuous on $\Omega$, put $g(x) = \liminf_{y \to x,\, y \in \Omega} f(y)$ for $x \in \mathrm{clos}(\Omega)$ and $g = \sup_\Omega f$ on $K \setminus \mathrm{clos}(\Omega)$. Then $g$ is lower semi-continuous on $K$ and $g = f$ on $\Omega$. By Hervé's theorem, $\bar g(x) = g(x) = f(x)$ for all $x \in \Omega$. Since $\bar g$ is also convex on $K$, one may apply to it the classical theorem on the existence of the representation

$$\bar g(x) = \sup_{t \in T} f_t(x), \qquad x \in K,$$

for some family $(f_t)_{t \in T}$ of continuous, affine functions on $E$ (cf. e.g. [A], Proposition 1.1.2, or [M], Chapter 11). Thus, restricting this representation to $\Omega$, we arrive at (5). Finally, if $f$ is unbounded from above, write $f = \sup_n \min\{f, n\}$ and apply (5) to each function $\min\{f, n\}$.
In the case of the infinite-dimensional discrete cube, the right-hand side of (5) may be further specified. Indeed, any continuous, affine function $g$ on $E = \mathbb{R}^\infty$ has the form $g(x_1, x_2, \dots) = a_0 + \sum_{n=1}^{\infty} a_n x_n$ with finitely many non-zero coefficients. Therefore, (5) reduces to the relation (4) with some coefficient functions $a_n = a_n(t)$ that are defined on a non-empty, perhaps uncountable, set $T$ and satisfy the property (3). The latter implies that the sets $T_N = \{t \in T : a_n(t) = 0 \text{ for all } n > N\}$ are non-empty for all $N \ge N_0$ with a sufficiently large $N_0$. Define

$$f_N(x) = \sup_{t \in T_N}\Big(a_0(t) + \sum_{n=1}^{N} a_n(t)\,x_n\Big), \qquad x \in \Omega, \eqno(6)$$

so that $f = \sup_{N \ge N_0} f_N$. Note that $f_N(x)$ depends only on the first $N$ coordinates of $x$. Since for each point $v = (x_1, \dots, x_N)$ in the finite-dimensional discrete cube $\{-1,1\}^N$ the supremum in (6) is asymptotically attained along some sequence of indices in $T_N$, one may choose a countable subset $T_N(v) \subset T_N$ over which the supremum in (6) is unchanged. Therefore, the set $T'_N = \cup_{v \in \{-1,1\}^N}\, T_N(v)$ is also countable, is contained in $T_N$, and by (6),

$$f_N(x) = \sup_{t \in T'_N}\Big(a_0(t) + \sum_{n=1}^{N} a_n(t)\,x_n\Big), \qquad x \in \Omega.$$

As a result, the supremum in (4) may be restricted to the countable set $\cup_N\, T'_N$. Finally, let us note that $\Omega$ is compact, so the property b) is automatically satisfied when a) holds. This yields Lemma 2.
Proof of Theorem 1. According to Lemma 2, we need to show that the distributions of lower semi-continuous functions $f$ on $\{-1,1\}^\infty$ under the Bernoulli measure $P$ fill out the family of all one-sided distributions on $(-\infty, +\infty]$. In fact, it is enough to consider functions of the special form $f(x) = \varphi(Q(x))$, where

$$Q(x) = \sum_{n=1}^{\infty} \frac{1 + x_n}{2^{n+1}}, \qquad x = (x_1, x_2, \dots) \in \{-1,1\}^\infty,$$

and where $\varphi : [0,1] \to (-\infty, +\infty]$ is an arbitrary non-decreasing, left (or, equivalently, lower semi-) continuous function. It is allowed that at some point $p \in [0,1]$, $\varphi$ jumps to the value $+\infty$, and then we require that $\lim_{s \to p} \varphi(s) = +\infty$, as part of the lower semi-continuity assumption.
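As a quick numerical sanity check (an illustration only, not part of the argument), one can verify that truncations of the binary-expansion map $Q(x) = \sum_{n \ge 1} (1+x_n)/2^{n+1}$ send independent random signs to approximately uniform values on $[0,1]$; the truncation level and sample size below are arbitrary choices.

```python
# Sanity check (illustration only): the map Q(x) = sum_{n>=1} (1+x_n)/2^(n+1)
# reads off binary digits b_n = (1+x_n)/2, so it pushes the product Bernoulli
# measure on {-1,1}^infty forward to Lebesgue measure on [0,1].
# We truncate to 30 coordinates and compare empirical frequencies.
import random

def Q(signs):
    # (1 + s)/2 is the n-th binary digit; the term equals digit / 2^n
    return sum((1 + s) / 2 ** (n + 1) for n, s in enumerate(signs, start=1))

random.seed(0)
vals = [Q([random.choice((-1, 1)) for _ in range(30)]) for _ in range(20000)]
for u in (0.1, 0.5, 0.9):
    emp = sum(v <= u for v in vals) / len(vals)
    assert abs(emp - u) < 0.02  # matches the uniform distribution function
```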
The map $Q$ is continuous and pushes $P$ forward to the normalized Lebesgue measure $\lambda$ on the unit interval $[0,1]$. Hence, $f$ is lower semi-continuous, and its distribution under $P$ coincides with the distribution of $\varphi$ under $\lambda$. It remains to see that, for any one-sided probability measure $\mu$ on $(-\infty, +\infty]$, there is an admissible $\varphi$ with the distribution $\mu$ under $\lambda$. Let us recall the standard argument (cf. e.g. [Bi], Theorem 14.1). Introduce the distribution function $F(u) = \mu((-\infty, u])$, $-\infty < u \le +\infty$, and define its "inverse"

$$\varphi(s) = \min\{u : F(u) \ge s\}, \qquad 0 < s \le 1.$$

This function is non-decreasing and left-continuous, and, for all finite $u$, $\varphi(s) \le u$ if and only if $s \le F(u)$, so that $\lambda\{s \in (0,1] : \varphi(s) \le u\} = F(u)$.
Thus, $\varphi$ has the distribution function $F$ under $\lambda$. The proof is now complete.
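This quantile construction can be sketched numerically. In the snippet below, $\mu$ is a hypothetical one-sided discrete measure with atoms at $0, 1, 2$ of masses $0.5, 0.3, 0.2$ (an arbitrary choice for illustration); the generalized inverse $\varphi$ pushes Lebesgue measure on $(0,1]$ forward to $\mu$.

```python
# Sketch of the quantile construction (illustration only): the generalized
# inverse phi(s) = min{u : F(u) >= s} pushes Lebesgue measure on (0,1]
# forward to mu.  Here mu is a hypothetical one-sided discrete measure
# with atoms at 0, 1, 2 of masses 0.5, 0.3, 0.2.
import random

atoms = [(0.0, 0.5), (1.0, 0.3), (2.0, 0.2)]

def F(u):
    # distribution function of mu
    return sum(p for a, p in atoms if a <= u)

def phi(s):
    # min{u : F(u) >= s}; for a discrete mu the minimum sits at an atom
    return min(a for a, _ in atoms if F(a) >= s)

random.seed(2)
sample = [phi(1.0 - random.random()) for _ in range(20000)]  # s uniform in (0,1]
for u, mass in ((0.0, 0.5), (1.0, 0.8), (2.0, 1.0)):
    emp = sum(v <= u for v in sample) / len(sample)
    assert abs(emp - mass) < 0.02  # empirical distribution function matches F
```

Note that $\varphi$ here is indeed non-decreasing and left-continuous, i.e., admissible in the sense of the proof above.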
Remark 3. The statement of Theorem 1 remains valid for arbitrary independent random variables $\xi_n$ taking two values, say, $a_n$ and $b_n$ with probabilities $p_n$ and $q_n$, satisfying

$$\prod_{n=1}^{\infty} \max\{p_n, q_n\} = 0.$$
In this case, the joint distribution $P$ of the $\xi_n$'s represents a product probability measure on $\prod_{n=1}^{\infty} \{a_n, b_n\}$ without atoms. Let $a_n = -1$ and $b_n = 1$ (without loss of generality). Then the map $Q$ from the proof of Theorem 1 pushes $P$ forward to a non-atomic probability measure $\lambda$ on $[0,1]$, and a similar argument works.

Remark 4. It is a well-known general fact that $M$ can be represented as a pointwise supremum $M = \sup_n X(t_n)$ a.s. for some sequence $t_n$ in $T$ (cf. e.g. [K-A]). In particular, the supremum in (2) may always be taken over all $t$'s from a countable subset of $T$.