SHARP MAXIMAL INEQUALITY FOR MARTINGALES AND STOCHASTIC INTEGRALS

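Let $X = (X_t)_{t \ge 0}$ be a martingale and $H = (H_t)_{t \ge 0}$ be a predictable process taking values in $[-1, 1]$. Let $Y$ denote the stochastic integral of $H$ with respect to $X$. We show that
$$\Big\| \sup_{t \ge 0} Y_t \Big\|_1 \le \beta_0 \Big\| \sup_{t \ge 0} |X_t| \Big\|_1,$$
where $\beta_0 = 2.0856\ldots$ is the best possible. Furthermore, if, in addition, $X$ is nonnegative, then
$$\Big\| \sup_{t \ge 0} Y_t \Big\|_1 \le \beta_0^+ \Big\| \sup_{t \ge 0} X_t \Big\|_1,$$
where $\beta_0^+ = 14/9$ is the best possible.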


Introduction
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a complete probability space, filtered by a nondecreasing right-continuous family $(\mathcal{F}_t)_{t \ge 0}$ of sub-$\sigma$-fields of $\mathcal{F}$. Assume that $\mathcal{F}_0$ contains all the events of probability 0. Suppose $X = (X_t)_{t \ge 0}$ is an adapted real-valued right-continuous semimartingale with left limits. Let $Y$ be the Itô integral of $H$ with respect to $X$, where $H$ is a predictable process with values in $[-1, 1]$. Let $\|Y\|_1 = \sup_{t \ge 0} \|Y_t\|_1$ and $X^* = \sup_{t \ge 0} X_t$.
The main interest of this paper is in the comparison of the sizes of $Y^*$ and $|X|^*$. Let us first describe two related results from the literature. In [4], Burkholder introduced a method of proving maximal inequalities for martingales and obtained the following sharp estimate.
Theorem 1. If $X$ is a martingale and $Y$ is as above, then we have
$$\|Y\|_1 \le \gamma \, \big\| |X|^* \big\|_1, \qquad (1)$$
where $\gamma = 2.536\ldots$ is the unique solution of the equation
$$\gamma - 3 = -\exp\Big(\frac{1 - \gamma}{2}\Big).$$
The constant is the best possible.
It was then proved by the author in [5] that if $X$ is positive, then the optimal constant $\gamma$ in (1) equals $2 + (3e)^{-1} = 2.1226\ldots$. We study here a related estimate, with $Y$ replaced by its one-sided supremum:
$$\|Y^*\|_1 \le \beta \, \big\| |X|^* \big\|_1. \qquad (2)$$
Let $\beta_0 = 2.0856\ldots$ be the positive solution to the equation
and $\beta_0^+ = 14/9 = 1.555\ldots$. The main result of the paper can be stated as follows.
Theorem 2. (i) If $X$ is a martingale and $Y$ is as above, then (2) holds with $\beta = \beta_0$ and the inequality is sharp.
(ii) If $X$ is a nonnegative martingale and $Y$ is as above, then (2) holds with $\beta = \beta_0^+$ and the constant is the best possible.
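The numerical values of the constants quoted above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the defining equation for $\gamma$ is taken here to be $\gamma - 3 = -\exp((1 - \gamma)/2)$, which is an assumption that should be checked against [4], and the helper names `bisect` and `gamma_equation` are hypothetical.

```python
import math


def bisect(func, lo, hi, tol=1e-12):
    """Locate a root of func in [lo, hi] by bisection (func must change sign)."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


def gamma_equation(g):
    # Assumed form of the equation for gamma: g - 3 + exp((1 - g) / 2) = 0.
    return g - 3.0 + math.exp((1.0 - g) / 2.0)


gamma = bisect(gamma_equation, 2.0, 3.0)      # constant in (1), martingale case
gamma_positive = 2.0 + 1.0 / (3.0 * math.e)   # constant in (1) for positive X
beta_plus = 14.0 / 9.0                        # constant for nonnegative X

print(gamma, gamma_positive, beta_plus)
```

The computed values agree with the truncated decimals $2.536\ldots$, $2.1226\ldots$ and $1.555\ldots$ given in the text.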
As usual, to prove this theorem, it suffices to establish its discrete-time version (by a standard approximation argument due to Bichteler [1]; for details, see e.g. [2]). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, equipped with a filtration $(\mathcal{F}_n)_{n \ge 0}$. Let $f = (f_n)_{n \ge 0}$ be an adapted sequence of integrable variables and $g = (g_n)_{n \ge 0}$ be its transform by a predictable sequence $v = (v_n)_{n \ge 0}$ bounded in absolute value by 1. That is, for any $n = 0, 1, 2, \ldots$ we have
$$f_n = \sum_{k=0}^{n} df_k, \qquad g_n = \sum_{k=0}^{n} v_k \, df_k,$$
where $df_0 = f_0$ and $df_k = f_k - f_{k-1}$ for $k \ge 1$. By predictability of $v$ we mean that $v_0$ is $\mathcal{F}_0$-measurable (and hence deterministic) and for any $k \ge 1$, $v_k$ is measurable with respect to $\mathcal{F}_{k-1}$. In the special case when each $v_k$ is deterministic and takes values in $\{-1, 1\}$ we will say that $g$ is a ±1 transform of $f$. Let $f_n^* = \max_{k \le n} f_k$ and $f^* = \sup_k f_k$. A discrete-time version of Theorem 2 is the following.
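To make the notion of a transform concrete, the following sketch enumerates all paths of a fair ±1 random walk $f$ started at $f_0 = 1$, builds its transform $g$ by one sample predictable sequence $v$, and computes the ratio $\mathbb{E} g_n^* / \mathbb{E} |f|_n^*$ exactly. The walk, the rule defining $v$, and the name `transform_ratio` are illustrative assumptions and are not taken from the paper; the example only shows one pair $(f, g)$ and says nothing about sharpness.

```python
from itertools import product

BETA_0 = 2.0856  # truncated numerical value of beta_0 quoted in the text


def transform_ratio(n_steps, f0=1.0):
    """Exact E[g*_n] / E[|f|*_n] for a fair +-1 walk f and a sample transform g."""
    total_g_max = 0.0
    total_f_abs_max = 0.0
    for steps in product((-1.0, 1.0), repeat=n_steps):
        f = f0
        g = f0                      # v_0 = 1, so g_0 = v_0 * f_0
        g_max, f_abs_max = g, abs(f)
        for d in steps:
            # v_k is chosen from f_{k-1} only, so it is predictable.
            v = 1.0 if f <= 0.0 else -1.0
            f += d                  # f_k = f_{k-1} + df_k
            g += v * d              # g_k = g_{k-1} + v_k * df_k
            g_max = max(g_max, g)
            f_abs_max = max(f_abs_max, abs(f))
        total_g_max += g_max
        total_f_abs_max += f_abs_max
    # All 2**n_steps paths are equally likely, so the common factor cancels.
    return total_g_max / total_f_abs_max


ratio = transform_ratio(10)
print(ratio, ratio <= BETA_0)
```

Replacing the rule for `v` by any other function of the past with values in $[-1, 1]$ gives another admissible transform of the same walk.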
Theorem 3. Assume that $f$, $g$ are as above.
(i) If $f$ is a martingale, then
$$\|g^*\|_1 \le \beta_0 \, \big\| |f|^* \big\|_1, \qquad (3)$$
and the constant $\beta_0$ is the best possible.
(ii) If $f$ is a nonnegative martingale, then
$$\|g^*\|_1 \le \beta_0^+ \, \|f^*\|_1, \qquad (4)$$
and the constant $\beta_0^+$ is the best possible.
A few words about the organization of the paper. The proof of Theorem 3 is based on Burkholder's technique, which reduces the problem of proving a martingale inequality to finding a certain special function. The description of this technique can be found in Section 2. Then, in the following two sections, we provide the special functions corresponding to (3) and (4) and study their properties. In the last section we complete the proofs of Theorem 2 and Theorem 3 by showing that the constants $\beta_0$ and $\beta_0^+$ cannot be replaced by smaller ones.

Burkholder's method
Throughout this section we deal with the discrete-time setting. Let us start with some standard reductions. Assume $f$, $g$ are as in the statement of Theorem 3. With no loss of generality we may assume that the process $f$ is simple: for any integer $n$ the random variable $f_n$ takes only a finite number of values and there exists a number $N$ such that $f_N = f_{N+1} = \ldots$ with probability 1. Furthermore, it suffices to prove Theorem 3 for ±1 transforms. To see this, let us consider the following version of Lemma A.1 from [3]. The proof is identical to that in the original setting and hence it is omitted.
Suppose we have established Theorem 3 for ±1 transforms and let $\beta$ denote $\beta_0$ or $\beta_0^+$, depending on whether $f$ is a martingale or a nonnegative martingale. Lemma 1 gives us the processes $F^j$ and the functions $\varphi_j$, $j \ge 1$. Conditionally on $\mathcal{F}_0$, for any $j \ge 1$ the sequence $\varphi_j(v_0) G^j$ is a ±1 transform of $F^j$ and hence we may write
The final reduction is that it suffices to prove that for any integer $n$ we have
To establish the above estimate, consider the following general problem. Let $D = \mathbb{R} \times \mathbb{R} \times (0, \infty) \times \mathbb{R}$ and $V : D \to \mathbb{R}$ be a Borel function. Suppose we want to prove the inequality
for any integer $n$, any martingale $f$ and $g$ being its ±1 transform.
The key idea is to study the family $\mathcal{U}(V)$ of all functions $U : D \to \mathbb{R}$ satisfying the following properties.
and, furthermore,
The relation between the class $\mathcal{U}(V)$ and the estimate (6) is described in the following theorem. It is a simple modification of Theorems 2.2 and 2.3 in [4] (see also Section 11 in [2] and Theorem 2.1 in [3]). We omit the proof.

Theorem 4. The inequality (6) holds for all $n$ and all pairs $(f, g)$ as above if and only if the class $\mathcal{U}(V)$ is nonempty. Furthermore, if $\mathcal{U}(V)$ is nonempty, then there exists the least element in $\mathcal{U}(V)$, given by
Here the supremum runs over all the pairs $(f, g)$, where $f$ is a simple martingale, $\mathbb{P}((f_0, g_0) = (x, y)) = 1$ and $dg_k = \pm df_k$ almost surely for all $k \ge 1$.

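Theorem 5. The inequality (6) holds for all $n$ and all pairs $(f, g)$ as above if and only if the class $\mathcal{U}^+(V)$ is nonempty. Furthermore, if $\mathcal{U}^+(V)$ is nonempty, then there exists the least element in $\mathcal{U}^+(V)$, given by
Here the supremum runs over all the pairs $(f, g)$, where $f$ is a simple nonnegative martingale, $\mathbb{P}((f_0, g_0) = (x, y)) = 1$ and $dg_k = \pm df_k$ almost surely for all $k \ge 1$.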
Let us now turn to (3) and assume, from now on, that the function $V$ is given by
where $\beta > 0$ is a fixed number. Denote by $\mathcal{U}(\beta)$, $\mathcal{U}^+(\beta)$ the classes $\mathcal{U}(V)$, $\mathcal{U}^+(V)$ corresponding to this choice of $V$. The purpose of the next two sections is to show that the classes $\mathcal{U}(\beta_0)$ and $\mathcal{U}^+(\beta_0^+)$ are nonempty. This will establish the inequalities (3) and (4).

The special function: a general case
We start with the class $\mathcal{U}(\beta_0)$. Let us introduce an auxiliary parameter. The equation
has a unique solution $a = 0.46986\ldots$, related to $\beta_0$ by the identity
Let $S$ denote the strip $[-1, 1] \times (-\infty, 0]$ and consider the following subsets of $S$.
Introduce the special function $u : S \to \mathbb{R}$ by
A function defined on the strip $S$ is said to be diagonally concave if it is concave on the intersection of $S$ with any line of slope 1 or −1. We have the following fact.
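The definition of diagonal concavity can be probed numerically. The sketch below tests the second-difference condition $u(x + h, y \pm h) - 2u(x, y) + u(x - h, y \mp h) \le 0$ on a grid inside a truncated piece of the strip $S$. It is a necessary-condition check only, not a proof, and the function name `is_diagonally_concave`, the truncation level `y_min`, and the two test functions are illustrative assumptions rather than objects from the paper.

```python
def is_diagonally_concave(u, h=0.05, y_min=-3.0, tol=1e-9):
    """Grid check of concavity along lines of slope 1 and -1.

    The strip [-1, 1] x (-inf, 0] is truncated to [-1, 1] x [y_min, 0].
    """
    n_x = int(round(2.0 / h))
    n_y = int(round(-y_min / h))
    for i in range(1, n_x):            # keep x - h and x + h inside [-1, 1]
        x = -1.0 + i * h
        for j in range(1, n_y):        # keep y - h and y + h inside [y_min, 0]
            y = -j * h
            for slope in (1.0, -1.0):
                second_diff = (u(x + h, y + slope * h)
                               - 2.0 * u(x, y)
                               + u(x - h, y - slope * h))
                if second_diff > tol:
                    return False
    return True


# Diagonally concave but not concave: it is convex in x for fixed y.
print(is_diagonally_concave(lambda x, y: x * x - 3.0 * y * y))
# Convex along lines of slope 1, so not diagonally concave.
print(is_diagonally_concave(lambda x, y: x * y))
```

The first test function illustrates that diagonal concavity is strictly weaker than concavity: along either diagonal direction its second derivative is $2 - 6 = -4$, although it is convex in $x$.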

Lemma 2. The function $u$ has the following properties.
u is diagonally concave.
Proof. It is easy to check that $u$ is of class $C^1$ in the interior of $S$. Now the condition (15) is apparent and hence so is (16). To see that (17) holds, note that
We have
Lemma 3. The function $U$ belongs to $\mathcal{U}(\beta_0)$.
Proof. The conditions (7) and (8) follow from the definition of $U$. The inequality (9) is equivalent to $u \ge -\beta_0$ on the whole strip $S$, an estimate which follows directly from (16), (17) and (18).
Φ is continuous, The property (20) which is concave. In addition, one-sided derivatives of Φ match at − y and we are done.
To prove (23), note that the limit on the left equals $-u(1, -1) = 1 + 2a$, while the one on the right equals
and the estimate is satisfied.
Finally, let us turn to (24). The limit on the left is equal to $1 - 3a$, due to (25). If $-x + y \ge -1 - \beta_0$, then the limit on the right is also $1 - 3a$; for $-x + y \le -1 - \beta_0$ the inequality (24) becomes
which is a consequence of the fact that the right-hand side is a nonincreasing function of $y$ and both sides are equal for $-x + y = -1 - \beta_0$ (see (13) and (14)).

The special function in the nonnegative case
Let $S^+$ denote the strip $[0, 1] \times (-\infty, 0]$ and let
Introduce the function $u^+ : S^+ \to \mathbb{R}$ by
Here is the analogue of Lemma 2.

Lemma 4. The function $u^+$ has the following properties.
Proof. It is not difficult to check that $u^+$ has continuous partial derivatives in the interior of $S^+$. Now the properties (26) and (27) are easy to see. To show (28) observe that the function $u^+(\cdot, 0)$ is concave on $[0, 1]$ and $u^+(0, 0) = -\beta_0^+ < u^+(1, 0)$. Finally, it is obvious that $u^+$ is concave along the lines of slope 1 on $D_1 \cup D_4$, and along the lines of slope −1 on
which gives $F_+''(0) \le 0$. This completes the proof.
Now we define the special function $U^+ : D^+ \to \mathbb{R}$ by the same formula as in (19), namely
The following is the analogue of Lemma 3.

Optimality of the constants
In this section we prove that the constants appearing in (3) and (4) are the least possible. This clearly implies that the inequalities in Theorem 2 are also sharp.
The constant $\beta_0$ is optimal in (3). Suppose the inequality (5) is valid for all martingales $f$ and their ±1 transforms $g$. By Theorem 4, the class $\mathcal{U}(\beta)$ is nonempty; let $U_0$ denote its minimal element. By definition, this function enjoys the following properties.
Introduce the functions A, B : For the convenience of the reader, the proof is split into a few parts.
As previously, we divide the proof into a few intermediate steps.