A ratio inequality for nonnegative martingales and their differential subordinates

We prove a sharp inequality for the ratio of the maximal functions of nonnegative martingales and their differential subordinates. An application in the theory of weighted inequalities is given.


Introduction
Suppose that $(\Omega, \mathcal{F}, P)$ is a complete probability space, filtered by a nondecreasing right-continuous family $(\mathcal{F}_t)_{t\ge 0}$ of sub-$\sigma$-fields of $\mathcal{F}$, such that $\mathcal{F}_0$ contains all the events of probability 0. Let $X$, $Y$ be two adapted martingales taking values in a certain separable Hilbert space $H$ (which can be taken to be $\ell^2$), with norm $|\cdot|$ and scalar product denoted by the dot $\cdot$. We impose the usual regularity properties on the trajectories of $X$ and $Y$: the paths are assumed to be right-continuous and have limits from the left. Then $[X, X]$ will denote the quadratic covariance process of $X$: that is, we set $[X, X] = \sum_{n=1}^{\infty} [X^n, X^n]$, where $X^n$ is the $n$-th coordinate of $X$ and $[X^n, X^n]$ is the usual square bracket of the real-valued martingale $X^n$ (see e.g. Dellacherie and Meyer [4] for details). In our considerations, $X^* = \sup_{t\ge 0} |X_t|$ will denote the maximal function of $X$; we will also use the notation $X^*_t = \sup_{0\le s\le t} |X_s|$ and $\|X\|_p = \sup_{t\ge 0} \|X_t\|_{L^p}$, $1 \le p \le \infty$.

Throughout the paper we will assume that the process $Y$ is dominated by $X$ in the sense of the so-called differential subordination: following Bañuelos and Wang [1] and Wang [17], we say that $Y$ is differentially subordinate to $X$ if the process $([X, X]_t - [Y, Y]_t)_{t\ge 0}$ is nondecreasing and nonnegative as a function of $t$. The origins of this domination principle go back to the works of Burkholder [3] in the discrete-time case: a martingale $g = (g_n)_{n\ge 0}$ is differentially subordinate to $f = (f_n)_{n\ge 0}$ if for any $n \ge 0$ we have $|dg_n| \le |df_n|$ almost surely. Here $df = (df_n)_{n\ge 0}$, $dg = (dg_n)_{n\ge 0}$ are the difference sequences of $f$ and $g$, respectively, uniquely determined by the requirement $f_n = \sum_{k=0}^{n} df_k$ and $g_n = \sum_{k=0}^{n} dg_k$, $n = 0, 1, 2, \ldots$.
This is a special case of the continuous-time differential subordination: simply treat the two given martingales $f$, $g$ as continuous-time processes (via $X_t = f_{\lfloor t\rfloor}$ and $Y_t = g_{\lfloor t\rfloor}$, $t \ge 0$). Then the inequality $|df| \ge |dg|$ implies that $[X, X] - [Y, Y]$ enjoys the required properties.

* University of Warsaw, Poland. E-mail: ados@mimuw.edu.pl
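To make the embedding concrete, here is a short worked computation. It is only a sketch, and it assumes the piecewise-constant embedding $X_t = f_{\lfloor t\rfloor}$, $Y_t = g_{\lfloor t\rfloor}$ (the floor notation is our assumption) together with the convention $[X,X]_0 = |X_0|^2$. For such pure-jump paths the square bracket is the sum of the squared jumps, so the difference of the brackets is a sum of nonnegative terms:

```latex
\[
[X,X]_t - [Y,Y]_t
  = \sum_{k=0}^{\lfloor t\rfloor} \bigl(|df_k|^2 - |dg_k|^2\bigr),
  \qquad t \ge 0 .
\]
% Each summand is nonnegative because |dg_k| \le |df_k| almost surely,
% so the difference is nonnegative and nondecreasing in t.
```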
Inequalities for martingales and their differential subordinates (in the discrete and continuous time) have been studied very intensively and have found numerous and deep applications in harmonic analysis. The literature on the subject is extremely vast and it is impossible to give even a brief review here. See e.g. [1,2,3,7,13,16,17] and the references therein. We will just present a few results which serve as a motivation for our research. A celebrated result of Burkholder [3,17] gives the following sharp L p estimate.
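Theorem 1.1. Suppose that $Y$ is differentially subordinate to $X$. Then for any $1 < p < \infty$ we have
$$\|Y\|_p \le (p^* - 1)\|X\|_p,$$
where $p^* = \max\{p, p/(p-1)\}$. The constant is the best possible, even if $H = \mathbb{R}$.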
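As a quick illustration of the size of the quantity $p^* - 1$ appearing in Burkholder's estimate, the following values follow directly from the definition of $p^*$ (the choice of exponents is ours and is purely illustrative):

```latex
\[
p = 2:\ \ p^* - 1 = 1, \qquad
p = 4:\ \ p^* = \max\{4, \tfrac43\} = 4,\ \ p^* - 1 = 3, \qquad
p = \tfrac43:\ \ p^* = \max\{\tfrac43, 4\} = 4,\ \ p^* - 1 = 3 .
\]
% The constant is symmetric under p <-> p/(p-1), i.e. under passing to the conjugate exponent.
```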
For p = 1, the above inequality fails to hold, but we have the corresponding weak-type (1, 1) estimate. In fact, there is a result for a wider range of parameters p, proved by Burkholder [3] for 1 ≤ p ≤ 2 and Suh [16] for p > 2. See also Wang [17].
Both inequalities are sharp, even if H = R.
There are also logarithmic, exponential, mixed-norm and other results in this direction. Moreover, there is a powerful method which enables the identification of the best constants involved: roughly speaking, it reduces the problem to the construction of a certain special function satisfying appropriate size and concavity requirements; see [13].
We will continue the above line of research and study a novel class of estimates between X and Y . Observe that the inequalities of Theorems 1.1 and 1.2 can be regarded as comparisons of sizes of X and Y measured separately in terms of strong or weak L p norms and, as such, they do not say anything about the joint behavior of X and Y . From this point of view, a very natural functional to study is the ratio |Y |/|X|. For example, for a given time t > 0 and a level λ > 0, is there any nontrivial upper bound for the probability P(|Y t | > λ|X t |)? No. It is easy to construct, for any t and λ, a pair (X, Y ) of real-valued martingales satisfying the differential subordination such that this probability is as close to 1 as we wish. Indeed, for λ < 1 the constant pair (1, 1) does the job, while for λ ≥ 1 we take a two-dimensional Brownian motion (X, Y ) started at (1, 1) and stopped upon reaching the set {(x, y) : |y| = 2λ|x|}. Then the differential subordination is satisfied and, rescaling time if necessary, we can ensure that the probability P(|Y t | = 2λ|X t |) is arbitrarily close to 1. The above problem becomes more interesting if the maximal functions of X and Y are involved. Our first result is negative, even in the discrete time.
Theorem 1.3. For any λ > 0 and any η < 1 there is a pair (f, g) of discrete-time, real valued martingales starting from 0 such that g is differentially subordinate to f and P(g * > λf * ) ≥ η.
However, if we assume that the dominating martingale is nonnegative, there is a nontrivial bound. Here is the precise statement.

Theorem 1.4. Suppose that $X$ is a nonnegative martingale and $Y$ is an $\ell^2$-valued martingale which is differentially subordinate to $X$. Then for any $\lambda > 0$ we have
$$P(Y^* > \lambda X^*) \le \min\{1, e^{1-\lambda/2}\}. \qquad (1.2)$$
For any $\lambda$, the constant on the right cannot be improved, even in the discrete-time case.
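To give a feeling for the size of the bound $e^{1-\lambda/2}$ that appears in the tail estimate proved in Section 3, here are a few numerical values (the sample values of $\lambda$ are our choice; the decimals are rounded):

```latex
\[
e^{1-\lambda/2}\Big|_{\lambda=2} = 1, \qquad
e^{1-\lambda/2}\Big|_{\lambda=4} = e^{-1} \approx 0.368, \qquad
e^{1-\lambda/2}\Big|_{\lambda=10} = e^{-4} \approx 0.018 .
\]
% For \lambda \le 2 the expression is at least 1, so the bound carries information only for \lambda > 2.
```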
Our approach will exploit Burkholder's method: the inequality (1.2) will be deduced from the properties of a certain special function constructed in the next section. Section 3 contains the proofs of Theorems 1.3 and 1.4. The final part of the paper is devoted to some applications of the estimate (1.2) in the theory of weighted inequalities.

A special function and its properties
Throughout this section, we assume that $\lambda > 2$ is a fixed parameter and $d$ is a fixed dimension. Distinguish the following four sets: It is easy to check that the function $u$ is continuous on so the formulas for $u$ on $D_1$ and $D_2$ match appropriately on the common boundary. One checks analogously that $u$ is continuous at each point from $\partial D_2 \cap \partial D_3$ and $(\partial D_3 \cap \partial D_4) \setminus \{(x, y) : |y| = \lambda\}$. This gives the aforementioned continuity, since $\partial D_i \cap \partial D_j = \emptyset$ for other choices of $i$ and $j$. A similar calculation shows that $u$ is of class $C^1$ on x + |y| = λ}: the partial derivatives of $u$ match at the common boundaries of the sets $D_1$, $D_2$ and $D_3$. In our further considerations we will need to extend these partial derivatives to the whole Here and in what follows, we use the notation $y' = y/|y|$ if $y \in \mathbb{R}^d \setminus \{0\}$ and $0' = 0$.
Further important properties are studied in a sequence of lemmas below.
Proof. The first inequality is trivial: To show the majorization (2.2), fix $x \in [0, 1]$ and observe first that the number $u(x, 0) = e^{1-\lambda/2}(-x^2/4 + x)$ lies between 0 and 1. Furthermore, it follows directly from the formula for $u$ that the function $y \mapsto u(x, y)$ increases (precisely: does not decrease) as $|y|$ increases, and it is equal to 1 for $|y| \ge \lambda - x$. This yields the claim.
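The claim that $u(x, 0)$ lies between 0 and 1 can be checked in one line. The computation below is a sketch that uses only the displayed formula for $u(x,0)$ and the standing assumption $\lambda > 2$:

```latex
\[
0 \le x - \tfrac{x^2}{4} = x\bigl(1 - \tfrac{x}{4}\bigr) \le \tfrac34
\quad (x \in [0,1]),
\qquad
0 < e^{1-\lambda/2} < 1 \quad (\lambda > 2),
\]
\[
\text{hence}\qquad
0 \le u(x,0) = e^{1-\lambda/2}\Bigl(x - \tfrac{x^2}{4}\Bigr)
  \le \tfrac34\, e^{1-\lambda/2} < 1 .
\]
% The map x -> x(1 - x/4) is increasing on [0,2], so its maximum on [0,1] is attained at x = 1.
```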
Let us now establish an important bound for the Hessian matrix of u.
Both these expressions are nonpositive, so the claim follows.
We will also need the following auxiliary estimate.
Proof. We consider separately three cases.
Case I. |y| > λ − 1. Then the estimate is obvious: the right-hand side is equal to 1, while the left-hand side is at most one (see the left inequality in (2.2)).
the point $(1, (y + k)/(x + h))$ belongs to $D_1$ or to $D_2$, depending on whether $|y + k|/(1 + h)$ is less than 1 or not. If this point lies in $D_1$, the inequality (2.4) can be rewritten as and the left-hand side is not bigger than 1, while the right-hand side is at least and again: the left-hand side does not exceed 1 (see (2.5) above), while the expression on the right is 1 or more.

Case III. $|y| \le 1$. We observe that Clearly, if we increase $h$, then the left-hand side decreases and the right-hand side increases; therefore, it suffices to show the estimate for the smallest value of $h$, i.e., $h = |k|$. Dividing by $|k|$ and rearranging terms, we see that it is enough to show that $-|y + k|^2(|k| + 2) \le (1 + |k|)^2(2 - |k|)$.
Combining the two lemmas above, we get the following bound.

Corollary 2.4.
For any $(x, y) \in [0, 1] \times \mathbb{R}^d$, any $h \in \mathbb{R}$ and any $k \in \mathbb{R}^d$ such that $x + h \ge 0$ and $|k| \le |h|$, we have (2.6)

Proof. We may assume that $h \ne 0$, since otherwise the claim is obvious. Consider the function $G$ given by the formula defined for those $t \in \mathbb{R}$ for which $x + th \ge 0$. Then the inequality (2.6) is equivalent to $G(1) \le G(0) + G'(0)$ (for the points $(x, y)$ at which $u$ is not differentiable, $G'(0)$ is a one-sided derivative). On the set of those $t$ for which $x + th \in [0, 1]$ (i.e., for $t$ lying between $-x/h$ and $(1 - x)/h$), the function $G$ is concave. To see this, note that for these particular $t$ we have the equality $G(t) = u(x + th, y + tk)$. Now, if $(x + th, y + tk)$ lies in the interior of some $D_i$, then $G$ is twice differentiable and by (2.3), On the other hand, if $(x + th, y + tk)$ lies at the common boundary of two sets $D_i$ and $D_{i+1}$, then $G'(t-) \ge G'(t+)$: indeed, we have equality for $i = 1, 2$ (since $u$ is of class $C^1$ in the interior of $D_1 \cup D_2 \cup D_3$), while for $i = 3$ this follows from the fact that $u(x, y) = 1$ on $D_4$ and $u(x, y) \le 1$ on $D_3$. Putting these observations together, we get the aforementioned concavity of $G$ and hence if $x + h \le 1$, the claim follows. If $x + h > 1$, then in particular $h > 0$ and the concavity of However, the expression on the right is not smaller than the left-hand side of (2.6): this is guaranteed by the inequality (2.4). Indeed, set Then $|\tilde k| \le \tilde h$ and hence By (2.7), this is bounded from above by $G(0) + G'(0)$ and the claim is proved.
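The step from concavity to the bound $G(1) \le G(0) + G'(0)$ is the usual tangent-line inequality. The following sketch covers only the case $x + h \le 1$ and assumes that $G$ is concave and continuous on an interval containing $[0, 1]$, with $G'(0)$ understood as the right derivative $G'(0+)$:

```latex
\[
G(1) - G(0) = \int_0^1 G'(t+)\,dt \le \int_0^1 G'(0+)\,dt = G'(0+),
\]
% The inequality holds because the right derivative t -> G'(t+) of a concave function is nonincreasing.
```

The one-sided inequality $G'(t-) \ge G'(t+)$ at the boundary points is exactly what keeps $t \mapsto G'(t+)$ nonincreasing across the interfaces between the sets $D_i$.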
The final property of u is the following.

Proof of (1.2)
We start with recalling a simple fact from stochastic analysis [4]. Namely, for any martingale $X$ there exists a unique continuous local martingale part $X^c$ of $X$ satisfying
$$[X, X]_t = |X_0|^2 + [X^c, X^c]_t + \sum_{0 < s \le t} |\Delta X_s|^2$$
for all $t \ge 0$. Furthermore, the bracket $[X^c, X^c]$ coincides with $[X, X]^c$, the pathwise continuous part of $[X, X]$. We will also require the following statement [17, Lemma 1].

In the proof of (1.2), we may assume that $\lambda > 2$, since otherwise the claim is obvious. Suppose that $X$, $Y$ are two martingales such that $X$ is nonnegative, $Y$ takes values in a Hilbert space $H = \ell^2$ and is differentially subordinate to $X$. We may restrict ourselves to $X$ which is bounded away from 0, by adding a deterministic number $\varepsilon > 0$ to this process and letting $\varepsilon \to 0$ at the end (such a modification does not affect the differential subordination). For a fixed small positive number $\eta$, take the stopping time x ≤ z} and let $U : D \to \mathbb{R}$ be given by

Before we proceed to the formal proof (which involves some additional technical arguments), let us briefly sketch the idea. First we translate the properties of $u$ into the language of $U$. By (2.3), if $(x/z, y/z)$ lies in the interior of some $D_i$, then Furthermore, by (2.6), if $x \le z$ and $|k| \le |h|$, then we have Finally, Lemma 2.5 implies that for each $(x, y)$, the function $z \mapsto U(x, y, z)$ is nonincreasing on $[x, \infty)$.

(3.3)
The plan is to apply Itô's formula to the composition of $U$ with the stopped process $(X^\tau, Y^\tau, (X^\tau)^*)$. Then the three inequalities above imply that the resulting semimartingale is actually a supermartingale: the first inequality handles the second-order terms, (3.2) deals with the jump part and (3.3) enables the control of the stochastic integral with respect to $X^*$. Therefore, we obtain where the last bound is due to (2.1) and the differential subordination. Using the right inequality in (2.2), we obtain $P(|Y_\tau| \ge \lambda X^*_\tau) \le e^{1-\lambda/2}$ and it remains to note that

There is a small gap in the above reasoning, but the main idea goes along the path described above. The problem is that the application of Itô's formula is not permitted, since the function $U$ lacks the necessary regularity. To overcome this difficulty, we will make use of some additional mollifying arguments. Let $g : \mathbb{R} \times \mathbb{R}^d \times \mathbb{R} \to [0, \infty)$ be a $C^\infty$ function supported on the unit ball of $\mathbb{R} \times \mathbb{R}^d \times \mathbb{R}$ and satisfying $\int_{\mathbb{R} \times \mathbb{R}^d \times \mathbb{R}} g = 1$. For a fixed parameter $\delta > 0$, we define $U^\delta : D \to \mathbb{R}$ by the convolution
$$U^\delta(x, y, z) = \int_{[-1,1] \times [-1,1]^d \times [-1,1]} U(x + \delta + \delta r,\, y + \delta s,\, z + 4\delta + \delta t)\, g(r, s, t)\, dr\, ds\, dt.$$
The function $u$ is of class $C^1$ in the interior of the set $D_1 \cup D_2 \cup D_3$, so $U$ is of class $C^1$ inside the set $D = \{(x, y, z) : x + |y| < (\lambda - 1)z,\ x < z\}$. Therefore, x,y U (x + δ + δr, y + δs, z + 4δ + δt)g(r, s, t) drdsdt Moreover, the inequality (3.2) extends to $U^\delta$ without any modification (for any $(x, y, z) \in D$); the same happens to the property (3.3). Now, the function $U^\delta$ is of class $C^\infty$, so composing it with the process $S^\tau = (X^\tau, Y^\tau, (X^\tau)^*)$ we get, by Itô's formula, Here in $I_3$ we have used a shortened form for the sum of all second-order terms. Observe that the integrals in $I_2$ are with respect to the continuous part of the process $X^*$; this implies the lack of the term $U^\delta_z(S_{s-})\Delta X^*_s$ in $I_4$. By the properties of stochastic integrals, the term $I_1$ has zero expectation. By (3.3) we have $U^\delta_z \le 0$, so $I_2 \le 0$. To handle $I_3$, note that $(S_{t-})_{t\ge 0}$ takes values in the set $D$ (which follows directly from the form of the stopping time $\tau$). Hence, if we approximate $I_3$ by Riemann sums and apply (3.4), we get where in the last passage we have exploited the differential subordination of $Y^c$ to $X^c$ (see Lemma 3.1) and the inequality $c_\delta \ge 0$. The term $I_4$ is also nonpositive, since by (3.2) and the inequality $|\Delta Y_s| \le |\Delta X_s|$ (see Lemma 3.1 again), each of its summands is nonpositive.

Putting all the above facts together, we obtain the estimate $E U^\delta(X_{\tau\wedge T}, Y_{\tau\wedge T}, X^*_{\tau\wedge T}) \le E U^\delta(X_0, Y_0, X^*_0)$. Now the function $U^\delta$ is continuous and bounded by 1 (since $U$ has this property). Consequently, letting $T \to \infty$ we get $E U^\delta(X_\tau, Y_\tau, X^*_\tau) \le E U^\delta(X_0, Y_0, X^*_0)$, by Lebesgue's dominated convergence theorem. Now, the function $U$ is continuous on $\{(x, y, z) \in (0, \infty) \times \mathbb{R}^d \times (0, \infty) : x \le z\}$, so letting $\delta \to 0$, using the boundedness of $U$ and the fact that $X$ is bounded away from 0, we arrive at $E U(X_\tau, Y_\tau, X^*_\tau) \le E U(X_0, Y_0, X^*_0)$. As we have seen above, this implies the desired tail estimate $P(Y^* > \lambda X^*) \le e^{1-\lambda/2}$.
For a given integer $N > \lambda$, and for any $n = 0, 1, 2, \ldots, N$, set $f_n = \varepsilon_1 - \varepsilon_2 + \varepsilon_3 - \ldots + (-1)^{n+1}\varepsilon_n$ and $g_n = -\varepsilon_1 - \varepsilon_2 - \varepsilon_3 - \ldots - \varepsilon_n$. Furthermore, for $n > N$, put $f_n = f_N$ and $g_n = g_N$. Note that $f$ and $g$ are mean-zero martingales (which follows from the fact that the $\varepsilon_i$ are centered). Furthermore, we have $dg_n = (-1)^n df_n$, so the differential subordination is satisfied. Consider the event $A = \{\varepsilon_1 = \varepsilon_2 = \ldots = \varepsilon_N = -1\}$, on which we have $f_0 = 0$, $f_1 = -1$, $f_2 = 0$, $f_3 = -1$, $\ldots$ and $g_N = N$, so $f^* = 1$ and $g^* \ge N$. Consequently, $P(g^* > \lambda f^*) \ge P(A) = \left(\frac{M}{M+1}\right)^N$, and the latter expression can be made arbitrarily close to 1 by taking $M$ sufficiently large.
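The computation of $P(A)$ can be spelled out as follows. This is a sketch under the assumption, inferred from the value of $P(A)$ and from the centering requirement rather than quoted from the text, that the $\varepsilon_i$ are independent with $P(\varepsilon_i = -1) = M/(M+1)$ and $P(\varepsilon_i = M) = 1/(M+1)$:

```latex
\[
E\varepsilon_i = (-1)\cdot\frac{M}{M+1} + M\cdot\frac{1}{M+1} = 0,
\]
\[
P(A) = \prod_{i=1}^{N} P(\varepsilon_i = -1)
     = \Bigl(\frac{M}{M+1}\Bigr)^{N}
     = \Bigl(1 - \frac{1}{M+1}\Bigr)^{N}
     \ge 1 - \frac{N}{M+1} .
\]
% The last step is Bernoulli's inequality; for fixed N the lower bound tends to 1 as M -> infinity.
```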
Sharpness of (1.2). If $\lambda < 2$, we use the martingales $\tilde f = f + 1$ and $\tilde g = g + 1$, where $f$, $g$ come from the above example with $N = 1$. Then $\tilde f$ is nonnegative, $\tilde g$ is differentially subordinate to $\tilde f$ and we have $P(\tilde g^* > \lambda \tilde f^*) \ge P((\tilde f^*, \tilde g^*) = (1, 2)) = \frac{M}{M+1}$. It remains to note that the latter fraction tends to 1 as we let $M \to \infty$.

Now we will handle the case $\lambda \ge 2$. Pick an arbitrary $\lambda' > \lambda$, a huge positive number $M$, a huge positive integer $N$ and set $\delta = (\lambda' - 2)/(2N) > 0$. Consider the independent random variables $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{2N}$, whose two-point distributions are given as follows. If $n$ is odd, then $P(\varepsilon_n = -\delta) = 1 - P(\varepsilon_n = M) = \frac{M}{M+\delta}$; if $n$ is even, then $P(\varepsilon_n = \delta) = 1 - P(\varepsilon_n = \delta - 1) = 1 - \delta$. Finally, consider a variable $\varepsilon_{2N+1}$, independent of $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{2N}$, satisfying $P(\varepsilon_{2N+1} = -1) = 1 - P(\varepsilon_{2N+1} = M) = \frac{M}{M+1}$. Introduce the stopping time $\tau = \inf\{n : |\varepsilon_n| \ne \delta\}$. Since the variables $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{2N+1}$ are centered, the sequences $f_n = 1 + \sum_{k=1}^{\tau\wedge n} \varepsilon_k$, $g_n = 1 + \sum_{k=1}^{\tau\wedge n} (-1)^k \varepsilon_k$ ($n \ge 0$) are martingales; furthermore, we see that the differential subordination is satisfied.

Let us gain some intuition about the behavior of the pair $(f, g)$. It starts from the point $(1, 1)$; then it moves to $(1 + M, 1 - M)$ or to $(1 - \delta, 1 + \delta)$. If the first scenario occurs, the pair stops for good; otherwise the movement is continued and in the second step the pair jumps to $(0, 2\delta)$ or to $(1, 1 + 2\delta)$. If the first possibility takes place, the pair terminates; otherwise it continues its evolution, going to $(1 + M, 1 + 2\delta - M)$ or to $(1 - \delta, 1 + 3\delta)$; in the first case, the process stops, while in the second it evolves further. The pattern is then repeated. Note that $f$ is nonnegative. We see that after the $2N$ steps we have two possibilities: either $f$ visited the set $\{0, 1 + M\}$ (and stopped there), or $(f, g)$ has come to the point $(1, 1 + 2N\delta) = (1, \lambda' - 1)$.
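The probability of the path that matters for sharpness can be computed directly from the distributions above. The following is a sketch of that computation. We write $\lambda'$ for the auxiliary parameter chosen larger than $\lambda$ in the construction, and $B$ (our notation) for the event that $\varepsilon_n = -\delta$ for all odd $n \le 2N$, $\varepsilon_n = \delta$ for all even $n \le 2N$, and $\varepsilon_{2N+1} = -1$. On $B$ the pair reaches $(1, \lambda' - 1)$ after $2N$ steps and then moves to $(0, \lambda')$, so $f^* = 1$ and $g^* = \lambda' > \lambda f^*$.

```latex
\[
P(g^* > \lambda f^*) \ge P(B)
  = \Bigl(\frac{M}{M+\delta}\Bigr)^{N} (1-\delta)^{N}\,\frac{M}{M+1}
  \;\xrightarrow[M\to\infty]{}\;
  \Bigl(1 - \frac{\lambda'-2}{2N}\Bigr)^{N}
  \;\xrightarrow[N\to\infty]{}\;
  e^{-(\lambda'-2)/2} = e^{1-\lambda'/2} .
\]
% Letting \lambda' decrease to \lambda then gives lower bounds arbitrarily close to e^{1-\lambda/2}.
```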
If the latter happens, the pair goes to $(1 + M, \lambda' - 1 - M)$ or to $(0, \lambda')$. We easily compute that

$h = (h_n)_{n\ge 0}$ stand for the usual, and so on, where we have identified a set with its indicator function. The Haar system is a martingale difference sequence with respect to its natural filtration $(\mathcal{F}_n)_{n\ge 0}$ and hence, for any given integrable function $f = \sum_{k=0}^{\infty} a_k h_k$, the associated sequence $f_n = \sum_{k=0}^{n} a_k h_k$, $n = 0, 1, 2, \ldots$, is a martingale. For a given collection $\varepsilon = (\varepsilon_n)_{n\ge 0}$ of vectors from a Hilbert space $H$, we define the corresponding Haar multiplier $T = T_\varepsilon$ by $T\bigl(\sum_{n=0}^{\infty} a_n h_n\bigr) = \sum_{n=0}^{\infty} \varepsilon_n a_n h_n$, which gives rise to another martingale, given by $g_n = \sum_{k=0}^{n} \varepsilon_k a_k h_k$, $n = 0, 1, 2, \ldots$. Observe that if $\varepsilon$ is bounded in absolute value by 1, i.e., the terms $\varepsilon_0, \varepsilon_1, \varepsilon_2, \ldots$ take values in the unit ball of $H$, then $g$ is differentially subordinate to $f$.

Let $w$ be a weight, i.e., a nonnegative, integrable function on $[0, 1)$. For any $A \subseteq [0, 1)$, we will use the notation $w(A) = \int_A w\,dx$. In 1971, Fefferman and Stein [6] established the following weighted version of the maximal weak-type (1,1) estimate:
$$\lambda\, w(f^* \ge \lambda) \le c \int_0^1 |f|\, w^*\,dx$$
for some absolute constant $c$ (here $w^*$ is the maximal function of the martingale $(w_n)_{n\ge 0} = (E(w|\mathcal{F}_n))_{n\ge 0}$ generated by $w$). This result gives rise to a very interesting question: does the estimate hold true if we replace the maximal function of $f$ on the left by $|T_\varepsilon f|$, where $\varepsilon$ is a sequence bounded in absolute value by 1? This question, known as the dyadic Muckenhoupt-Wheeden conjecture, was open for almost forty years and was finally answered in the negative by Reguera [15]. There is a dual problem concerning the estimate $w(|T_\varepsilon f| \ge w^*) \le c \int_0^1 |f|\,dx$, see [10,11,12]. The author showed in [14] that this inequality does not hold either with any finite constant $c$. Actually, the counterexample is given by $f = w/\lambda$, where $w$ is an appropriately constructed weight and $\lambda > 0$ is a constant.
In other words, the inequality $\lambda\, w(|T_\varepsilon w| \ge \lambda w^*) \le c \int_0^1 w\,dx$ is not valid for general positive and integrable functions. Consider the related inequality
$$\lambda\, v(|T_\varepsilon w| \ge \lambda w^*) \le c \int_0^1 v\,dx, \qquad (4.1)$$
in which $v$ is another weight on $[0, 1)$. It follows from the tail estimate (1.2) that if $v \equiv 1$, then the above bound does hold, with the sharp constant $c = \sup_{\lambda\ge 2} \lambda e^{1-\lambda/2} = 2$. We will strengthen this result to arbitrary $A_p$ weights. Recall that a weight $v$ satisfies Muckenhoupt's dyadic condition $A_p$ ($1 < p < \infty$) if
$$[v]_{A_p} := \sup_I \left(\frac{1}{|I|}\int_I v\,dx\right)\left(\frac{1}{|I|}\int_I v^{-1/(p-1)}\,dx\right)^{p-1} < \infty,$$
where the supremum is taken over the class of all dyadic intervals $I$. There are versions of this condition for $p = 1$ and $p = \infty$: for $p = 1$, the above definition becomes $[v]_{A_1} := \sup_I \left(\frac{1}{|I|}\int_I v\,dx\right) \operatorname{esssup}_I v^{-1} < \infty$, and for $p = \infty$, let $[v]_{A_\infty} := \sup_I \frac{1}{v(I)}\int_I (v\chi_I)^*\,dx < \infty$. See [9] for more on the subject.

Theorem 4.1. Let $1 \le p \le \infty$. If $v$ is an $A_p$ weight, $w$ is an arbitrary weight and $\lambda > 0$, then (4.1) holds with $c = 4\kappa_p [v]_{A_p}$, for some constant $\kappa_p \ge 1$ depending only on $p$.
Proof. If $\lambda < 4$, then the estimate is trivial, since $v(|T_\varepsilon w| \ge \lambda w^*) \le \int_0^1 v\,dx$, so we assume that $\lambda \ge 4$. We will use the following property of $A_p$ weights established by Fefferman and Pipher [5] (modify the proof of Lemma 3.6 according to the remark on page 359 of that paper): there is a constant $\kappa_p \ge 1$ depending only on $p$ such that for any dyadic interval $I \subseteq [0, 1)$ and any subset $A$ of $I$, This completes the proof.
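For completeness, the value $\sup_{\lambda\ge 2} \lambda e^{1-\lambda/2} = 2$ quoted for the unweighted case $v \equiv 1$ can be verified by an elementary computation. In the sketch below, the auxiliary function $\varphi$ is our notation:

```latex
\[
\varphi(\lambda) = \lambda e^{1-\lambda/2}, \qquad
\varphi'(\lambda) = \Bigl(1 - \frac{\lambda}{2}\Bigr) e^{1-\lambda/2} \le 0
\quad\text{for } \lambda \ge 2,
\qquad\text{so}\qquad
\sup_{\lambda \ge 2} \varphi(\lambda) = \varphi(2) = 2 .
\]
% For 0 < \lambda < 2 the trivial bound by 1 gives \lambda \cdot 1 < 2, so the constant 2 serves all \lambda > 0.
```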