Concentration Inequalities via Malliavin Calculus with Applications

We use the Malliavin calculus to prove a new abstract concentration inequality result for zero mean, Malliavin differentiable random variables which admit densities. We demonstrate the applicability of the result by deriving two new concrete concentration inequalities, one relating to an integral functional of a fractional Brownian motion process, and the other relating to the centered maximum of a finite sum of Normal random variables. These concentration inequalities are, to the best of our knowledge, largely unattainable via existing methods other than those which are the subject of this paper.


Introduction
Concentration inequalities characterize the rate of decay of the tail distribution of a random variable. More specifically, if $Z$ is a random variable, concentration inequalities are typically some variant of an upper or lower bound on the quantity $P(Z \geq z)$. Some classical concentration inequalities (see [2]) are Markov's inequality,
$$P(Z \geq z) \leq \frac{E[|Z|]}{z},$$
where $Z$ is any random variable and $z > 0$, and Bernstein's inequality,
$$P\Big(\sum_{i=1}^{n} Z_i \geq z\Big) \leq \exp\Big(-\frac{z^2/2}{n + z/3}\Big),$$
where $Z_1, \ldots, Z_n$ are independent Bernoulli random variables uniformly distributed on the set $\{-1, +1\}$.
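As a quick numerical illustration (this sketch is ours, not part of the original text; the helper names are our own), the Bernstein bound for Rademacher sums can be checked against simulation, since each summand has variance 1 and is bounded by 1:

```python
import math
import random

def bernstein_bound(n, z):
    # Bernstein's inequality for a sum of n independent Rademacher
    # (uniform +/-1) variables: each has variance 1 and is bounded by 1,
    # giving P(S_n >= z) <= exp(-(z^2/2) / (n + z/3)).
    return math.exp(-(z * z / 2.0) / (n + z / 3.0))

def empirical_tail(n, z, trials=20000, seed=0):
    # Monte Carlo estimate of P(S_n >= z)
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(rng.choice((-1, 1)) for _ in range(n))
        if s >= z:
            hits += 1
    return hits / trials

for z in (10, 20, 30):
    print(z, empirical_tail(100, z), bernstein_bound(100, z))
```

In each case the empirical tail probability falls below the Bernstein bound, as it must.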
In this work, we derive new abstract concentration inequalities for Malliavin differentiable random variables. Our approach is inspired by [5], and applies the Malliavin integration by parts formula to obtain upper and lower bounds on $P(Z \geq z)$ which extend those currently available in the literature.
This paper is organized as follows: in Section 2 we derive and present concentration inequalities which are the main result of the paper; and in Section 3 we apply the main result to compute new bounds on the tail distributions of concrete random variables.

Main Result
We will briefly introduce some of the relevant elements of the Malliavin calculus; for further details, we refer the reader to [6]. Let $H$ be a separable Hilbert space, equipped with an inner product denoted by $\langle \cdot, \cdot \rangle_H$. Then an isonormal Gaussian process is defined as a Gaussian family $X = \{X(h), h \in H\}$, where each $X(h)$ is a centered Gaussian random variable, and such that for all $h, g \in H$ we have $E[X(h)X(g)] = \langle h, g \rangle_H$. We assume that these random variables are defined on a common probability space $(\Omega, \mathcal{F}, P)$, and that $\mathcal{F}$ is the sigma-field generated by $X$. If we denote by $\{H_n\}_{n=0}^{\infty}$ the family of Hermite polynomials, and by $\mathcal{H}_n$ the closed linear span of $\{H_n(X(h)) : h \in H, \|h\|_H = 1\}$, it follows that $L^2(\Omega, \mathcal{F}, P) = \bigoplus_{n=0}^{\infty} \mathcal{H}_n$; this is known as the Wiener chaos decomposition. Finally, if we denote by $J_n(F)$ the projection of $F$ onto $\mathcal{H}_n$, for any $F \in L^2(\Omega, \mathcal{F}, P)$, then we define the Ornstein-Uhlenbeck semigroup as the family of parametrized contraction operators $\{T_t, t \geq 0\}$ on $L^2(\Omega)$, whose action is given by:
$$T_t F = \sum_{n=0}^{\infty} e^{-nt} J_n(F).$$
There are several well known integration by parts formulae associated with the Malliavin calculus. We generalize an existing such formula to obtain (2.1). Let $Z \in \mathbb{D}^{1,2}$ with $E[Z] = 0$, let $h : \mathbb{R} \to \mathbb{R}$ be globally Lipschitz and satisfy $E[h(Z)] = 0$, $h(z) > 0$ for $z > 0$, $h(z) < 0$ for $z < 0$, and let $f : \mathbb{R} \to \mathbb{R}$ be of class $C^1$ with bounded derivative. Since $E[h(Z)] = 0$, we may write $h(Z) = LL^{-1}h(Z) = -\delta(DL^{-1}h(Z))$, where $\delta$ denotes the divergence operator, and the duality between $D$ and $\delta$ gives
$$E[h(Z)f(Z)] = E[-\delta(DL^{-1}h(Z))\,f(Z)] = E[\langle Df(Z), -DL^{-1}h(Z)\rangle_H] = E[f'(Z)\langle DZ, -DL^{-1}h(Z)\rangle_H]. \quad (2.1)$$
Relating the first and last terms in this chain of equalities, we conclude that
$$E[h(Z)f(Z)] = E\big[f'(Z)\,E[\langle DZ, -DL^{-1}h(Z)\rangle_H \mid Z]\big].$$
Since the random variable $E[\langle DZ, -DL^{-1}h(Z)\rangle_H \mid Z]$ is measurable with respect to $\sigma(Z)$, there exists a measurable function $g : \mathbb{R} \to \mathbb{R}$ representing it. Notationally, we write
$$g(Z) = E[\langle DZ, -DL^{-1}h(Z)\rangle_H \mid Z].$$
Assume now that the random variable $Z$ induces a measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ which is absolutely continuous with respect to the Lebesgue measure. We denote the density of $Z$ by $\rho$.
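As a simple sanity check of (2.1) and of the definition of $g$ (this example is ours, not part of the original argument), take $Z = X(e)$ with $\|e\|_H = 1$ and $h$ equal to the identity. Since $Z$ lies in the first Wiener chaos, $L^{-1}Z = -Z$ and $DZ = e$, so that
$$g(Z) = E[\langle DZ, -DL^{-1}Z\rangle_H \mid Z] = \langle e, e\rangle_H = 1,$$
and (2.1) reduces to the classical Gaussian integration by parts formula $E[Zf(Z)] = E[f'(Z)]$.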
The calculations which follow (until the statement and proof of Theorem 2.1) mirror those in [5]. Let now $f : \mathbb{R} \to \mathbb{R}$ be a continuous function with compact support, and let $F$ denote any antiderivative of $f$. Note that $F$ is bounded. Applying (2.1) with $F$ in place of $f$, and then integrating by parts against the density $\rho$, we see that:
$$E[f(Z)g(Z)] = E[h(Z)F(Z)] = \int_{\mathbb{R}} F(z)h(z)\rho(z)\,dz = \int_{\mathbb{R}} f(z)\Big(\int_z^{\infty} h(y)\rho(y)\,dy\Big)\,dz.$$
Since the previous calculation holds for arbitrary $f$ continuous with compact support, we have shown
$$g(z)\rho(z) = \int_z^{\infty} h(y)\rho(y)\,dy \quad \text{for almost every } z. \quad (2.2)$$
Since $Z \in \mathbb{D}^{1,2}$, it is known (e.g. [6]) that the support of $\rho$, denoted by $\mathrm{supp}(\rho)$, is a closed interval of the form $[\alpha, \beta]$, where $-\infty \leq \alpha < \beta \leq +\infty$. Let now $\varphi(z) = \int_z^{\infty} h(y)\rho(y)\,dy$. From (2.2), we have $g(z)\rho(z) = \varphi(z)$ almost surely on $(\alpha, \beta)$.
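For the standard Gaussian case with $h$ the identity, relation (2.2) asserts that $\rho(z) = \int_z^{\infty} y\rho(y)\,dy$, since $g \equiv 1$ there. A quick numerical verification (the function names below are our own):

```python
import math

def rho(z):
    # standard normal density
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def phi(z, upper=12.0, n=100000):
    # trapezoid-rule approximation of int_z^upper y * rho(y) dy;
    # the tail beyond y = 12 is negligible
    step = (upper - z) / n
    total = 0.5 * (z * rho(z) + upper * rho(upper))
    for i in range(1, n):
        y = z + i * step
        total += y * rho(y)
    return total * step

for z in (0.0, 0.5, 1.0, 2.0):
    print(z, phi(z), rho(z))
```

The two printed columns agree to high precision, as predicted by (2.2).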
Theorem 4.1 in [5] is a special case of part of the following theorem, and its proof partially inspired ours.

Theorem 2.1. Let $Z \in \mathbb{D}^{1,2}$ with $E[Z] = 0$, and assume $Z$ admits a density $\rho$. Let $g(Z) = E[\langle DZ, -DL^{-1}(h(Z))\rangle_H \mid Z]$, where $h : \mathbb{R} \to \mathbb{R}$ is globally Lipschitz, non-decreasing on $[0, \infty)$, $h(0) \geq 0$, and satisfies $E[h(Z)] = 0$. Then the following two results hold. Part 1: for every $z \in (0, \infty) \cap \mathrm{supp}(\rho)$,
$$\frac{E[|h(Z)|]}{2h(\infty)}\exp\Big(-\int_0^z \frac{h(y)}{g(y)}\,dy\Big) \;\leq\; P(Z \geq z) \;\leq\; \frac{E[|h(Z)|]}{2h(z)}\exp\Big(-\int_0^z \frac{h(y)}{g(y)}\,dy\Big),$$
where the lower bound term is understood to be zero if $h(\infty) = \infty$. Part 2: an exponential tail bound holds under the additional assumption that $g(Z)h'(Z) \leq \alpha h(Z) + \beta$ almost surely, as detailed in the proof below.
Proof. We will apply the discussion preceding the theorem. Let $\varphi(z) = \int_z^{\infty} h(y)\rho(y)\,dy$. From (2.4) we have for every $z \in \mathrm{supp}(\rho)$:
$$\varphi(z) = \frac{E[|h(Z)|]}{2}\exp\Big(-\int_0^z \frac{h(y)}{g(y)}\,dy\Big). \quad (2.5)$$
We can integrate the expression $\int_z^{\infty} h(y)\rho(y)\,dy$ by parts to obtain
$$\varphi(z) = h(z)P(Z \geq z) + \int_z^{\infty} h'(y)P(Z \geq y)\,dy \;\geq\; h(z)P(Z \geq z). \quad (2.6)$$
Hence we have $P(Z \geq z) \leq \frac{\varphi(z)}{h(z)}$, and plugging in the expression for $\varphi(z)$ from (2.5), we obtain the upper bound on $P(Z \geq z)$ in Theorem 2.1 Part 1. Note now that, from (2.6), and since $P(Z \geq y) \leq P(Z \geq z)$ for $y \geq z$, we have:
$$\varphi(z) \leq P(Z \geq z)\Big(h(z) + \int_z^{\infty} h'(y)\,dy\Big),$$
from which we obtain $\varphi(z) \leq P(Z \geq z)\,h(\infty)$ by the fundamental theorem of calculus; rearranging, we obtain the lower bound in Theorem 2.1 Part 1.
We now set about proving Theorem 2.1 Part 2. The key estimate follows from dropping a negative term and invoking relation (2.3); applying the assumption that $g(Z)h'(Z) \leq \alpha h(Z) + \beta$ almost surely then yields the bound asserted in Part 2, completing the proof. □
Remark 2.2. Note that Theorem 2.1 Part 1 holds only for the case where $z \in (0, \infty) \cap \mathrm{supp}(\rho)$. However, it can still be applied to estimate the left-hand tail distribution of $Z$.
Consider for example the case in which there exists a Borel measurable function $t : \mathbb{R} \to \mathbb{R}$ satisfying $\langle DZ, -DL^{-1}Z\rangle_H \leq t(Z)$.
If we now define $Y := -Z$, then the linearity of the inner product and of the Malliavin operators $L$ and $D$ implies that:
$$\langle DY, -DL^{-1}Y\rangle_H = \langle DZ, -DL^{-1}Z\rangle_H \leq t(Z) = t(-Y).$$
Therefore we have for $z \in (0, \infty) \cap -\mathrm{supp}(\rho)$:
$$P(Z \leq -z) = P(Y \geq z) \leq \frac{E[|Z|]}{2z}\exp\Big(-\int_0^z \frac{x}{t(-x)}\,dx\Big).$$
The next proposition, which is the generalized analog of Proposition 3.7 in [5] and whose proof is similar, gives an alternative method for computing the function $g(Z)$.

Proposition. Suppose that $DZ = \Phi_Z(X)$ and $D(h(Z)) = \Phi_{h(Z)}(X)$ for measurable functions $\Phi_Z$ and $\Phi_{h(Z)}$. Then
$$g(Z) = \int_0^{\infty} e^{-u}\,E\Big[\big\langle \Phi_Z(X),\, E'\big[\Phi_{h(Z)}\big(e^{-u}X + \sqrt{1 - e^{-2u}}\,X'\big)\big]\big\rangle_H \,\Big|\, Z\Big]\,du,$$
where $X'$ stands for an independent copy of $X$, and is such that $X$ and $X'$ are defined on the product probability space $(\Omega \times \Omega', \mathcal{F} \otimes \mathcal{F}', P \times P')$. Here, $E'$ denotes the mathematical expectation with respect to $P'$.
Proof. Without loss of generality, we can assume that $H = L^2(T, \mathcal{B}, \mu)$, where $(T, \mathcal{B})$ is a measurable space and $\mu$ is a $\sigma$-finite measure without atoms. Let us consider the chaos expansion of $h(Z)$. On the other hand, we have the standard identity $-DL^{-1}h(Z) = \int_0^{\infty} e^{-u}\,T_u\big(D(h(Z))\big)\,du$. By Mehler's formula,
$$T_u\big(F(X)\big) = E'\big[F\big(e^{-u}X + \sqrt{1 - e^{-2u}}\,X'\big)\big],$$
and since $D(h(Z)) = \Phi_{h(Z)}(X)$ by assumption, we deduce that
$$-DL^{-1}h(Z) = \int_0^{\infty} e^{-u}\,E'\big[\Phi_{h(Z)}\big(e^{-u}X + \sqrt{1 - e^{-2u}}\,X'\big)\big]\,du,$$
and the desired conclusion follows.

Applications
In Section 3.1, we apply Theorem 2.1 Part 1 (with $h$ equal to the identity) to derive Theorem 3.1, a concentration inequality for an integral functional of a fractional Brownian motion process. The upper bound we derive is, as far as we know, the first such bound in the literature, and tractable only in the context of Theorem 2.1. In Section 3.2, we apply Theorem 2.1 Part 2 to derive Theorem 3.2, a new concentration inequality for the centered maximum of a family of Normally distributed random variables. The inequality is not expressed entirely in closed form, and we do not attempt to compare its relative sharpness with the sharpest known bounds. Rather, Theorem 3.2 is a 'proof-of-concept' result, showing that by an informed choice of the function $h$, Theorem 2.1 can be applied to compute novel concentration inequalities.
We emphasize that the effectiveness and utility of Theorem 2.1 can be greatly enhanced by an apt choice of the function h in the theorem statement, informed by the particular nature of the random variable Z under consideration and the specific form of the desired concentration inequality.

Integral functional of fractional Brownian motion
The theory of fBm was initially introduced by Kolmogorov and considered further by Mandelbrot and Van Ness [3]; fBm has become a ubiquitous modeling tool in the sciences (for example [1]), engineering (for example [4]), and finance (for example [8]), among other fields. For details regarding the definition and construction of fractional Brownian motion, we refer the reader to [7]. Let now $(B_t, t \in [0, 1])$ denote a fractional Brownian motion process with Hurst index $H \in (0, 1)$. Note that such a process can be realized as an isonormal Gaussian process. In particular, we can consider the Hilbert space $\mathcal{H}$ defined as the closure of the space of step functions on the set $\mathbb{R}_{\geq 0}$ with respect to the inner product given by:
$$\langle 1_{[0,s]}, 1_{[0,t]}\rangle_{\mathcal{H}} = \frac{1}{2}\big(s^{2H} + t^{2H} - |t - s|^{2H}\big).$$
In this case we have that $B_t := B(1_{[0,t]})$. We will consider obtaining concentration inequalities on the random variable
$$Z = Z_T := \int_0^T B_t^4\,dt - E\Big[\int_0^T B_t^4\,dt\Big].$$
We can without loss of generality assume that $T = 1$ by the scaling property of fractional Brownian motion. Hence, since $E[B_t^4] = 3t^{4H}$, we will consider the random variable
$$Z = \int_0^1 B_t^4\,dt - \frac{3}{4H+1}.$$
We have decomposed $Z$ into a sum of two components, belonging to the fourth and second Wiener chaos spaces respectively. Having expressed the integral defining $Z$ in an equivalent limit form, in order to unravel the Wiener chaos expansion, we now repackage the resulting sum of limits back into integral form; for example, the $\mathcal{H}_4$ component of $Z$ can be expressed as follows:
$$\int_0^1 t^{4H} H_4\big(B_t / t^H\big)\,dt = \int_0^1 \big(B_t^4 - 6t^{2H}B_t^2 + 3t^{4H}\big)\,dt.$$
Repeating this repackaging process with the $\mathcal{H}_2$ part of $Z$, we can write $Z$ as:
$$Z = 6\int_0^1 t^{2H}\big(B_t^2 - t^{2H}\big)\,dt + \int_0^1 t^{4H} H_4\big(B_t / t^H\big)\,dt,$$
where the first and second components of the sum belong to $\mathcal{H}_2$ and $\mathcal{H}_4$ respectively. Since $L^{-1}$ acts on the $n$-th chaos as multiplication by $-1/n$, we now compute $L^{-1}Z$:
$$-L^{-1}Z = 3\int_0^1 t^{2H}\big(B_t^2 - t^{2H}\big)\,dt + \frac{1}{4}\int_0^1 t^{4H} H_4\big(B_t / t^H\big)\,dt.$$
We compute also the Malliavin derivative of $Z$:
$$DZ = 4\int_0^1 B_t^3\,1_{[0,t]}\,dt.$$
We compute $DL^{-1}Z$ in an analogous manner, obtaining $-DL^{-1}Z = \int_0^1 \big(B_t^3 + 3t^{2H}B_t\big)\,1_{[0,t]}\,dt$, and finally we see that:
$$\langle DZ, -DL^{-1}Z\rangle_{\mathcal{H}} = 4\int_0^1\int_0^1 B_t^3\big(B_s^3 + 3s^{2H}B_s\big)\,\langle 1_{[0,t]}, 1_{[0,s]}\rangle_{\mathcal{H}}\,dt\,ds.$$
We now upper bound this expression in terms of $Z$; after applying the Cauchy-Schwarz and Jensen inequalities, one obtains
$$\langle DZ, -DL^{-1}Z\rangle_{\mathcal{H}} \leq 6\int_0^1 B_t^4\,dt = 6\Big(Z + \frac{3}{4H+1}\Big).$$
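The Hermite-polynomial identity underlying the chaos decomposition of $B_t^4$ can be verified directly (a routine check we include for the reader's convenience): with $H_2(x) = x^2 - 1$ and $H_4(x) = x^4 - 6x^2 + 3$,
$$H_4(x) + 6H_2(x) + 3 = (x^4 - 6x^2 + 3) + (6x^2 - 6) + 3 = x^4,$$
so that, writing $B_t = t^H \xi$ with $\xi$ standard Gaussian,
$$B_t^4 = t^{4H}H_4\big(B_t/t^H\big) + 6t^{4H}H_2\big(B_t/t^H\big) + 3t^{4H},$$
whose expectation recovers $E[B_t^4] = 3t^{4H}$.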

Once we show that $Z$ admits a density, we will be in a position to apply Theorem 2.1, in particular Theorem 2.1 Part 1 with $h$ equal to the identity function. We could apply the result from [9], which states that an element of $\mathbb{D}^{1,2}$ admitting a finite Wiener chaos expansion has a density. Since we showed that $Z \in \mathcal{H}_2 \oplus \mathcal{H}_4$, it has a density. We could alternatively take a more hands-on approach and apply the Bouleau-Hirsch criterion from [6], which says that for any $Z \in \mathbb{D}^{1,2}$, the almost sure positivity of $\|DZ\|_{\mathcal{H}}$ is a sufficient condition for $Z$ to admit a density. Adapting a calculation from [5], we compute:
$$\|DZ\|_{\mathcal{H}}^2 = 16\int_0^1\int_0^1 B_s^3 B_t^3\,\langle 1_{[0,s]}, 1_{[0,t]}\rangle_{\mathcal{H}}\,ds\,dt = 16\,\widetilde{E}\Big[\Big(\int_0^1 B_t^3\,\widetilde{B}_t\,dt\Big)^2\Big],$$
where $\widetilde{B}$ is a fractional Brownian motion with Hurst parameter $H$ which is independent of $B$, and $\widetilde{E}$ denotes expectation with respect to $\widetilde{B}$. The above calculation shows that $\|DZ\|_{\mathcal{H}} = 0$ if and only if $\int_0^1 B_t^3\,\widetilde{B}_t\,dt = 0$ for almost every path of $\widetilde{B}$. This is equivalent to the condition that the path $(B_t^3, t \in [0,1])$ be identically zero, which of course has probability zero. Hence $Z$ does admit a density, and we can apply Theorem 2.1 Part 1 with $h$ equal to the identity to say that for $z > 0$:
$$P(Z \geq z) \leq \frac{E[|Z|]}{2z}\exp\Big(-\int_0^z \frac{y}{6\big(y + \frac{3}{4H+1}\big)}\,dy\Big).$$
Upper bounding $E[|Z|]$ by $\frac{6}{4H+1}$, and evaluating the integral inside the exponential, we can simplify this expression to conclude that for $z > 0$:
$$P(Z \geq z) \leq \frac{c}{z}\Big(1 + \frac{z}{c}\Big)^{c/6} e^{-z/6}, \quad \text{where } c = \frac{3}{4H+1}.$$
In this case the lower bound from Theorem 2.1 Part 1 is trivial, since $h$ is assumed to be the identity function and hence $h(\infty) = \infty$. However, we can also use direct Malliavin methods to get a lower bound on $P(Z \geq z)$. First note that, by Jensen's inequality:
$$Z + \frac{3}{4H+1} = \int_0^1 B_t^4\,dt \geq \Big(\int_0^1 B_t\,dt\Big)^4.$$
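The constant $\frac{3}{4H+1} = \int_0^1 3t^{4H}\,dt$ appearing in the bounds above can be checked by simulation. The sketch below (our own helper names; exact Cholesky sampling of fBm on a grid) estimates $E\big[\int_0^1 B_t^4\,dt\big]$ for $H = 0.7$:

```python
import numpy as np

def fbm_covariance(t, hurst):
    # fBm covariance R(s, u) = 0.5 * (s^{2H} + u^{2H} - |s - u|^{2H})
    s, u = t[:, None], t[None, :]
    return 0.5 * (s**(2*hurst) + u**(2*hurst) - np.abs(s - u)**(2*hurst))

def simulate_fbm(t, hurst, n_paths, rng):
    # exact simulation on the grid t via a Cholesky factorization
    cov = fbm_covariance(t, hurst) + 1e-10 * np.eye(len(t))
    L = np.linalg.cholesky(cov)
    return rng.standard_normal((n_paths, len(t))) @ L.T

hurst = 0.7
n = 200
t = np.arange(1, n + 1) / n          # avoid t = 0 (degenerate covariance row)
rng = np.random.default_rng(1)
B = simulate_fbm(t, hurst, 40000, rng)
estimate = np.mean(B**4)             # Riemann estimate of E[int_0^1 B_t^4 dt]
exact = 3.0 / (4.0 * hurst + 1.0)    # int_0^1 3 t^{4H} dt
print(estimate, exact)
```

The Monte Carlo estimate agrees with $\frac{3}{4H+1}$ up to sampling and discretization error.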

Note now that $N := \int_0^1 B_s\,ds$ is a Normally distributed random variable; this follows from expressing $N$ as the almost sure limit of its Riemann sum approximations, or from Corollary 3.4 in [5], which says that $N$ is Normally distributed if and only if $f(N) = E[\langle DN, -DL^{-1}N\rangle_{\mathcal{H}} \mid N]$ is a constant. Since we can immediately compute that $\langle DN, -DL^{-1}N\rangle_{\mathcal{H}}$ is constant, we conclude that $N$ is Normally distributed. By Fubini we see that $N$ has zero mean, and we compute its variance as follows:
$$E[N^2] = \int_0^1\int_0^1 E[B_s B_t]\,ds\,dt = \int_0^1\int_0^1 \frac{1}{2}\big(s^{2H} + t^{2H} - |t - s|^{2H}\big)\,ds\,dt = \frac{1}{2H+2}.$$
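The double integral defining the variance of $N = \int_0^1 B_s\,ds$ can be evaluated in elementary steps (a routine computation we include for completeness):
$$\int_0^1\int_0^1 s^{2H}\,ds\,dt = \frac{1}{2H+1}, \qquad \int_0^1\int_0^1 |t-s|^{2H}\,ds\,dt = 2\int_0^1\int_0^t (t-s)^{2H}\,ds\,dt = \frac{2}{(2H+1)(2H+2)},$$
so that
$$E\Big[\Big(\int_0^1 B_s\,ds\Big)^2\Big] = \frac{1}{2}\Big(\frac{2}{2H+1} - \frac{2}{(2H+1)(2H+2)}\Big) = \frac{1}{2H+2}.$$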
We can now use the classical inequality $\int_z^{\infty} e^{-y^2/2}\,dy \geq \frac{z}{1+z^2}\,e^{-z^2/2}$ for $z > 0$, to deduce that for $z > 0$:
$$P(Z \geq z) \geq P\big(N \geq (z + c)^{1/4}\big) \geq \frac{1}{\sqrt{2\pi}}\,\frac{\sigma\,(z+c)^{1/4}}{\sigma^2 + (z+c)^{1/2}}\,\exp\Big(-\frac{(z+c)^{1/2}}{2\sigma^2}\Big),$$
where $\sigma^2 = \frac{1}{2H+2}$ and $c = \frac{3}{4H+1}$. Hence we have the following theorem, Theorem 3.1, which characterizes both upper and lower bounds on the rate of decay of the right-hand tail distribution of $Z$. We could apply Remark 2.2 to obtain a bound for the left-hand tail distribution of $Z$, but this is less interesting since $P\big(Z \leq -\frac{3}{4H+1}\big) = 0$.
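The classical Gaussian tail lower bound invoked above is easy to verify numerically, since $\int_z^{\infty} e^{-y^2/2}\,dy = \sqrt{\pi/2}\,\mathrm{erfc}(z/\sqrt{2})$ (the function names below are our own):

```python
import math

def gaussian_tail_integral(z):
    # int_z^inf e^{-y^2 / 2} dy = sqrt(pi/2) * erfc(z / sqrt(2))
    return math.sqrt(math.pi / 2.0) * math.erfc(z / math.sqrt(2.0))

def mills_lower_bound(z):
    # the classical lower bound z / (1 + z^2) * e^{-z^2 / 2}, valid for z > 0
    return z / (1.0 + z * z) * math.exp(-z * z / 2.0)

for z in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(z, gaussian_tail_integral(z), mills_lower_bound(z))
```

In every case the exact tail integral dominates the lower bound, and the two become close as $z$ grows.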

Maximum of Normal random variables
Let $N = (N_1, \ldots, N_n)$ be an $n$-dimensional jointly Normal random vector, with positive definite covariance matrix $K$. We assume that each $N_i$ has the form $X(h_i)$, for a certain centered isonormal process $X$ (over some Hilbert space $H$) and certain functions $h_i \in H$. Let $Z = \max_{i=1,\ldots,n} N_i - E[\max_{i=1,\ldots,n} N_i]$, and set
$$I_u = \operatorname*{argmax}_{i=1,\ldots,n}\big(e^{-u}N_i + \sqrt{1 - e^{-2u}}\,N_i'\big), \quad u \geq 0,$$
where $X'$ is an independent copy of $X$ and $N_i' = X'(h_i)$. Then for any $u \geq 0$, $I_u$ is a well-defined random element of $\{1, \ldots, n\}$; moreover, $Z \in \mathbb{D}^{1,2}$ and we have $DZ = \Phi_Z(N) = h_{I_0}$ (we refer the reader to [5] for the proof). We mention also the well known Borell-Sudakov inequality ([10]), which bounds the rate of decay of the tail distribution of $Z$; in particular, for $z > 0$:
$$P(Z \geq z) \leq \exp\Big(-\frac{z^2}{2\sigma^2}\Big),$$
where $\sigma^2 = \max_{i=1,\ldots,n} K_{i,i}$.
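The Borell-Sudakov bound can be illustrated by simulation; in the sketch below the covariance matrix is our own toy choice, and $E[\max_i N_i]$ is replaced by its empirical mean:

```python
import numpy as np

# Monte Carlo illustration of the Borell-Sudakov bound
# P(max_i N_i - E[max_i N_i] >= z) <= exp(-z^2 / (2 sigma^2)),
# where sigma^2 = max_i K_{i,i}.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)                  # positive definite covariance
sigma2 = float(np.max(np.diag(K)))

samples = rng.multivariate_normal(np.zeros(n), K, size=200000)
M = samples.max(axis=1)
Z = M - M.mean()                             # centered maximum (empirical centering)

for z in (1.0, 2.0, 4.0):
    threshold = z * np.sqrt(sigma2)
    empirical = float(np.mean(Z >= threshold))
    bound = float(np.exp(-threshold**2 / (2 * sigma2)))   # = exp(-z^2 / 2)
    print(z, empirical, bound)
```

At each threshold the empirical tail frequency stays below the Gaussian bound, as the inequality predicts.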
Consider the function $h$, defined in terms of a constant $\gamma > 0$ to be chosen later and an arbitrary constant $C > 0$. Note that $h$ is Lipschitz and that $h'$ is given almost everywhere accordingly. In order to apply Theorem 2.1, we must have that $E[h(Z)] = 0$. Note that $Z$ admits a density under the assumption that $K$ is positive definite (see [5]); call it $\rho$. Choosing $C$ appropriately, we get $E[h(Z)] = 0$. By the remarks at the beginning of this section, we have again that $DZ = \Phi_Z(X) = h_{I_0}$. Also, $\Phi_{h(Z)}(X) = D(h(Z))$; we can apply the chain rule for the Malliavin derivative since $h$ is a Lipschitz function. Then we get $D(h(Z)) = h'(Z)DZ$.