Bounds for characteristic functions in terms of quantiles and entropy

Upper bounds on characteristic functions are derived in terms of the entropic distance to the class of normal distributions.

Let $X$ be a random variable with density $p$ and the characteristic function

$$f(t) = \mathbf{E}\, e^{itX} = \int_{-\infty}^{+\infty} e^{itx} p(x)\,dx, \qquad t \in \mathbf{R}.$$

By the Riemann-Lebesgue theorem, $f(t) \to 0$ as $t \to \infty$. So, for all $T > 0$,

$$\delta(T) = \sup_{|t| \ge T} |f(t)| < 1.$$

An important problem is how to quantify this property by giving explicit upper bounds on $\delta(T)$. The problem arises naturally in various local limit theorems for densities of sums of independent summands; see, for example, [St], [P] or [Se] for an interesting discussion. Our motivation, which we do not discuss in this note, has been the problem of optimal rates of convergence in the entropic central limit theorem for non-i.i.d. summands. Let us only mention that, in investigating this rate of convergence, explicit bounds on $\delta(T)$ (also known as Cramér's condition (C)) in terms of the entropy of $X$ are crucial.
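As a concrete illustration, the following sketch approximates the quantity $\delta(T)$, read here as the supremum of $|f(t)|$ over $|t| \ge T$ (an assumption about the display that defines it), for the uniform distribution on $[-1/2, 1/2]$, whose characteristic function is $\sin(t/2)/(t/2)$. The choice of distribution, the function names and the truncation point `t_max` are all assumptions made for illustration; since $|f(t)| \le 2/|t|$ in this example, values beyond `t_max` cannot exceed `2 / t_max`.

```python
import math


def char_fn_uniform(t: float) -> float:
    """Characteristic function of the uniform law on [-1/2, 1/2]."""
    if t == 0.0:
        return 1.0
    return math.sin(t / 2.0) / (t / 2.0)


def delta(T: float, t_max: float = 200.0, steps: int = 20_000) -> float:
    """Grid approximation of the supremum of |f(t)| over T <= t <= t_max.

    f is even, so t >= T covers |t| >= T. Beyond t_max, |f(t)| <= 2 / t_max.
    """
    h = (t_max - T) / steps
    return max(abs(char_fn_uniform(T + i * h)) for i in range(steps + 1))


if __name__ == "__main__":
    for T in (0.5, 1.0, 5.0, 10.0):
        print(f"T = {T:5.1f}   delta(T) ~ {delta(T):.4f}")
```

In this example the approximation stays strictly below 1 for every $T > 0$ and decreases as $T$ grows, which is the qualitative behavior that explicit bounds on $\delta(T)$ are meant to quantify.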
A first possible answer may be given for random variables with finite variance, say $\sigma^2 = \mathrm{Var}(X)$, and with a uniformly bounded density, say $p(x) \le M$. In the resulting bounds (1)-(2), $c_1, c_2 > 0$ are certain absolute constants.
A similar bound with a slightly different dependence on $(M, \sigma)$ in the right-hand side of (1) was obtained in the mid 1960s by Statulevičius [St]. He also considered more complicated quantities reflecting the behavior of the density $p$ on non-overlapping intervals of the real line (cf. Remark 10 at the end of this note).
The bounds (1)-(2) can be extended to classes of unbounded densities, using other quantities of the distribution of the random variable $p(X)$ with respect to the measure $p(x)\,dx$. One of the results in this note is the following assertion.
Since the median of $p(X)$ is majorized by the maximum $M = \operatorname{ess\,sup}_x p(x)$, Theorem 2 immediately implies Theorem 1. However, the constants $c_1$ and $c_2$ in (1)-(2) can be improved in comparison with the constants in (3)-(4).
One may further generalize Theorem 2 by removing the requirement that the second moment of $X$ is finite. In this case, the standard deviation $\sigma$ should be replaced in (3)-(4) with quantiles of $|X - X'|$, where $X'$ is an independent copy of $X$. This will be explained below in the proof of Theorem 2 and extended in Theorem 8. Thus, quantitative estimates for the characteristic functions, such as (3)-(4), can be given in the class of all absolutely continuous distributions on the line.
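Let us describe a few applications, where the median $m$ of $p(X)$ can be controlled explicitly in terms of other quantities. First, assume that the characteristic function $f$ of $X$ is square integrable,

$$\|f\|_2^2 = \int_{-\infty}^{+\infty} |f(t)|^2\,dt < +\infty \qquad (5)$$

(which implies that the random variable $X$ must have an absolutely continuous distribution with some density $p$). By Chebyshev's inequality and Parseval's identity, for any $\lambda > 0$,

$$\mathbf{P}\{p(X) \ge \lambda\} \le \frac{\mathbf{E}\, p(X)}{\lambda} = \frac{1}{\lambda} \int_{-\infty}^{+\infty} p(x)^2\,dx = \frac{1}{2\pi\lambda}\, \|f\|_2^2.$$

The right-hand side is smaller than $\frac{1}{2}$ whenever $\lambda > \frac{1}{\pi} \|f\|_2^2$, so this ratio provides an upper bound on any median of $p(X)$. Hence, Theorem 2 yields a bound on $|f(t)|$ with an absolute constant $c > 0$, together with a companion bound in case $0 < \sigma|t| < \frac{\pi}{4}$.

The quantity $\int_{-\infty}^{+\infty} p(x)^2\,dx$ appears in problems of quantum mechanics and information theory, where it is referred to as the informational energy or the quadratic entropy.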
However, it is infinite for many probability distributions. The condition (5), that is, $\int_{-\infty}^{+\infty} p(x)^2\,dx < +\infty$, may be relaxed in terms of the so-called entropic distance to normality, which is defined as the difference of the entropies

$$D(X) = h(Z) - h(X).$$

Here

$$h(X) = -\int_{-\infty}^{+\infty} p(x) \log p(x)\,dx$$

denotes the (differential) entropy for a random variable $X$ with density $p$ (which will be assumed to have finite second moment), and $Z$ is a normal random variable with the same mean and variance as $X$. The quantity $h(X)$ is well defined in the usual Lebesgue sense and satisfies $h(X) \le h(Z)$, with equality if and only if $X$ is normal. Hence, $0 \le D(X) \le +\infty$.
The functional $D(X)$ is translation and scale invariant with respect to $X$, and thus does not depend on the mean or variance of $X$. It may also be described as the shortest Kullback-Leibler distance (or the informational divergence) from the distribution of $X$ to the class of normal distributions on the line, and thus $D(X)$ serves as a strong measure of "non-Gaussianity".
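The scale invariance can be seen on two standard families. The sketch below takes the entropic distance to normality to be $D(X) = h(Z) - h(X)$ and evaluates it in closed form, assuming the well-known entropies $\log \ell$ for the uniform law on an interval of length $\ell$, $1 - \log \lambda$ for the exponential law with rate $\lambda$, and $\frac{1}{2} \log(2\pi e \sigma^2)$ for a normal law with variance $\sigma^2$. The function names and the choice of families are illustrative assumptions.

```python
import math


def normal_entropy(variance: float) -> float:
    """Differential entropy of a normal law with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def entropic_distance_uniform(length: float) -> float:
    """D(X) for X uniform on an interval of the given length."""
    h_x = math.log(length)         # entropy of the uniform law
    variance = length ** 2 / 12.0  # its variance
    return normal_entropy(variance) - h_x


def entropic_distance_exponential(rate: float) -> float:
    """D(X) for X exponential with the given rate."""
    h_x = 1.0 - math.log(rate)     # entropy of the exponential law
    variance = 1.0 / rate ** 2     # its variance
    return normal_entropy(variance) - h_x


if __name__ == "__main__":
    for scale in (0.1, 1.0, 10.0):
        print(scale,
              entropic_distance_uniform(scale),
              entropic_distance_exponential(scale))
```

For every scale the uniform family gives $\frac{1}{2} \log(\pi e / 6)$ and the exponential family gives $\frac{1}{2} \log(2\pi e) - 1$, so only the shape of the distribution enters.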
Although the value $D(X) = +\infty$ is still possible, the condition $D(X) < +\infty$, or equivalently $h(X) > -\infty$, is much weaker than (5). From Theorem 2 we derive:

Corollary 4. Assume a random variable $X$ with finite variance $\sigma^2 = \mathrm{Var}(X)$ has a finite entropy. Then the characteristic function satisfies the bounds (6)-(7), the first of them for all $\sigma|t| \ge \frac{\pi}{4}$.

Here, the coefficient 4 in the exponents can be improved at the expense of the constant $c$ in (6)-(7), and chosen to be arbitrarily close to 2.
Let us turn to the proofs. Since the argument involves symmetrization of the distribution of $X$, we need to study how the median and other functionals of $p(X)$ change under convolutions.
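Notations. Given $0 < \kappa < 1$, we write $m_\kappa = m_\kappa(\xi)$ to indicate that $m_\kappa$ is a $\kappa$-quantile of a random variable $\xi$ (or, a quantile of order $\kappa$), which may be any number such that

$$\mathbf{P}\{\xi < m_\kappa\} \le \kappa \le \mathbf{P}\{\xi \le m_\kappa\}.$$

If $\kappa = 1/2$, the value $m = m_{1/2}$ represents a median of $\xi$.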
Lemma 5. Let $X$ be a random variable with density $p$, and let $q$ be the density of the random variable $Y = X + X'$, where $X'$ is independent of $X$.
which proves the lemma.

Lemma 6. Let $X$ be a random variable with density $p$, and $q$ be the density of

For example, choosing $\kappa = 1/2$ and $b = 1/4$, we get $\mathbf{P}\{q(Y) \ge 4m\} \le \frac{2}{3}$, where $m$ is a median of $p(X)$.
On the other hand, u(t It remains to insert the two bounds in (8).
Lemma 7. If a random variable $X$ with finite variance $\sigma^2 = \mathrm{Var}(X)$ has a density bounded by a constant $M$, then $M^2 \sigma^2 \ge \frac{1}{12}$.
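The inequality of Lemma 7 can be checked on a few standard families. The sketch below is only an illustration: the chosen families and helper names are assumptions, and the values of $M$ and $\sigma^2$ are the usual closed forms for each family.

```python
import math


def uniform(length: float):
    """Uniform law on an interval of the given length: (M, variance)."""
    return 1.0 / length, length ** 2 / 12.0


def normal(sigma: float):
    """Normal law with standard deviation sigma: (M, variance)."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi)), sigma ** 2


def exponential(rate: float):
    """Exponential law with the given rate: (M, variance)."""
    return rate, 1.0 / rate ** 2


def laplace(scale: float):
    """Laplace law with density exp(-|x|/scale) / (2 * scale): (M, variance)."""
    return 1.0 / (2.0 * scale), 2.0 * scale ** 2


def lemma7_product(max_density: float, variance: float) -> float:
    """The scale-invariant quantity M^2 * sigma^2 from Lemma 7."""
    return max_density ** 2 * variance


if __name__ == "__main__":
    families = {"uniform": uniform(3.0), "normal": normal(2.0),
                "exponential": exponential(0.5), "laplace": laplace(1.5)}
    for name, (M, var) in families.items():
        print(f"{name:12s} M^2 sigma^2 = {lemma7_product(M, var):.5f}")
```

The products are $1/12$ for the uniform law, $1/(2\pi)$ for the normal law, $1$ for the exponential law and $1/2$ for the Laplace law, independently of the scale parameters, so the uniform law attains the lower bound.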
This elementary inequality is known. Without proof it was already mentioned and used in [St]. High dimensional variants were studied in Hensley [H] and Ball [B]. Equality in the lemma is possible, and is achieved for a uniform distribution on bounded intervals. For a short argument, put H

We also note that the inequality of Lemma 7 may be rewritten in an equivalent form in the space of all integrable functions $q \ge 0$ on the line as the relation (9).

Theorem 2 and its generalization.
We now turn to the basic arguments. Let $q$ be the density of $Y = X - X'$, where $X'$ is an independent copy of $X$. Then $Y$ has the characteristic function $|f(t)|^2$ (where $f$ is the characteristic function of $X$), and we have the identity (10). Our task is therefore to bound the integral in (10) from below.
We start with the obvious bound where $\rho(\theta)$ denotes the shortest distance from $\theta$ to the set of all integers. Here, an equality is only possible in case $\theta = k/2$ for an integer $k$. Hence, (10) gives for arbitrary measurable sets $W \subset \mathbf{R}$. We apply (12) to the sets of the form with $N = 0, 1, 2, \dots$ to be chosen later on.

Given $t \ne 0$, split the integral (12) into the sets Changing the variable $x = y + \frac{k}{|t|}$ on each $W_k$, we may also write Now, by the inequality (9), applied to the functions $q_k(y)$ on the interval $\left(-\frac{1}{2|t|}, \frac{1}{2|t|}\right)$, and using a uniform bound $q_k(y) \le \frac{1}{b}\, m_{\kappa_1}$, we have where $q_k = \int_{W_k} q(x)\,dx$.
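The function $\rho(\theta)$ and the equality case $\theta = k/2$ can be explored numerically. The sketch below assumes that the bound in question has the form $|\sin(\pi\theta)| \ge 2\rho(\theta)$, which is one elementary inequality with exactly this equality case; this form is an assumption made for illustration, as are the function names.

```python
import math


def rho(theta: float) -> float:
    """Shortest distance from theta to the set of all integers."""
    return abs(theta - round(theta))


def gap(theta: float) -> float:
    """Slack in the assumed bound |sin(pi * theta)| >= 2 * rho(theta)."""
    return abs(math.sin(math.pi * theta)) - 2.0 * rho(theta)


if __name__ == "__main__":
    for theta in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0, 1.5, 2.3):
        print(f"theta = {theta:4.2f}   rho = {rho(theta):.3f}   gap = {gap(theta):.6f}")
```

On a fine grid the slack is nonnegative and vanishes (up to rounding) only at integers and half-integers, in line with the equality case described above.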
Lemma 9. Let $X$ be a random variable with finite variance $\sigma^2 = \mathrm{Var}(X)$ and finite entropy. Then, any quantile $m_\kappa$ ($0 < \kappa < 1$) of the random variable $p(X)$ satisfies (25).

Proof. Rewrite the entropic distance to normality as the Kullback-Leibler distance

$$D(X) = \int_{-\infty}^{+\infty} p(x) \log \frac{p(x)}{\varphi_{a,\sigma}(x)}\,dx,$$

where $\varphi_{a,\sigma}$ is the density of the normal law $N(a, \sigma^2)$ with $a = \mathbf{E}X$ and $\sigma^2 = \mathrm{Var}(X)$.
In both inequalities, the coefficient in front of $D(X)$ can be made as close to 2 as we wish, with the constants $c_1$ and $c_2$ depending on $(\kappa_1, \kappa_2)$, as in Theorem 8. In particular, for $\kappa_2 = 7/8$, we have $m = 4\sigma$, so, whenever $\kappa_1 > 1/8$,

Let $X$ be a random variable with density $p(x)$ and characteristic function $f(t)$. Let p denote the density of the random variable $X - X'$, where $X'$ is an independent copy of $X$. Then for any sequence $\{\Delta_i\}$ of non-overlapping intervals on the line with lengths $|\Delta_i|$, for all constants $0 \le M_i \le \infty$, and for all $t \in \mathbf{R}$, one has This may be viewed as a variant of Theorem 1.
Let us make the obtained bounds more quantitative by choosing an appropriate value of $b$. The function $\psi$ defined in (19) may easily be maximized in the admissible interval $0 < b < b_0 = \frac{\kappa_1 + \kappa_2 - 1}{\kappa_2}$