Rates of convergence for minimal distances in the central limit theorem under projective criteria

In this paper, we give estimates of ideal or minimal distances between the distribution of the normalized partial sum and the limiting Gaussian distribution for stationary martingale difference sequences or stationary sequences satisfying projective criteria. Applications to functions of linear processes and to functions of expanding maps of the interval are given.


Introduction and Notations
Let $X_1, X_2, \ldots$ be a strictly stationary sequence of real-valued random variables (r.v.) with mean zero and finite variance. Set $S_n = X_1 + X_2 + \cdots + X_n$. We denote by $P_{n^{-1/2}S_n}$ the law of $n^{-1/2}S_n$ and by $G_{\sigma^2}$ the normal distribution $N(0, \sigma^2)$. In this paper, we shall give quantitative estimates of the approximation of $P_{n^{-1/2}S_n}$ by $G_{\sigma^2}$ in terms of minimal or ideal metrics.
Let $L(\mu, \nu)$ be the set of probability laws on $\mathbb R^2$ with marginals $\mu$ and $\nu$. Let us consider the following minimal distances (sometimes called Wasserstein distances of order $r$):
$$W_r(\mu,\nu) = \begin{cases} \inf\Big\{ \displaystyle\int |x-y|^r\, P(dx,dy) : P \in L(\mu,\nu) \Big\} & \text{if } 0 < r < 1,\\[6pt] \inf\Big\{ \Big(\displaystyle\int |x-y|^r\, P(dx,dy)\Big)^{1/r} : P \in L(\mu,\nu) \Big\} & \text{if } r \geq 1. \end{cases}$$
It is well known that for two probability measures $\mu$ and $\nu$ on $\mathbb R$ with respective distribution functions (d.f.) $F$ and $G$,
$$W_r(\mu,\nu) = \Big( \int_0^1 |F^{-1}(u) - G^{-1}(u)|^r \, du \Big)^{1/r} \quad \text{for any } r \geq 1. \tag{1.1}$$
We also consider the following ideal distances of order $r$ (Zolotarev distances of order $r$). For two probability measures $\mu$ and $\nu$, and $r$ a positive real, let
$$\zeta_r(\mu,\nu) = \sup\Big\{ \int f \, d\mu - \int f \, d\nu : f \in \Lambda_r \Big\},$$
where $\Lambda_r$ is defined as follows: denoting by $l$ the natural integer such that $l < r \leq l+1$, $\Lambda_r$ is the class of real functions $f$ which are $l$-times continuously differentiable and such that
$$|f^{(l)}(x) - f^{(l)}(y)| \leq |x-y|^{r-l} \quad \text{for any } (x,y) \in \mathbb R \times \mathbb R. \tag{1.2}$$
It follows from the Kantorovich-Rubinstein theorem (1958) that for any $0 < r \leq 1$,
$$W_r(\mu,\nu) = \zeta_r(\mu,\nu). \tag{1.3}$$
For probability laws on the real line, Rio (1998) proved that for any $r > 1$,
$$W_r(\mu,\nu) \leq c_r \big( \zeta_r(\mu,\nu) \big)^{1/r}, \tag{1.4}$$
where $c_r$ is a constant depending only on $r$. In this paper, we are mainly interested in rates of the form
$$\zeta_r\big(P_{n^{-1/2}S_n}, G_{\sigma^2}\big) = O\big(n^{1-p/2}\big) \tag{1.6}$$
for $r \in [p-2, p]$ and $p \in ]2,3]$.
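For later use (in particular when comparing $G_{\sigma^2}$ with $G_{\sigma_n^2}$ in Section 3), note that (1.1) can be computed explicitly for two centered Gaussian laws: since the quantile function of $G_{\sigma^2}$ is $u \mapsto \sigma\,\Phi^{-1}(u)$, with $\Phi$ the d.f. of $Y \sim N(0,1)$,
$$W_r\big(G_{\sigma^2}, G_{\tau^2}\big) = \Big(\int_0^1 \big|(\sigma-\tau)\,\Phi^{-1}(u)\big|^r\,du\Big)^{1/r} = |\sigma-\tau|\,\big(\mathbb E|Y|^r\big)^{1/r} \qquad\text{for } r \geq 1.$$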
Our paper is organized as follows. In Section 2, we give projective conditions for stationary martingale difference sequences to satisfy (1.6) in the case $(r,p) \neq (1,3)$. To be more precise, let $(X_i)_{i\in\mathbb Z}$ be a stationary sequence of martingale differences with respect to some $\sigma$-algebras $(\mathcal F_i)_{i\in\mathbb Z}$ (see Section 1.1 below for the definition of $(\mathcal F_i)_{i\in\mathbb Z}$). As a consequence of our Theorem 2.1, we obtain that if $(X_i)_{i\in\mathbb Z}$ is in $\mathbb L^p$ with $p \in ]2,3]$ and satisfies (1.7), then the upper bound (1.6) holds provided that $(r,p) \neq (1,3)$. In the case $r = 1$ and $p = 3$, we obtain the upper bound $W_1(P_{n^{-1/2}S_n}, G_{\sigma^2}) = O(n^{-1/2}\log n)$. In Section 3, starting from the coboundary decomposition going back to Gordin (1969), and using the results of Section 2, we obtain $\mathbb L^p$-projective criteria ensuring (1.6) (if $(r,p) \neq (1,3)$). For instance, if $(X_i)_{i\in\mathbb Z}$ is a stationary sequence of $\mathbb L^p$ random variables adapted to $(\mathcal F_i)_{i\in\mathbb Z}$, we obtain (1.6) for any $p \in ]2,3[$ and any $r \in [p-2,p]$ provided that (1.7) holds and $\mathbb E(S_n \mid \mathcal F_0)$ converges in $\mathbb L^p$. In the case where $p = 3$, this last condition has to be strengthened. Our approach also makes it possible to treat the case of non-adapted sequences.
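For the reader's convenience, here is a minimal sketch of the coboundary decomposition alluded to above, in the adapted case and under the assumption that $\theta_0 := \sum_{k\geq 0}\mathbb E(X_k \mid \mathcal F_0)$ converges in $\mathbb L^p$ (the notation $\theta_i$, $D_i$ is ours, not the paper's):
$$\theta_i := \theta_0 \circ T^i, \qquad D_i := \theta_i - \mathbb E(\theta_i \mid \mathcal F_{i-1}), \qquad X_i = D_{i+1} + \theta_i - \theta_{i+1},$$
so that $S_n = \sum_{i=2}^{n+1} D_i + \theta_1 - \theta_{n+1}$: a stationary martingale plus a remainder bounded in $\mathbb L^p$, to which the results of Section 2 can be applied.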
Section 4 is devoted to applications. In particular, we give sufficient conditions for some functions of Harris recurrent Markov chains and for functions of linear processes to satisfy the bound (1.6) in the case $(r,p) \neq (1,3)$, and the rate $O(n^{-1/2}\log n)$ when $r = 1$ and $p = 3$. Since projective criteria are verified under weak dependence assumptions, we give an application to functions of $\phi$-dependent sequences in the sense of Dedecker and Prieur (2007). These conditions apply to unbounded functions of uniformly expanding maps.

Preliminary notations
Throughout the paper, $Y$ is a $N(0,1)$-distributed random variable. We shall also use the following notations. Let $(\Omega, \mathcal A, \mathbb P)$ be a probability space, and let $T : \Omega \to \Omega$ be a bijective bimeasurable transformation preserving the probability $\mathbb P$. For a $\sigma$-algebra $\mathcal F_0$ satisfying $\mathcal F_0 \subseteq T^{-1}(\mathcal F_0)$, we define the nondecreasing filtration $(\mathcal F_i)_{i\in\mathbb Z}$ by $\mathcal F_i = T^{-i}(\mathcal F_0)$. We shall sometimes denote by $\mathbb E_i$ the conditional expectation with respect to $\mathcal F_i$. Let $X_0$ be a zero mean random variable with finite variance, and define the stationary sequence $(X_i)_{i\in\mathbb Z}$ by $X_i = X_0 \circ T^i$.
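A canonical instance of this setting (given here only as an illustration) is the coordinate shift:
$$\Omega = \mathbb R^{\mathbb Z}, \qquad (T\omega)_i = \omega_{i+1}, \qquad \mathcal F_0 = \sigma(\omega_i : i \leq 0), \qquad X_0(\omega) = \omega_0,$$
with $\mathbb P$ the law of a stationary sequence with zero mean and finite variance; then $T^{-1}(\mathcal F_0) = \sigma(\omega_i : i \leq 1) \supseteq \mathcal F_0$ and $X_i = X_0 \circ T^i$ is the $i$-th coordinate.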
Stationary sequences of martingale differences
In this section we give bounds for the ideal distance of order $r$ in the central limit theorem for stationary martingale difference sequences $(X_i)_{i\in\mathbb Z}$ under projective conditions.

Notation 2.1. For any $p > 2$, define the envelope norm $\|\cdot\|_{1,\Phi,p}$ by
$$\|X\|_{1,\Phi,p} = \int_0^1 \big(1 \vee \Phi^{-1}(1 - u/2)\big)^{p-2}\, Q_X(u)\, du,$$
where $Q_X$ denotes the quantile function of $|X|$, and $\Phi$ denotes the d.f. of the $N(0,1)$ law.
Theorem 2.1. Let $(X_i)_{i\in\mathbb Z}$ be a stationary martingale difference sequence with respect to $(\mathcal F_i)_{i\in\mathbb Z}$, and set $\sigma^2 = \mathbb E(X_0^2)$. Assume that $\mathbb E|X_0|^p < \infty$ for some $p \in ]2,3]$, and that the conditions (2.1) and (2.2) hold. Then, for any $r \in [p-2,p]$ with $(r,p) \neq (1,3)$, $\zeta_r(P_{n^{-1/2}S_n}, G_{\sigma^2}) = O(n^{1-p/2})$, and for $(r,p) = (1,3)$, $W_1(P_{n^{-1/2}S_n}, G_{\sigma^2}) = O(n^{-1/2}\log n)$. This bound was obtained by Sakhanenko (1985) in the independent case. For $p < 3$, we have $W_1(P_{n^{-1/2}S_n}, G_{\sigma^2}) = O(n^{1-p/2})$. This bound was obtained by Ibragimov (1966) in the independent case.
Taking $r = p-2$, it follows that under the assumptions of Theorem 2.1,
$$\sup_x |F_n(x) - \Phi_\sigma(x)| \leq c\,\big(\zeta_{p-2}(P_{n^{-1/2}S_n}, G_{\sigma^2})\big)^{1/(p-1)} = O\big(n^{-(p-2)/(2(p-1))}\big) \tag{2.3}$$
up to a $\log n$ factor when $p = 3$ (see also Wu and Zhao (2006)), where $F_n$ is the distribution function of $n^{-1/2}S_n$ and $\Phi_\sigma$ is the d.f. of $G_{\sigma^2}$. Applying then the result in Heyde and Brown (1970), we get that if $(X_i)_{i\in\mathbb Z}$ is a stationary martingale difference sequence in $\mathbb L^p$ such that (2.2) is satisfied, then $\sup_x |F_n(x) - \Phi_\sigma(x)| = O(n^{-(p-2)/(2(p+1))})$. Consequently the bounds obtained in (2.3) improve the one given in Heyde and Brown (1970), provided that (2.1) holds.
If the condition (2.4) holds, then the conditions (2.1) and (2.2) hold for $p = 3$; consequently, if (2.4) holds, the conclusion of Remark 2.3 applies. This result has to be compared with Theorem 6 in Jan (2001).

Remark 2.5. Notice that if $(X_i)_{i\in\mathbb Z}$ is a stationary martingale difference sequence, then the conditions (2.1) and (2.2) are respectively equivalent to the same conditions with the sums taken along the dyadic integers. To see this, let $A_n$ and $B_n$ denote the quantities appearing in (2.1) and (2.2) respectively. We first show that $(A_n)$ and $(B_n)$ are subadditive sequences. Indeed, by the martingale property and the stationarity of the sequence, for all positive $i$ and $j$,
$$\mathbb E(S_{i+j}^2 \mid \mathcal F_0) = \mathbb E(S_i^2 \mid \mathcal F_0) + \mathbb E\big((S_{i+j} - S_i)^2 \mid \mathcal F_0\big).$$
Proceeding as in the proof of (4.6), p. 65 in Rio (2000), one can prove that, for any $\sigma$-field $\mathcal A$ and any integrable random variable $X$, $\|\mathbb E(X \mid \mathcal A)\|_{1,\Phi,p} \leq \|X\|_{1,\Phi,p}$. Hence, by stationarity, it follows that $A_{i+j} \leq A_i + A_j$. Similarly $B_{i+j} \leq B_i + B_j$. The proof of the equivalences then follows by using the same arguments as in the proof of Lemma 2.7 in Peligrad and Utev (2005).
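The identity above boils down to the vanishing of the cross terms, which is immediate from the martingale property: for $1 \leq a < b$,
$$\mathbb E(X_a X_b \mid \mathcal F_0) = \mathbb E\big(X_a\, \mathbb E(X_b \mid \mathcal F_{b-1}) \mid \mathcal F_0\big) = 0,$$
since $X_a$ is $\mathcal F_{b-1}$-measurable and $\mathbb E(X_b \mid \mathcal F_{b-1}) = 0$.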

Rates of convergence for stationary sequences
In this section, we give estimates of the ideal distances of order $r$ for stationary sequences which are not necessarily adapted to the filtration $(\mathcal F_i)_{i\in\mathbb Z}$.
Theorem 3.1. Let $(X_i)_{i\in\mathbb Z}$ be a stationary sequence of centered random variables in $\mathbb L^p$ with $p \in ]2,3[$, and let $\sigma_n^2 = n^{-1}\mathbb E(S_n^2)$. Assume that the conditions (3.1) and (3.2) hold. Then the series $\sum_{k\in\mathbb Z} \operatorname{Cov}(X_0, X_k)$ converges to some nonnegative $\sigma^2$, and
1. for $r \in [p-2, 2]$, $\zeta_r(P_{n^{-1/2}S_n}, G_{\sigma^2}) = O(n^{1-p/2})$;
2. for $r \in ]2, p]$, $\zeta_r(P_{n^{-1/2}S_n}, G_{\sigma_n^2}) = O(n^{1-p/2})$.

Remark 3.1. According to the bound (5.35), we infer that, under the assumptions of Theorem 3.1, the condition (3.2) can be restated in an equivalent form. The same remark applies to the next theorem with $p = 3$.
Remark 3.2. The result of item 1 is valid with $\sigma_n$ instead of $\sigma$. On the contrary, the result of item 2 is no longer true if $\sigma_n$ is replaced by $\sigma$, because for $r \in ]2,3]$ a necessary condition for $\zeta_r(\mu,\nu)$ to be finite is that the first two moments of $\mu$ and $\nu$ are equal. Note that under the assumptions of Theorem 3.1, both $W_r(P_{n^{-1/2}S_n}, G_{\sigma^2})$ and $W_r(P_{n^{-1/2}S_n}, G_{\sigma_n^2})$ are of the order of $n^{-(p-2)/(2\max(1,r))}$. Indeed, in the case where $r \in ]2,p]$, one has
$$W_r(P_{n^{-1/2}S_n}, G_{\sigma^2}) \leq W_r(P_{n^{-1/2}S_n}, G_{\sigma_n^2}) + W_r(G_{\sigma_n^2}, G_{\sigma^2}),$$
and the second term is of the order of $|\sigma_n - \sigma|$ (see the Gaussian computation following (1.1)).
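The moment-matching constraint used in this remark can be checked directly on the definition of $\Lambda_r$: for $r \in ]2,3]$ one has $l = 2$, and for every $\lambda > 0$ the functions $x \mapsto \pm\lambda x$ and $x \mapsto \pm\lambda x^2/2$ have constant second derivative, hence belong to $\Lambda_r$. Therefore
$$\zeta_r(\mu,\nu) \geq \lambda\,\Big|\int x\,\mu(dx) - \int x\,\nu(dx)\Big| \qquad\text{and}\qquad \zeta_r(\mu,\nu) \geq \frac{\lambda}{2}\,\Big|\int x^2\,\mu(dx) - \int x^2\,\nu(dx)\Big|$$
for every $\lambda > 0$, so that $\zeta_r(\mu,\nu) = \infty$ as soon as the first two moments of $\mu$ and $\nu$ differ.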
In the case where $p = 3$, the condition (3.1) has to be strengthened.

Theorem 3.2. Let $(X_i)_{i\in\mathbb Z}$ be a stationary sequence of centered random variables in $\mathbb L^3$, and let $\sigma_n^2 = n^{-1}\mathbb E(S_n^2)$. Assume that (3.3) holds, and assume in addition that (3.4) holds. Then the series $\sum_{k\in\mathbb Z}\operatorname{Cov}(X_0, X_k)$ converges to some nonnegative $\sigma^2$, and the conclusions of Theorem 3.1 hold with $p = 3$, with an additional factor $\log n$ in the rates when $r = 1$.

Martingale difference sequences and functions of Markov chains
Recall that the strong mixing coefficient of Rosenblatt (1956) between two $\sigma$-algebras $\mathcal A$ and $\mathcal B$ is defined by
$$\alpha(\mathcal A, \mathcal B) = \sup\big\{ |\mathbb P(A \cap B) - \mathbb P(A)\mathbb P(B)| : A \in \mathcal A,\ B \in \mathcal B \big\}.$$
Let $Q$ be the quantile function of $|X_0|$, that is, the càdlàg inverse of the tail function $x \mapsto \mathbb P(|X_0| > x)$. According to the results of Section 2, the following proposition holds.
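Two elementary facts about $\alpha$ (recalled here for orientation) follow from writing it as a covariance of indicators:
$$\alpha(\mathcal A,\mathcal B) = \sup_{A\in\mathcal A,\,B\in\mathcal B}\big|\operatorname{Cov}(\mathbf 1_A, \mathbf 1_B)\big| \leq \sup_{A,B}\sqrt{\operatorname{Var}(\mathbf 1_A)\,\operatorname{Var}(\mathbf 1_B)} \leq \frac14,$$
and $\alpha(\mathcal A,\mathcal B) = 0$ if and only if $\mathcal A$ and $\mathcal B$ are independent.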
Proposition 4.1. Let $(X_i)_{i\in\mathbb Z}$ be a stationary martingale difference sequence. Assume moreover that the two series in (4.1) converge. Then the conclusions of Theorem 2.1 hold. A simpler sufficient condition can also be formulated; it is always strictly stronger than the condition (4.1) when $p = 3$.

We now give an example. Consider the homogeneous Markov chain $(Y_i)_{i\in\mathbb Z}$ with state space $\mathbb Z$ described at page 320 in Davydov (1973). The transition probabilities are given there in terms of a sequence $(a_k)$ of positive numbers. This chain is irreducible and aperiodic. It is Harris positively recurrent as soon as $\sum_{n\geq 2} \prod_{k=1}^{n-1} a_k < \infty$. In that case the stationary chain is strongly mixing in the sense of Rosenblatt (1956).
Denote by $K$ the Markov kernel of the chain $(Y_i)_{i\in\mathbb Z}$. The functions $f$ such that $K(f) = 0$ almost everywhere are obtained as linear combinations of two functions $f_1$ and $f_2$ described in Davydov (1973); for such an $f$, the sequence $(f(Y_i))_{i\in\mathbb Z}$ is a stationary sequence of martingale differences. The condition (4.1) holds as soon as a suitable moment condition is satisfied under $\mathbb P_0$, where $\mathbb P_0$ is the probability of the chain starting from 0; in that case (4.1) is satisfied and the conclusion of Theorem 2.1 holds.
Remark 4.2. If $f$ is bounded and $K(f) = 0$, the central limit theorem may fail to hold for $n^{-1/2}S_n$. We refer to Example 2, page 321, in Davydov (1973), where $S_n$, properly normalized, converges to a stable law with exponent strictly less than 2.
Proof of Proposition 4.1. Let $B_p(\mathcal F_0)$ be the set of $\mathcal F_0$-measurable random variables $Z$ such that $\|Z\|_p \leq 1$. We first notice that
$$\big\|\mathbb E(X_k^2 \mid \mathcal F_0) - \mathbb E(X_k^2)\big\|_{p/2} = \sup\big\{\operatorname{Cov}(Z, X_k^2) : Z \in B_{p/(p-2)}(\mathcal F_0)\big\}.$$
Applying Rio's covariance inequality (1993), we get an upper bound for the right-hand side in terms of $\alpha(\mathcal F_0, \sigma(X_k))$ and the quantile function $Q$, which shows that the convergence of the second series in (4.1) implies (2.2). Now, from Fréchet (1957), we have a similar variational representation for the envelope norm $\|\cdot\|_{1,\Phi,p}$ appearing in (2.1). Applying again Rio's covariance inequality (1993), we get an upper bound which shows that the convergence of the first series in (4.1) implies (2.1).
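For the reader's convenience, the covariance inequality invoked twice above is (Rio, 1993): for $Y$ an $\mathcal A$-measurable and $Z$ a $\mathcal B$-measurable integrable random variable,
$$|\operatorname{Cov}(Y, Z)| \leq 2 \int_0^{2\alpha(\mathcal A,\mathcal B)} Q_Y(u)\, Q_Z(u)\, du,$$
where $Q_Y$ and $Q_Z$ are the quantile functions of $|Y|$ and $|Z|$.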

Linear processes and functions of linear processes
Theorem 4.1. Let $(a_i)_{i\in\mathbb Z}$ be a sequence of real numbers in $\ell^2$ such that $\sum_{i\in\mathbb Z} a_i$ converges to some real $A$. Let $(\varepsilon_i)_{i\in\mathbb Z}$ be a stationary sequence of martingale differences in $\mathbb L^p$ for some $p \in ]2,3]$, and let $X_k = \sum_{i\in\mathbb Z} a_i\,\varepsilon_{k-i}$. Under the conditions (4.2) and (4.3), the conclusions of items 1-4 hold.

Remark 4.3. If the condition given by Heyde (1975) holds, that is (4.4), then $A_n = O(1)$, so that all the conditions of items 1-4 are satisfied. On the other hand, one has the bound (4.5).

Proof of Theorem 4.1. We start with the decomposition $S_n = A\sum_{k=1}^n \varepsilon_k + R_n$ (a sketch is given at the end of this subsection), which implies that $\sigma_n$ converges to $\sigma$. We now give an upper bound for $\|R_n\|_p$. From Burkholder's inequality, there exists a constant $C$ such that the bound (4.7) holds. The result follows by applying Theorem 2.1 to the martingale $A\sum_{k=1}^n \varepsilon_k$ (this is possible because of (4.3)), and by using Lemma 5.2 with the upper bound (4.7). To prove Remark 4.3, it follows easily that $A_n = O(1)$ under (4.4). To prove the bound (4.5), note first that $n(T_{n+1}^2 + Q_{n+1}^2) \leq B_n$, from which (4.5) follows.

In the next result, we shall focus on functions of real-valued linear processes:
$$X_k = h\Big(\sum_{i\in\mathbb Z} a_i\,\varepsilon_{k-i}\Big) - \mathbb E\, h\Big(\sum_{i\in\mathbb Z} a_i\,\varepsilon_{k-i}\Big), \tag{4.8}$$
where $(\varepsilon_i)_{i\in\mathbb Z}$ is a sequence of iid random variables. Denote by $w_h(\cdot, M)$ the modulus of continuity of the function $h$ on the interval $[-M, M]$, that is,
$$w_h(t, M) = \sup\big\{ |h(x) - h(y)| : |x-y| \leq t,\ |x| \leq M,\ |y| \leq M \big\}.$$

Theorem 4.2. Let $(a_i)_{i\in\mathbb Z}$ be a sequence of real numbers in $\ell^2$ and let $(\varepsilon_i)_{i\in\mathbb Z}$ be a sequence of iid random variables in $\mathbb L^2$. Let $X_k$ be defined as in (4.8) and $\sigma_n^2 = n^{-1}\mathbb E(S_n^2)$. Assume that $h$ is $\gamma$-Hölder on any compact set, with $w_h(t,M) \leq C t^\gamma M^\alpha$, for some $C > 0$, $\gamma \in ]0,1]$ and $\alpha \geq 0$. If (4.9) holds for some $p \in ]2,3]$, then the series $\sum_{k\in\mathbb Z}\operatorname{Cov}(X_0, X_k)$ converges to some nonnegative $\sigma^2$; moreover, the corresponding rates hold for any $r \in [p-2,p]$ with $(r,p) \neq (1,3)$, with the rate $O(n^{-1/2}\log n)$ when $(r,p) = (1,3)$.

Proof of Theorem 4.2. Theorem 4.2 is a consequence of the following proposition:

Proposition 4.2. Let $(a_i)_{i\in\mathbb Z}$, $(\varepsilon_i)_{i\in\mathbb Z}$ and $(X_i)_{i\in\mathbb Z}$ be as in Theorem 4.2, and let $(\varepsilon'_i)_{i\in\mathbb Z}$ be an independent copy of $(\varepsilon_i)_{i\in\mathbb Z}$. If the two conditions in (4.10) hold, then the conditions of Theorems 3.1 and 3.2 are satisfied.

To prove Theorem 4.2, it remains to check (4.10). We only check the first condition. Since $w_h(t,M) \leq Ct^\gamma M^\alpha$ and the random variables $\varepsilon_i$ are iid, the quantity to be controlled is bounded in terms of moments of $\sum_{j\geq i} a_j\,\varepsilon_{-j}$. From Burkholder's inequality, for any $\beta > 0$, the $2\beta$-th moment of $\sum_{j\geq i} a_j\,\varepsilon_{-j}$ is controlled, up to a constant, by $\big(\sum_{j\geq i} a_j^2\big)^\beta$. Applying this inequality with $\beta = \gamma$ or $\beta = \alpha + \gamma$, we infer that the first part of (4.10) holds under (4.9). The second part can be handled in the same way.
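Returning to the proof of Theorem 4.1: here is a hedged sketch of the decomposition it starts from (the coefficients $b_{n,j}$ are our notation). Writing $S_n = \sum_{k=1}^n \sum_{i\in\mathbb Z} a_i\,\varepsilon_{k-i}$ and exchanging the order of summation,
$$S_n = \sum_{j\in\mathbb Z} b_{n,j}\,\varepsilon_j \quad\text{with}\quad b_{n,j} = \sum_{k=1}^n a_{k-j}, \qquad R_n := S_n - A\sum_{k=1}^n \varepsilon_k = \sum_{j\in\mathbb Z}\big(b_{n,j} - A\,\mathbf 1_{1\leq j\leq n}\big)\,\varepsilon_j.$$
The main term $A\sum_{k=1}^n \varepsilon_k$ is a martingale, to which Theorem 2.1 applies, while $\|R_n\|_p$ is the quantity controlled via Burkholder's inequality in (4.7).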
Proof of Proposition 4.2.
We shall first prove that the condition (3.2) of Theorem 3.1 holds. We write the quantity appearing in (3.2) as a sum of two terms, and we first control the second one. Let $\varepsilon'$ be an independent copy of $\varepsilon$, and denote by $\mathbb E_\varepsilon(\cdot)$ the conditional expectation with respect to $\varepsilon$. Replacing the innovations indexed by the past by their independent copies and using the Hölder control on $h$, we get a coupling bound; by subadditivity, this bound extends to the whole term, which is therefore of the required order provided that the first condition in (4.10) holds.
We now turn to the control of the first term. Using the same arguments as before, we get that it is also of the required order provided that (4.10) holds. This completes the proof of (3.2). Using the same arguments, one can easily check that the condition (3.1) of Theorem 3.1 (and also the condition (3.4) of Theorem 3.2 in the case $p = 3$) holds under (4.10).

Functions of φ-dependent sequences
In order to include examples of dynamical systems satisfying some correlation inequalities, we introduce a weak version of the uniform mixing coefficients (see Dedecker and Prieur (2007)).

Definition 4.1. For any random variable $Y = (Y_1, \ldots, Y_k)$ with values in $\mathbb R^k$ and any $\sigma$-algebra $\mathcal M$, let
$$\phi(\mathcal M, Y) = \sup_{(x_1,\ldots,x_k)\in\mathbb R^k} \Big\| \mathbb P\big(Y_1 \leq x_1, \ldots, Y_k \leq x_k \mid \mathcal M\big) - \mathbb P\big(Y_1 \leq x_1, \ldots, Y_k \leq x_k\big) \Big\|_\infty.$$
For a sequence $\mathbf Y = (Y_i)_{i\in\mathbb Z}$, where $Y_i = Y_0 \circ T^i$ and $Y_0$ is an $\mathcal F_0$-measurable random variable, define the coefficients
$$\phi_{k,\mathbf Y}(i) = \max_{1\leq l\leq k}\ \sup_{i\leq j_1 < \cdots < j_l} \phi\big(\mathcal F_0, (Y_{j_1}, \ldots, Y_{j_l})\big).$$

Proposition 4.3. Let $\mathbf Y = (Y_i)_{i\in\mathbb Z}$ be as in Definition 4.1, let $h$ satisfy the regularity assumption of Theorem 4.2, and let $(X_i)_{i\in\mathbb Z}$ be the corresponding centered functionals. Assume that the condition (4.11) on the coefficients $\phi_{2,\mathbf Y}$ holds. Then the conclusions of Theorem 4.2 hold. The proof relies on Lemma 4.1 below, which bounds the quantities appearing in the conditions of Section 3 in terms of $\phi_{1,\mathbf Y}$ and $\phi_{2,\mathbf Y}$.
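As a point of comparison, these coefficients are always dominated by the usual uniform mixing coefficients: essentially by definition, for each $k$ and each choice of $i \leq j_1 < \cdots < j_l$,
$$\phi\big(\mathcal F_0, (Y_{j_1},\ldots,Y_{j_l})\big) \leq \phi\big(\mathcal F_0, \sigma(Y_j,\ j \geq i)\big),$$
so any uniformly mixing ($\phi$-mixing) sequence satisfies $\phi_{k,\mathbf Y}(i) \leq \phi(i)$, while the converse fails: the weaker coefficients remain finite and decay for dynamical systems such as the expanding maps considered below.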
Proof of Proposition 4.3. Let $B_p(\mathcal F_0)$ be the set of $\mathcal F_0$-measurable random variables $Z$ such that $\|Z\|_p \leq 1$. We first express the quantities to be bounded as suprema of covariances over $B_p(\mathcal F_0)$. According to Corollary 6.2, and since $\phi(\sigma(Z), Y_k) \leq \phi_{1,\mathbf Y}(k)$, we get the required bounds.

Proof of Lemma 4.1. Since $w_h(t,M) \leq Ct^\gamma M^\alpha$, we infer that there exists $C > 0$ such that the bound (4.12) holds. We shall bound $\|\mathbb E(X_i X_{k+i} \mid \mathcal F_0) - \mathbb E(X_i X_{k+i})\|_{p/2}$ in two ways. First, using the stationarity and the upper bound (4.12), we get a first estimate. Next, using again Corollary 6.2, we get a second one. From (4.13) and the above upper bounds, we infer that the conclusion of Lemma 4.1 holds provided that two series built on the coefficients $\phi_{1,\mathbf Y}$ and $\phi_{2,\mathbf Y}$ converge. Here, note that the terms of the second series are dominated, for some $D > 0$, by expressions involving the first. Since $\phi_{1,\mathbf Y}(k) \leq \phi_{2,\mathbf Y}(k)$, the conclusion of Lemma 4.1 holds provided the first series converges; one can prove that the second series converges provided the first one does.

Application to expanding maps
Let $BV$ be the class of bounded variation functions from $[0,1]$ to $\mathbb R$. For any $h \in BV$, denote by $\|dh\|$ the variation norm of the measure $dh$.
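For instance (a one-line illustration of the norm $\|dh\|$): for a single jump,
$$h = \mathbf 1_{[0,t]} \in BV, \qquad dh = -\delta_t, \qquad \|dh\| = 1,$$
so $\|dh\|$ measures the total variation of $h$ over $[0,1]$.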

Let $T$ be a map from $[0,1]$ to $[0,1]$ preserving a probability $\mu$ on $[0,1]$, and let $K$ be its Perron-Frobenius operator with respect to $\mu$, characterized by the duality relation
$$\int_0^1 (Kh)(x)\,g(x)\,\mu(dx) = \int_0^1 h(x)\,(g\circ T)(x)\,\mu(dx) \tag{4.14}$$
for suitable functions $h$ and $g$.
A Markov kernel $K$ is said to be BV-contracting if there exist $C > 0$ and $\rho \in [0,1[$ such that
$$\|dK^n(h)\| \leq C \rho^n\, \|dh\| \qquad\text{for any } h \in BV.$$
The map $T$ is said to be BV-contracting if its Perron-Frobenius operator is BV-contracting. Let us present a large class of BV-contracting maps. We shall say that $T$ is uniformly expanding if it belongs to the class $\mathcal C$ defined in Broise (1996), Section 2.1, page 11. Recall that if $T$ is uniformly expanding, then there exists a probability measure $\mu$ on $[0,1]$, whose density $f_\mu$ with respect to the Lebesgue measure is a bounded variation function, and such that $\mu$ is invariant by $T$. Consider now the more restrictive conditions:
(a) $T$ is uniformly expanding;
(b) the invariant measure $\mu$ is unique and $(T, \mu)$ is mixing in the ergodic-theoretic sense;
(c) $\frac{1}{f_\mu}\mathbf 1_{f_\mu > 0}$ is a bounded variation function.
Starting from Proposition 4.11 in Broise (1996), one can prove that if $T$ satisfies the assumptions (a), (b) and (c) above, then it is BV-contracting (see for instance Dedecker and Prieur (2007), Section 6.3). Some well known examples of maps satisfying the conditions (a), (b) and (c) are:
1. $T(x) = \beta x - [\beta x]$ for $\beta > 1$. These maps are called $\beta$-transformations.
2. $I$ is the finite union of disjoint intervals $(I_k)_{1\leq k\leq n}$, and $T(x) = a_k x + b_k$ on $I_k$, with $|a_k| > 1$.
3. $T(x) = a/x - [a/x]$ for some $a > 0$. For $a = 1$, this transformation is known as the Gauss map.
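To make (4.14) concrete, here is a computation of ours (not from the original) for the simplest example above, the $\beta$-transformation with $\beta = 2$, for which the Lebesgue measure is invariant: the change of variables $u = T(x)$ on each branch gives
$$(Kh)(u) = \frac12\Big(h\Big(\frac u2\Big) + h\Big(\frac{u+1}{2}\Big)\Big),$$
an average of $h$ over the two preimages of $u$. One checks directly that $\|d(Kh)\| \leq \frac12 \|dh\|$, so this $K$ is BV-contracting with $\rho = 1/2$.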
Proposition 4.4. Let $T$ be a BV-contracting map with invariant measure $\mu$, and let $f$ be such that the stationary sequence $(X_i)_{i\geq 1}$, with $X_i = f \circ T^i - \mu(f)$, satisfies the assumptions of Proposition 4.3. Then $n^{-1}\operatorname{Var}(X_1 + \cdots + X_n)$ converges to some nonnegative $\sigma^2$, and the conclusions of Theorem 4.2 hold.

Proof of Proposition 4.4. Let $(Y_i)_{i\geq 1}$ be the Markov chain with transition kernel $K$ and invariant measure $\mu$. Using the equation (4.14), it is easy to see that $(Y_0, \ldots, Y_n)$ is distributed as $(T^{n+1}, \ldots, T)$. Consequently, to prove Proposition 4.4, it suffices to prove that the sequence $(f(Y_i))_{i\geq 1}$ satisfies the condition (4.11) of Proposition 4.3. According to Lemma 1 in Dedecker and Prieur (2007), the coefficients $\phi_{2,\mathbf Y}(i)$ of the chain $(Y_i)_{i\geq 0}$ with respect to $\mathcal F_i = \sigma(Y_j,\ j \leq i)$ satisfy $\phi_{2,\mathbf Y}(i) \leq C\rho^i$ for some $\rho \in ]0,1[$ and some positive constant $C$. It follows that (4.11) is satisfied for $s = p$.

Proofs of the main results
From now on, we denote by $C$ a numerical constant which may vary from line to line.

Notation 5.1. For $l$ an integer, $q$ in $]l, l+1]$ and $f$ an $l$-times continuously differentiable function, we set
$$|f|_{\Lambda_q} = \sup_{x \neq y} \frac{|f^{(l)}(x) - f^{(l)}(y)|}{|x-y|^{q-l}},$$
so that, by (1.2), $\Lambda_q = \{f : |f|_{\Lambda_q} \leq 1\}$.

Proof of Theorem 2.1
We prove Theorem 2.1 in the case $\sigma = 1$; the general case follows by dividing the random variables by $\sigma$. Since $\zeta_r(P_{aX}, P_{aY}) = |a|^r \zeta_r(P_X, P_Y)$, it is enough to bound $\zeta_r(P_{S_n}, G_n)$. We first give an upper bound for $\zeta_{p,N} := \zeta_p(P_{S_{2^N}}, G_{2^N})$.

Proposition 5.1. Let $(X_i)_{i\in\mathbb Z}$ be a stationary martingale difference sequence, and let $M_p = \mathbb E(|X_0|^p)$. Then, for any $p$ in $]2,3]$ and any natural integer $N$, the bound (5.1) holds.

Proof of Proposition 5.1. The proof is done by induction on $N$. Let $(Y_i)_{i\in\mathbb N}$ be a sequence of $N(0,1)$-distributed independent random variables, independent of the sequence $(X_i)_{i\in\mathbb Z}$. From the independence of the above sequences, we may compare $S_n$ with the Gaussian sum by replacing the $X_m$ one at a time by the $Y_m$. Next, from the Taylor integral formula at order two, for any two-times differentiable function $g$ and any $q$ in $]2,3]$,
$$\Big| g(x+h) - g(x) - h\,g'(x) - \frac{h^2}{2}\,g''(x) \Big| \leq \frac12\,|h|^q\, |g|_{\Lambda_q}. \tag{5.3}$$
Moreover, if $f$ belongs to $\Lambda_p$, then the smoothed function $f_{n-m}$ belongs to $\Lambda_p$. Hence, summing on $m$, we get an upper bound involving a term $D'$.
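A short verification of (5.3) as stated above (the constant $1/2$ follows from the computation): by the integral form of the Taylor remainder,
$$g(x+h) - g(x) - h\,g'(x) - \frac{h^2}{2}\,g''(x) = h^2 \int_0^1 (1-t)\big(g''(x+th) - g''(x)\big)\,dt,$$
and since $|g''(x+th) - g''(x)| \leq |g|_{\Lambda_q}\,(t|h|)^{q-2}$, the left-hand side is bounded by $|g|_{\Lambda_q}\,|h|^q \int_0^1 (1-t)\,t^{q-2}\,dt = \frac{|g|_{\Lambda_q}\,|h|^q}{q(q-1)} \leq \frac12\,|g|_{\Lambda_q}\,|h|^q$ for $q \in ]2,3]$.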
Suppose now that $n = 2^N$. To bound $D'$, we introduce a dyadic scheme.

Notation 5.2. Set $m_0 = m - 1$ and write $m_0$ in base 2:
$$m_0 = \sum_{i=0}^{N-1} b_i\,2^i \qquad\text{with } b_i \in \{0,1\}.$$
For the sake of brevity, let $m_L = 2^L \sum_{i\geq L} b_i\,2^{i-L}$ be the number obtained from $m_0$ by setting its first $L$ binary digits to zero, and let $I_{L,k} = \{k2^L + 1, \ldots, (k+1)2^L\}$. Since $m_N = 0$, the following elementary identity is valid:
$$m_0 = \sum_{L=0}^{N-1} (m_L - m_{L+1}).$$
Note that $\{m : m_L = k2^L\} = I_{L,k}$. Now, by the martingale property and the independence of $(X_i)_{i\in\mathbb N}$ and $(Y_i)_{i\in\mathbb N}$, we obtain the estimate (5.12). By using (1.2), we get the bound (5.13). We now prove (5.1) by induction on $N$: assume that $\zeta_{p,L}$ satisfies (5.1) for any $L$ in $[0, N-1]$.
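With the truncation convention above (our reconstruction of Notation 5.2), a small worked example for $N = 4$ and $m = 14$, i.e. $m_0 = 13 = (1101)_2$:
$$m_0 = 13,\quad m_1 = 12,\quad m_2 = 12,\quad m_3 = 8,\quad m_4 = 0, \qquad \sum_{L=0}^{3}(m_L - m_{L+1}) = 1 + 0 + 4 + 8 = 13 = m_0,$$
each nonzero increment $m_L - m_{L+1} = 2^L$ corresponding to a binary digit $b_L = 1$.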
Starting from (5.13), using the induction hypothesis and the fact that $\Delta'_K \leq \Delta'_N$, we get an upper bound which implies (5.1) for $\zeta_{p,N}$.
In order to prove Theorem 2.1, we will also need a smoothing argument. This is the purpose of the lemma below.
Proof of Lemma 5.1. Throughout the sequel, let $Y$ be a $N(0,1)$-distributed random variable, independent of the $\sigma$-field generated by the random variables $(X_i)_i$ and $(Y_i)_i$.
For $r \leq 2$, since $\zeta_r$ is an ideal metric with respect to convolution, the smoothing inequality follows at once, which implies Lemma 5.1 for $r \leq 2$. For $r > 2$, from (5.3), for any $f$ in $\Lambda_r$ we get a pointwise Taylor expansion. Taking the expectation and noting that $\mathbb E|Y|^r \leq r - 1$ for $r$ in $]2,3]$, we infer a first inequality. Obviously this inequality still holds for $T_n$ instead of $S_n$ and $-f$ instead of $f$, so that, adding the inequality so obtained, we get (5.17), which is derived from the Taylor formula at orders two and three. From (5.17) and Lemma 6.1, we obtain the desired bound. It remains to consider the case $r = p-2$ with $r < 1$. Applying Lemma 6.1, we get that (5.20) holds for any $i \geq 2$. It follows that the corresponding sum is controlled as well; consequently, Lemma 5.1 holds for $r = p-2$ and $r < 1$.

We now bound $\zeta^*_{p,N}$. Using the dyadic scheme as in the proof of Proposition 5.1, and then Lemma 6.1, we get a first estimate; proceeding as to get (5.12), we obtain a second one. By Remark 2.5, the conditions (2.1) and (2.2) entail the required summability of the terms involving $Z^{(0)}$. Hence, for some $\epsilon(N)$ tending to 0 as $N$ tends to infinity, one has the bound (5.23). Next, proceeding as in the proof of (5.7), we get a further estimate. If $r > p-2$ or $(r,p) = (1,3)$, from Lemma 6.1, the stationarity of $(X_i)_{i\in\mathbb N}$ and the above inequality, it follows that (5.24) holds, with an extra factor $L$ when $r = 1$ and $p = 3$. In the case $r = p-2$ and $r > 1$, we argue as follows.
Applying (5.20) with $i = 2$ and $i = 3$, and proceeding as to get (5.21), we obtain the bound (5.26). Consequently, combining (5.26) with the upper bounds (5.23), (5.24) and (5.25), we obtain a recursive inequality for $\zeta^*_{p,N}$. Let $N_0$ be such that $C\epsilon(N) \leq 1/2$ for $N \geq N_0$, and let $K \geq 1$ be such that $\zeta^*_{p,N_0} \leq K 2^{N_0}$. Choosing $K$ large enough such that $K \geq 2C$, we can easily prove by induction that $\zeta^*_{p,N} \leq K 2^N$ for any $N \geq N_0$. Hence Theorem 2.1 is proved in the case $r = p$.
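To make the induction step explicit under one reading of the recursion (an assumption on our part: we take the combined bound to be $\zeta^*_{p,N+1} \leq C\epsilon(N)\,\zeta^*_{p,N} + C\,2^N$), the closing argument runs:
$$\zeta^*_{p,N+1} \leq C\epsilon(N)\,K2^N + C\,2^N \leq \frac{K}{2}\,2^N + \frac{K}{2}\,2^N = K\,2^N \leq K\,2^{N+1},$$
using $C\epsilon(N) \leq 1/2$ for $N \geq N_0$ and $K \geq 2C$.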
For $r$ in $[p-2, p[$, Theorem 2.1 follows by taking into account the bound $\zeta^*_{p,N} \leq K 2^N$, valid for any $N \geq N_0$, in the inequalities (5.28) and (5.29).

Proof of Theorem 3.1
By (3.1), we get a martingale-coboundary decomposition of $S_n$ (see Volný (1993)), in which the remainder term is bounded under the second part of the condition (3.4). Next, the contribution of the corresponding random variable is controlled by using (3.4) and (5.40). Hence, taking (5.40) into account again, (5.50) will follow as soon as a last estimate holds, which is the case under the second part of the condition (3.4), by applying Lemma 5.3 with a suitable choice of the function $h$. This ends the proof of the theorem.

Covariance inequalities
In this section, we give an upper bound for the expectation of the product of $k$ centered random variables, $\prod_{i=1}^k \big(X_i - \mathbb E(X_i)\big)$.
Definition 6.1. For a quantile function $Q$ in $L^1([0,1], \lambda)$, let $F(Q, P_X)$ be the set of functions $f$ which are nondecreasing on some open interval of $\mathbb R$ and null elsewhere, and such that $Q_{|f(X)|} \leq Q$. Let $C(Q, P_X)$ denote the set of convex combinations $\sum_{i=1}^\infty \lambda_i f_i$ of functions $f_i$ in $F(Q, P_X)$, where $\sum_{i=1}^\infty |\lambda_i| \leq 1$ (note that the series $\sum_{i=1}^\infty \lambda_i f_i(X)$ converges almost surely and in $L^1(P_X)$).
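As an elementary illustration of Definition 6.1: for fixed $t$, the function $f_t = \mathbf 1_{]t,+\infty[}$ is nondecreasing on $]t,+\infty[$ and null elsewhere, and
$$Q_{|f_t(X)|}(u) = \mathbf 1_{\,u < \mathbb P(X > t)} \leq \mathbf 1_{[0,1]}(u),$$
so all such indicators (hence their convex combinations) belong to $F(Q, P_X)$ as soon as $Q \geq \mathbf 1_{[0,1]}$.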
Then, applying (6.3) to the right-hand side of (6.7), we derive the desired bound. Recall that for any $p \geq 1$, the class $C(p, M, P_X)$ has been introduced in Definition 4.2.

Corollary 6.2. Let $X = (X_1, \ldots, X_k)$ be a random variable with values in $\mathbb R^k$ and let the $\phi^{(i)}$'s be defined by (6.1). Let $(p_1, \ldots, p_k)$ be a $k$-tuple such that $1/p_1 + \cdots + 1/p_k = 1$, and let $(f_i)_{1\leq i\leq k}$ be $k$ functions from $\mathbb R$ to $\mathbb R$ such that $f_i \in C(p_i, M_i, P_{X_i})$. We have the corresponding inequality for $\big|\mathbb E\big(\prod_{i=1}^k (f_i(X_i) - \mathbb E f_i(X_i))\big)\big|$ in terms of the $\phi^{(i)}$'s and the $M_i$'s.
The proof is similar, and the result follows by a change of variables.