A total variation version of Breuer--Major Central Limit Theorem under $\mathbb{D}^{1,2}$ assumption

In this note, we establish a qualitative total variation version of Breuer--Major Central Limit Theorem for a sequence of the type $\frac{1}{\sqrt{n}} \sum_{1\leq k \leq n} f(X_k)$, where $(X_k)_{k\ge 1}$ is a centered stationary Gaussian process, under the hypothesis that the function $f$ has Hermite rank $d \geq 1$ and belongs to the Malliavin space $\mathbb D^{1,2}$. This result in particular extends the recent works of [NNP21], where a quantitative version of this result was obtained under the assumption that the function $f$ has Hermite rank $d= 2$ and belongs to the Malliavin space $\mathbb D^{1,4}$. We thus weaken the $\mathbb D^{1,4}$ integrability assumption to $\mathbb D^{1,2}$ and remove the restriction on the Hermite rank of the base function. While our method is still based on Malliavin calculus, we exploit a particular instance of Malliavin gradient called the sharp operator, which reduces the desired convergence in total variation to the convergence in distribution of a bidimensional Breuer--Major type sequence.


Framework and main result
Let us consider $X = (X_n)_{n\geq 1}$ a real-valued centered stationary Gaussian sequence with unit variance, defined on an abstract probability space $(\Omega, \mathcal F, \mathbb P)$. Let $\rho : \mathbb N \to \mathbb R$ be the associated correlation function, in other words $\rho(|k-\ell|) = \mathbb E[X_k X_\ell]$ for all $k, \ell \geq 1$. As usual, we denote by $\mathcal N(0,\sigma^2)$ the law of a centered normal variable with variance $\sigma^2$. Let $\gamma(dx) := (2\pi)^{-1/2} e^{-x^2/2}\,dx$ be the standard Gaussian measure on the real line and $\gamma_d = \otimes_{k=1}^d \gamma$ its analogue in $\mathbb R^d$. We then denote by $(H_m)_{m\geq 0}$ the family of Hermite polynomials, which are orthogonal with respect to $\gamma$, namely $H_0 \equiv 1$ and

$$H_m(x) := (-1)^m e^{x^2/2} \frac{d^m}{dx^m} e^{-x^2/2}, \quad m \geq 1.$$

We denote by $L^2(\mathbb R, \gamma)$ the space of real functions that are square integrable with respect to the Gaussian measure. Recall that a real function $f \in L^2(\mathbb R, \gamma)$ is said to have Hermite rank $d \geq 0$ if it can be decomposed as a sum of the form

$$f(x) = \sum_{m=d}^{+\infty} c_m H_m(x), \quad c_d \neq 0.$$

For integers $k, p \geq 1$, we further denote by $\mathbb D^{k,p}(\mathbb R, \gamma)$ the Malliavin-Sobolev space obtained as the completion of the family of polynomial functions $q : \mathbb R \to \mathbb R$ with respect to the norm

$$\|q\|_{k,p} := \left( \int_{\mathbb R} |q(x)|^p \gamma(dx) + \sum_{\ell=1}^{k} \int_{\mathbb R} |q^{(\ell)}(x)|^p \gamma(dx) \right)^{1/p},$$

where $q^{(\ell)}$ is the $\ell$-th derivative of $q$. Given a real function $f$, let us finally set

$$S_n(f) := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} f(X_k).$$

In this framework, the celebrated Central Limit Theorem (CLT) by Breuer and Major gives sufficient conditions on $\rho$ and $f$ under which the sequence $S_n(f)$ satisfies a CLT.
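Theorem 1 (Breuer-Major). If $f \in L^2(\mathbb R, \gamma)$ has Hermite rank $d \geq 1$ and if $\rho \in \ell^d(\mathbb N)$, namely $\sum_{k \in \mathbb N} |\rho(k)|^d < +\infty$, then the sequence $(S_n(f))_{n\geq 1}$ converges in distribution as $n$ goes to infinity to a normal distribution $\mathcal N(0, \sigma^2)$, where the limit variance is given by

$$\sigma^2 := \sum_{m=d}^{+\infty} m!\, c_m^2 \sum_{k \in \mathbb Z} \rho(|k|)^m,$$

with $c_m$ being the coefficients appearing in the Hermite expansion of $f$.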
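As a numerical illustration of the Hermite expansion and of the limit variance, the following minimal sketch computes the coefficients $c_m$ by Gauss-Hermite quadrature and evaluates $\sigma^2$. Everything specific here is an assumption made for the example and does not come from the paper: the test function $f(x) = x^3$, the geometric correlation $\rho(k) = a^k$ with $a = 0.5$, and the helper names. The finite-$n$ variance uses the standard identity $\mathbb E[H_m(X_k) H_{m'}(X_\ell)] = \delta_{m,m'}\, m!\, \rho(|k-\ell|)^m$.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He


def hermite_coefficients(f, max_degree, quad_points=40):
    """Return c_m = E[f(N) H_m(N)] / m! for m = 0..max_degree (N standard Gaussian)."""
    x, w = He.hermegauss(quad_points)   # nodes and weights for the weight exp(-x^2 / 2)
    w = w / math.sqrt(2.0 * math.pi)    # normalize so that the weights sum to 1
    coeffs = []
    for m in range(max_degree + 1):
        basis = np.zeros(m + 1)
        basis[m] = 1.0                  # selects the probabilists' Hermite polynomial H_m
        h_m = He.hermeval(x, basis)
        coeffs.append(float(np.sum(w * f(x) * h_m)) / math.factorial(m))
    return np.array(coeffs)


def hermite_rank(coeffs, tol=1e-8):
    """Index of the first non-vanishing coefficient (None if all vanish)."""
    nonzero = np.flatnonzero(np.abs(coeffs) > tol)
    return int(nonzero[0]) if nonzero.size else None


def limit_variance(coeffs, a):
    """sigma^2 = sum_m m! c_m^2 sum_{k in Z} a^{|k| m}, assuming rho(k) = a^k, 0 < a < 1."""
    total = 0.0
    for m, c in enumerate(coeffs):
        if m == 0:
            continue                    # the constant term plays no role for a centered f
        total += math.factorial(m) * c**2 * (1.0 + a**m) / (1.0 - a**m)
    return total


def variance_of_Sn(coeffs, a, n):
    """Exact Var(S_n(f)) for the truncated expansion, under the same rho(k) = a^k."""
    j = np.arange(1, n)
    total = 0.0
    for m, c in enumerate(coeffs):
        if m == 0:
            continue
        # (1/n) sum_{k,l} rho(|k-l|)^m = 1 + 2 sum_{j=1}^{n-1} (1 - j/n) rho(j)^m
        total += math.factorial(m) * c**2 * (1.0 + 2.0 * np.sum((1.0 - j / n) * a ** (m * j)))
    return float(total)


coeffs = hermite_coefficients(lambda x: x**3, max_degree=5)
sigma2 = limit_variance(coeffs, a=0.5)
var_n = variance_of_Sn(coeffs, a=0.5, n=2000)
print("Hermite rank:", hermite_rank(coeffs))
print("sigma^2:", sigma2, " Var(S_n) at n=2000:", var_n)
```

Since $x^3 = H_3(x) + 3 H_1(x)$, this example has Hermite rank $1$, and the finite-$n$ variance approaches $\sigma^2$ at rate $O(1/n)$ for this summable correlation.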
Recently, under mild additional assumptions, a series of articles has strengthened the above convergence in distribution into a convergence in total variation, with polynomial quantitative bounds, see e.g. [KN19, NPY19, NZ21, NNP21]. Recall that the total variation distance between the distributions of two real random variables $X$ and $Y$ is given by

$$d_{TV}(X, Y) := \sup_{A \in \mathcal B(\mathbb R)} \left| \mathbb P(X \in A) - \mathbb P(Y \in A) \right|,$$

where the supremum runs over $\mathcal B(\mathbb R)$, the Borel sigma field on the real line. To the best of our knowledge, the best statement so far in this direction is the following.

Theorem 2 (Theorem 1.2 in [NNP21]). Assume that $f \in L^2(\mathbb R, \gamma)$ has Hermite rank $d = 2$ and that it belongs to $\mathbb D^{1,4}(\mathbb R, \gamma)$. Suppose that $\rho \in \ell^d(\mathbb N)$ and that the variance $\sigma^2$ of Theorem 1 is positive. Then, there exists a constant $C > 0$ independent of $n$ such that the total variation distance between the normalized sequence $S_n(f)$ and the standard Gaussian law is bounded by $C$ times an explicit rate depending on $n$ and $\rho$; we refer to [NNP21] for the precise expression.

The goal of this note is to establish that the convergence in total variation in fact holds as soon as the function $f$ is in the Malliavin-Sobolev space $\mathbb D^{1,2}(\mathbb R, \gamma)$ and has Hermite rank $d \geq 1$.
Theorem 3. Suppose that $f \in \mathbb D^{1,2}(\mathbb R, \gamma)$ has Hermite rank $d \geq 1$. Suppose moreover that $\rho \in \ell^d(\mathbb N)$ and that the variance $\sigma^2$ of Theorem 1 is positive. Then, as $n$ goes to infinity,

$$d_{TV}\left( S_n(f), \mathcal N(0, \sigma^2) \right) \longrightarrow 0.$$

Note that, for the sake of simplicity, we only consider here a real Gaussian sequence $(X_n)_{n\geq 1}$ and a real function $f$, but our method is robust and would yield, under similar covariance and rank assumptions, a convergence in total variation for a properly renormalized multivariate analogue of $S_n(f)$.

The detailed proof of Theorem 3 is the object of the next section and the rest of the paper. Unsurprisingly, we use the Malliavin-Stein approach to establish the CLT in total variation. However, our approach differs from the other works mentioned above in that we make use of the so-called "sharp gradient", whose definition and main properties are recalled in Section 2.2. With this tool at hand and in view of using the Malliavin-Stein equation to characterize the proximity to the normal distribution, we shall see that the convergence in total variation in fact reduces to two rather simple steps: i) a two-dimensional version of the classical Breuer-Major CLT (i.e. in distribution, not in total variation), see Section 2.3; ii) some elementary uniform integrability estimates, allowing to pass from a convergence in probability to a convergence in $L^1$, see Section 2.4.

Proof of the main result
As mentioned just above, the proof of Theorem 3 takes place in the setting of Malliavin-Stein calculus. Note that for each fixed $n \geq 1$, the quantity of interest $S_n(f)$ involves only a finite number of Gaussian coefficients. So let us sketch the framework of the Malliavin-Stein method in the finite dimensional setting; we refer to [Nua09] or [NP12] for a more general introduction.

A glimpse of Malliavin calculus
Let us fix an integer $n \geq 1$ and let us place ourselves in the product probability space $(\mathbb R^n, \mathcal B(\mathbb R^n), \gamma_n)$ with $\gamma_n := \otimes_{k=1}^n \gamma$, the $n$-dimensional standard Gaussian distribution on $\mathbb R^n$. Consider the classical Ornstein-Uhlenbeck operator $L_n := \Delta - x \cdot \nabla$, which is symmetric with respect to $\gamma_n$. We then have the standard decomposition of the $L^2$-space in Wiener chaoses, namely

$$L^2(\gamma_n) = \bigoplus_{k=0}^{+\infty} \mathrm{Ker}(L_n + k I).$$

The square field or "carré du champ" operator $\Gamma_n$ is then defined as the bilinear operator

$$\Gamma_n[F, G] := \frac{1}{2} \left( L_n[FG] - F L_n[G] - G L_n[F] \right) = \nabla F \cdot \nabla G.$$

As a glimpse of the power of the Malliavin-Stein approach for establishing total variation estimates, recall that if $F \in \mathrm{Ker}(L_n + kI)$ is such that $\mathbb E[F^2] = 1$, then for some constant $C_k$ depending only on $k$, the total variation distance between the variable $F$ and a standard Gaussian can be upper bounded by

$$d_{TV}\left( F, \mathcal N(0,1) \right) \leq C_k \sqrt{\mathrm{Var}\left( \Gamma_n[F, F] \right)}.$$

Via the notion of isonormal Gaussian process, the finite dimensional framework for the Malliavin-Stein method sketched above can in fact be extended to the infinite dimensional setting, giving rise to an Ornstein-Uhlenbeck operator $L$ and an associated "carré du champ" $\Gamma$, see e.g. Chapter 2 in [NP12].
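The chaos decomposition and the carré du champ can be checked by hand in dimension one, where $\mathrm{Ker}(L_1 + kI)$ is spanned by the Hermite polynomial $H_k$. The sketch below is only an illustration under that one-dimensional assumption; the helper names and the choice $k = 4$ are ours. It verifies that $L H_k = -k H_k$ and that $\mathbb E[\Gamma[H_k, H_k]] = k\, \mathbb E[H_k^2]$, the mean around which $\Gamma[F,F]$ fluctuates for $F$ in the $k$-th chaos.

```python
import math
import numpy as np
from numpy.polynomial import HermiteE, Polynomial
from numpy.polynomial import hermite_e as He


def ou_operator(q):
    """One-dimensional Ornstein-Uhlenbeck operator: (L q)(x) = q''(x) - x q'(x)."""
    x = Polynomial([0.0, 1.0])
    return q.deriv(2) - x * q.deriv(1)


def hermite_poly(k):
    """Probabilists' Hermite polynomial H_k written as an ordinary power series."""
    return HermiteE.basis(k).convert(kind=Polynomial)


def gaussian_mean(q, quad_points=40):
    """E[q(N)] for a polynomial q and N standard Gaussian, by Gauss-Hermite quadrature."""
    x, w = He.hermegauss(quad_points)
    return float(np.sum(w * q(x)) / math.sqrt(2.0 * math.pi))


k = 4
h_k = hermite_poly(k)

# Eigenfunction property: L H_k + k H_k should vanish identically.
grid = np.linspace(-3.0, 3.0, 13)
eigen_defect = float(np.max(np.abs(ou_operator(h_k)(grid) + k * h_k(grid))))

# In dimension one, Gamma[F, F] = (F')^2.
gamma_kk = h_k.deriv(1) ** 2
mean_gamma = gaussian_mean(gamma_kk)
mean_square = gaussian_mean(h_k ** 2)

print("max |L H_k + k H_k| on the grid:", eigen_defect)
print("E[Gamma[H_k, H_k]] =", mean_gamma, " k * E[H_k^2] =", k * mean_square)
```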

The sharp gradient
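A detailed introduction to the sharp gradient can be found in Section 4.1 of the reference [AP20]. We only recall here the basics which will be useful for our purpose. Let us assume that $(N_k)_{k\geq 1}$ is an i.i.d. sequence of standard Gaussian variables on $(\Omega, \mathcal F, \mathbb P)$ which generate the first Wiener chaos. Without loss of generality, we shall assume that $\mathcal F = \sigma(N_k, k \geq 1)$. We will also need a copy $(\hat\Omega, \hat{\mathcal F}, \hat{\mathbb P})$ of this probability space as well as $(\hat N_k)_{k\geq 1}$ a corresponding i.i.d. sequence of standard Gaussian variables such that $\hat{\mathcal F} = \sigma(\hat N_k, k \geq 1)$. We will denote by $\hat{\mathbb E}$ the expectation with respect to the measure $\hat{\mathbb P}$. For any integer $m \geq 1$ and any function $\Phi$ in the space $C^1_b(\mathbb R^m, \mathbb R)$ of continuously differentiable functions with a bounded gradient, we then set

$${}^\sharp \Phi(N_1, \ldots, N_m) := \sum_{i=1}^{m} \partial_i \Phi(N_1, \ldots, N_m)\, \hat N_i.$$

In Sections 4.1.1 and 4.1.2 of [AP20], it is shown that this gradient is closable and extends to the Malliavin space $\mathbb D^{1,2}$, the completion of the above class of functionals with respect to the norm given by $\|F\|_{1,2}^2 := \mathbb E[F^2] + \mathbb E\big[\Gamma[F,F]\big]$. This space $\mathbb D^{1,2}$ is naturally the infinite dimensional version of the Malliavin-Sobolev space $\mathbb D^{1,2}(\mathbb R, \gamma)$ introduced in Section 1 in the one-dimensional setting. In particular, Proposition 8 in the latter reference shows that, for $F \in \mathbb D^{1,2}$ and conditionally on $\mathcal F$, the variable ${}^\sharp F$ is a centered Gaussian variable with variance $\Gamma[F,F]$, that is, for all $\xi \in \mathbb R$,

$$\hat{\mathbb E}\left[ e^{i \xi\, {}^\sharp F} \right] = e^{-\frac{\xi^2}{2} \Gamma[F, F]}.$$

Given $F \in \mathbb D^{1,2}$, taking first the expectation $\hat{\mathbb E}$ with respect to $\hat{\mathbb P}$ and using Fubini's theorem yields the following key relation, for all $\xi \in \mathbb R$:

$$\mathbb E\left[ e^{-\frac{\xi^2}{2} \Gamma[F, F]} \right] = \mathbb E \hat{\mathbb E}\left[ e^{i \xi\, {}^\sharp F} \right]. \tag{2}$$

In essence, via their Laplace/Fourier transforms, this key equation relates the asymptotic behavior in distribution (or in probability if the limit is constant) of the carré du champ $\Gamma[F, F]$ to that of the sharp gradient ${}^\sharp F$.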
Finally, let us remark that by definition, the image $({}^\sharp X_k)_{k\geq 1}$ of our initial stationary sequence $(X_k)_{k\geq 1}$ by the sharp gradient is an independent copy of $(X_k)_{k\geq 1}$. We will write $({}^\sharp X_k)_{k\geq 1} = (\hat X_k)_{k\geq 1}$ in the sequel.
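The key relation between the Laplace transform of the carré du champ and the characteristic function of the sharp gradient can be checked by Monte Carlo on a toy functional. The sketch below is an illustration under assumptions of ours: the functional $F = \Phi(N_1, N_2)$ with $\Phi(x_1, x_2) = \sin(x_1)\cos(x_2)$ (which has a bounded gradient), the sample size and the seed are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200_000

# (N_1, N_2) under P and the independent copy (N_1^hat, N_2^hat) under P^hat.
n1, n2 = rng.standard_normal((2, n_samples))
n1_hat, n2_hat = rng.standard_normal((2, n_samples))

# Partial derivatives of Phi(x1, x2) = sin(x1) * cos(x2).
d1 = np.cos(n1) * np.cos(n2)
d2 = -np.sin(n1) * np.sin(n2)

# Sharp gradient: sum_i d_i Phi(N) * N_i^hat; carre du champ: |grad Phi(N)|^2.
sharp_F = d1 * n1_hat + d2 * n2_hat
gamma_FF = d1**2 + d2**2


def both_sides(xi):
    """Monte Carlo estimates of E[exp(-xi^2 Gamma[F,F] / 2)] and of E E^hat[exp(i xi sharp F)]."""
    lhs = float(np.mean(np.exp(-0.5 * xi**2 * gamma_FF)))
    # The imaginary part vanishes by symmetry of N^hat, so only the cosine is kept.
    rhs = float(np.mean(np.cos(xi * sharp_F)))
    return lhs, rhs


for xi in (0.5, 1.0, 2.0):
    lhs, rhs = both_sides(xi)
    print(f"xi = {xi}: Laplace side = {lhs:.4f}, Fourier side = {rhs:.4f}")
```

The two columns agree up to Monte Carlo error, which reflects the fact that, conditionally on $(N_1, N_2)$, the sharp gradient is a centered Gaussian variable with variance $\Gamma[F, F]$.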

Convergence in probability via a two dimensional CLT
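Let us suppose that $f$ satisfies the assumptions of Theorem 3, namely $f \in \mathbb D^{1,2}(\mathbb R, \gamma)$ with Hermite rank $d \geq 1$, so that it can be decomposed as $f = \sum_{m=d}^{\infty} c_m H_m$ in $L^2(\mathbb R, \gamma)$. Let $L^{-1}$ denote the pseudo-inverse of the Ornstein-Uhlenbeck operator and consider the pre-image

$$g := -L^{-1}[f] = \sum_{m=d}^{\infty} \frac{c_m}{m} H_m.$$

To simplify the expressions in the sequel, we set

$$F_n := S_n(f) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} f(X_k), \qquad G_n := S_n(g) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} g(X_k) = -L^{-1} F_n.$$

Now, take $(s, t, \xi) \in \mathbb R^3$ and let us apply the above key relation (2) to the random variable $tF_n + sG_n$. We get

$$\mathbb E\left[ e^{-\frac{\xi^2}{2} \Gamma[tF_n + sG_n,\, tF_n + sG_n]} \right] = \mathbb E \hat{\mathbb E}\left[ e^{i \xi \left( t\, {}^\sharp F_n + s\, {}^\sharp G_n \right)} \right]. \tag{3}$$

On the one hand, by bilinearity of the carré du champ operator, we have

$$\Gamma[tF_n + sG_n,\, tF_n + sG_n] = t^2\, \Gamma[F_n, F_n] + s^2\, \Gamma[G_n, G_n] + 2ts\, \Gamma[F_n, G_n]. \tag{4}$$

On the other hand, the right hand side of Equation (3) is simply the characteristic function under $\mathbb P \otimes \hat{\mathbb P}$ of the couple $({}^\sharp F_n, {}^\sharp G_n)$, evaluated at $(\xi t, \xi s)$. By the chain rule, this couple

$$\left( {}^\sharp F_n, {}^\sharp G_n \right) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \left( f'(X_k) \hat X_k,\; g'(X_k) \hat X_k \right) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \Psi(\hat X_k, X_k)$$

is a "Breuer-Major type" sequence with respect to the $\mathbb R^2$-valued centered stationary Gaussian process $(\hat X_k, X_k)_{k\geq 1}$ and the $\mathbb R^2$-valued functional $\Psi(x, y) := (x f'(y), x g'(y))$. Since $f \in \mathbb D^{1,2}(\mathbb R, \gamma)$, both $f'$ and $g'$ belong to $L^2(\mathbb R, \gamma)$, and $(\hat X_k)_{k\geq 1}$ and $(X_k)_{k\geq 1}$ are independent. Therefore the functional $\Psi$ is in $L^2(\mathbb R^2, \gamma_2)$ and the multivariate counterpart of the classical Breuer-Major Theorem applies, see Theorem 4 of [Arc94].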
As a result, the bidimensional sequence $({}^\sharp F_n, {}^\sharp G_n)$ converges in distribution, under $\mathbb P \otimes \hat{\mathbb P}$, towards a bidimensional centered Gaussian vector with a symmetric positive semi-definite covariance matrix $\Sigma$. Therefore, from Equations (3) and (4) and via the characterization of convergence in distribution in terms of Fourier transforms, there exist real numbers $\lambda, \mu, \nu$ (depending on the limit covariance matrix $\Sigma$) such that for any $(s, t, \xi) \in \mathbb R^3$, as $n$ goes to infinity, we have

$$\mathbb E\left[ e^{-\frac{\xi^2}{2} \left( t^2\, \Gamma[F_n, F_n] + s^2\, \Gamma[G_n, G_n] + 2ts\, \Gamma[F_n, G_n] \right)} \right] \longrightarrow e^{-\frac{\xi^2}{2} \left( \lambda t^2 + \mu s^2 + 2\nu ts \right)}.$$

Since the above convergence is valid for any $\xi \in \mathbb R$, this shows in particular that for any fixed $(s, t) \in \mathbb R^2$, the sequence $\Gamma[tF_n + sG_n, tF_n + sG_n]$ converges in distribution (and thus in probability) towards the constant variable $\lambda t^2 + \mu s^2 + 2\nu ts$. Choosing $s = t = 1$, we thus get that $\Gamma[F_n + G_n, F_n + G_n]$ converges in probability towards $\lambda + \mu + 2\nu$. Choosing $s = 0$ and $t = 1$, then $t = 0$ and $s = 1$, one deduces in the same manner that $\Gamma[F_n, F_n]$ and $\Gamma[G_n, G_n]$ converge in probability towards $\lambda$ and $\mu$ respectively. Finally, by Equation (4), one can conclude that the cross term $\Gamma[F_n, G_n]$ also converges in probability towards the constant limit variable $\nu$.
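The concentration of the cross term $\Gamma[F_n, G_n]$ around a constant can be observed numerically. The sketch below is a simulation under assumptions that are ours and not the paper's: $f = H_2$ (so that $g = H_2/2$), a Gaussian AR(1) sequence with $\rho(k) = a^k$, $a = 0.5$, and arbitrary sample sizes and seed. It relies on the standard chain rule identity $\Gamma[F_n, G_n] = \frac{1}{n}\sum_{k,\ell} f'(X_k)\, g'(X_\ell)\, \rho(|k-\ell|)$, which follows from $\Gamma[X_k, X_\ell] = \rho(|k-\ell|)$ for elements of the first chaos.

```python
import numpy as np


def ar1_paths(rng, a, n, reps):
    """Simulate `reps` stationary Gaussian AR(1) paths of length n with rho(k) = a^k."""
    x = np.empty((reps, n))
    x[:, 0] = rng.standard_normal(reps)
    innovations = np.sqrt(1.0 - a**2) * rng.standard_normal((reps, n))
    for k in range(1, n):
        x[:, k] = a * x[:, k - 1] + innovations[:, k]
    return x


def gamma_Fn_Gn(x, a):
    """Gamma[F_n, G_n] for f = H_2 and g = H_2 / 2, one value per simulated path."""
    n = x.shape[1]
    idx = np.arange(n)
    corr = a ** np.abs(np.subtract.outer(idx, idx))   # matrix (rho(|k - l|))_{k,l}
    f_prime = 2.0 * x                                 # H_2'(x) = 2x
    g_prime = x                                       # (H_2 / 2)'(x) = x
    return np.sum((f_prime @ corr) * g_prime, axis=1) / n


a = 0.5
rng = np.random.default_rng(1)
sigma2 = 2.0 * (1.0 + a**2) / (1.0 - a**2)            # limit variance of S_n(H_2)

stats = {}
for n in (50, 400):
    gam = gamma_Fn_Gn(ar1_paths(rng, a, n, reps=2000), a)
    stats[n] = (float(gam.mean()), float(gam.std()))
    print(f"n = {n}: mean = {stats[n][0]:.3f}, std = {stats[n][1]:.3f}, sigma^2 = {sigma2:.3f}")
```

In this example the empirical standard deviation of $\Gamma[F_n, G_n]$ shrinks as $n$ grows while its mean stays close to $\sigma^2$, which is the expected value of the limit since $\mathbb E[\Gamma[F_n, G_n]] = \mathbb E[F_n^2]$.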

Gaining some uniform integrability
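Since our goal is to derive convergence in total variation of $F_n = S_n(f)$, the convergence in probability of the term $\Gamma[F_n, -L^{-1} F_n]$ is not sufficient. Indeed, with Stein's Equation in mind, the lack of uniform integrability prevents us from directly deducing the following required asymptotic behavior, for any $\varphi \in C^1_b(\mathbb R)$, as $n$ goes to infinity:

$$\mathbb E\left[ \varphi'(F_n) \left( \Gamma[F_n, -L^{-1} F_n] - \nu \right) \right] \longrightarrow 0.$$

In order to bypass this problem, let us go back to the two-dimensional classical Breuer-Major theorem associated with the functional $\Psi$ used in the last section. For any integer $p \geq 1$, let us denote by $\Psi_p$ the projection of $\Psi$ onto the first $p$ Wiener chaoses. Applying Theorem 4 and Equation (2.43) of [Arc94], we get that there exists a constant $C > 0$ (which depends only on the covariance structure of the underlying Gaussian process) such that

$$\sup_{n \geq 1} \mathbb E \hat{\mathbb E}\left[ \left| \frac{1}{\sqrt{n}} \sum_{k=1}^{n} (\Psi - \Psi_p)(\hat X_k, X_k) \right|^2 \right] \leq C\, \|\Psi - \Psi_p\|^2_{L^2(\mathbb R^2, \gamma_2)}.$$

Since $\Psi$ belongs to $L^2(\mathbb R^2, \gamma_2)$, the right hand side goes to zero as $p$ goes to infinity. As a result, uniformly in $n \geq 1$, the two-dimensional process

$$\left( {}^\sharp F_n, {}^\sharp G_n \right) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \Psi(\hat X_k, X_k)$$

can be approximated arbitrarily closely in $L^2(\mathbb P \otimes \hat{\mathbb P})$ by the following process, which is finitely expanded on the Wiener chaoses:

$$\left( Z^{p,1}_n, Z^{p,2}_n \right) := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \Psi_p(\hat X_k, X_k).$$

Therefore, by the Cauchy-Schwarz inequality, choosing $p \geq 1$ large enough, uniformly in $n \geq 1$, the product ${}^\sharp F_n \times {}^\sharp G_n$ can be approximated arbitrarily closely in $L^1(\mathbb P \otimes \hat{\mathbb P})$ by $\Delta^p_n := Z^{p,1}_n \times Z^{p,2}_n$. In other words, for any $\varepsilon > 0$ and $p \geq 1$ large enough, we have

$$\sup_{n \geq 1} \mathbb E \hat{\mathbb E}\left[ \left| {}^\sharp F_n \times {}^\sharp G_n - \Delta^p_n \right| \right] \leq \varepsilon.$$

But mimicking the proof detailed in Section 2.3 for the convergence in probability of $\Gamma[F_n, G_n]$ towards the constant variable $\nu$, one similarly gets here that $\hat{\mathbb E}[\Delta^p_n]$ converges in probability under $\mathbb P$ towards a constant random variable $\nu_p \in \mathbb R$, and by construction $\lim_{p \to +\infty} \nu_p = \nu$. The crucial point here is that both random variables $\Delta^p_n$ and $\hat{\mathbb E}[\Delta^p_n]$ are now finitely expanded on the Wiener chaoses under $\mathbb P \otimes \hat{\mathbb P}$ and $\mathbb P$ respectively. Therefore, by hypercontractivity, the convergence in probability can be freely upgraded to the convergence in $L^q$ for every $q \geq 1$. In particular, as $n$ goes to infinity, the sequence $\hat{\mathbb E}[\Delta^p_n]$ converges in $L^1$ to the constant variable $\nu_p$.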

Conclusion
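We go back to Stein's Equation. Let $\varphi \in C^1_b(\mathbb R)$ and $\varepsilon > 0$. Integrating by parts and using the identity $\Gamma[F_n, G_n] = \hat{\mathbb E}[{}^\sharp F_n \times {}^\sharp G_n]$, for $p \geq 1$ large enough and by the results of the last section, we have

$$\left| \mathbb E[F_n \varphi(F_n)] - \nu\, \mathbb E[\varphi'(F_n)] \right| = \left| \mathbb E\left[ \varphi'(F_n) \left( \Gamma[F_n, G_n] - \nu \right) \right] \right| \leq \|\varphi'\|_\infty \left( \varepsilon + \mathbb E\left[ \left| \hat{\mathbb E}[\Delta^p_n] - \nu_p \right| \right] + |\nu_p - \nu| \right).$$

As a result, letting first $n$ and then $p$ go to infinity, we get that, uniformly in $\varphi$ such that $\|\varphi'\|_\infty \leq 1$,

$$\lim_{n \to +\infty} \left| \mathbb E[F_n \varphi(F_n)] - \nu\, \mathbb E[\varphi'(F_n)] \right| = 0.$$

One can then classically conclude using Stein's approach for the convergence in total variation.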