Gaussian Approximations of Multiple Integrals

Fix an integer k, and let I(l), l=1,2,..., be a sequence of k-dimensional vectors of multiple Wiener-Itô integrals with respect to a general Gaussian process. We establish necessary and sufficient conditions ensuring that, as l diverges, the law of I(l) is asymptotically close (for example, in the sense of Prokhorov's distance) to the law of a k-dimensional Gaussian vector having the same covariance matrix as I(l). The main feature of our results is that they require minimal assumptions (basically, boundedness of variances) on the asymptotic behaviour of the variances and covariances of the elements of I(l). In particular, we will not assume that the covariance matrix of I(l) converges. This generalizes the results proved in Nualart and Peccati (2005), Peccati and Tudor (2005) and Nualart and Ortiz-Latorre (2007). As shown in Marinucci and Peccati (2007b), the criteria established in this paper are crucial in the study of the high-frequency behaviour of stationary fields defined on homogeneous spaces.


Introduction
Let U(l) = (U_1(l), ..., U_k(l)), l ≥ 1, be a sequence of centered random observations (not necessarily independent) with values in R^k. Suppose that the application l → E[U_i(l)^2] is bounded for every i, and also that the sequence of covariances c_l(i, j) = E[U_i(l) U_j(l)] does not converge as l → +∞ (that is, for some fixed i ≠ j, the limit lim_{l→∞} c_l(i, j) does not exist). Then, a natural question is the following: is it possible to establish criteria ensuring that, for large l, the law of U(l) is close (in the sense of some distance between probability measures) to the law of a Gaussian vector N(l) = (N_1(l), ..., N_k(l)) such that E[N_i(l) N_j(l)] = E[U_i(l) U_j(l)] = c_l(i, j)? Note that the question is not trivial, since the asymptotic irregularity of the covariance matrix c_l(·, ·) may in general prevent U(l) from converging in law toward a k-dimensional Gaussian distribution.
In this paper, we shall provide an exhaustive answer to the problem above in the special case where the sequence U(l) has the form U(l) = I(l) = (I_{d_1}(f_l^{(1)}), ..., I_{d_k}(f_l^{(k)})), where the integers d_1, ..., d_k ≥ 1 do not depend on l, I_{d_j} indicates a multiple stochastic integral of order d_j (with respect to some isonormal Gaussian process X over a Hilbert space H; see Section 2 below for definitions), and each f_l^{(j)} ∈ H^{⊙d_j}, j = 1, ..., k, is a symmetric kernel. In particular, we shall prove that, whenever the elements of the vectors I(l) have bounded variances (and without any further requirements on the covariance matrix of I(l)), the following three conditions are equivalent as l → +∞: (i) γ(L(I(l)), L(N(l))) → 0, where L(·) indicates the law of a given random vector, N(l) is a Gaussian vector having the same covariance matrix as I(l), and γ is some appropriate metric on the space of probability measures on R^k; (ii) for every j = 1, ..., k, E[I_{d_j}(f_l^{(j)})^4] − 3 (E[I_{d_j}(f_l^{(j)})^2])^2 → 0; (iii) for every j = 1, ..., k and every p = 1, ..., d_j − 1, the sequence of contractions (to be formally defined in Section 2) f_l^{(j)} ⊗_p f_l^{(j)} converges to zero in H^{⊗2(d_j−p)}. Some other conditions, involving for instance Malliavin operators, are derived in the subsequent sections. As discussed in Section 5, our results are motivated by the derivation of high-frequency Gaussian approximations of stationary fields defined on homogeneous spaces, a problem tackled in [9] and [10].
Note that the results of this paper are a generalization of the following theorem, which combines results proved in [13], [14] and [15].
The techniques we use to achieve our main results are, once again, the Dambis-Dubins-Schwarz (DDS) Theorem, combined with the Burkholder-Davis-Gundy inequalities and some results (taken from [4, Section 11.7]) concerning 'uniformities' over classes of probability measures.
The paper is organized as follows. In Section 2 we discuss some preliminary notions concerning Gaussian fields, multiple integrals and metrics on probabilities. Section 3 contains the statements of the main results of the paper. The proof of Theorem 1 (one of the crucial results of this note) is achieved in Section 4. Section 5 is devoted to applications.

Preliminaries
We present a brief review of the main notions and results that are needed in the subsequent sections. The reader is referred to [6] for any unexplained notion or result. In what follows, X = {X(h) : h ∈ H} denotes an isonormal Gaussian process over a real separable Hilbert space H, that is, X is a centered Gaussian family, defined on some probability space (Ω, F, P), such that E[X(h)X(h')] = ⟨h, h'⟩_H for every h, h' ∈ H. We denote by L²(X) the (Hilbert) space of the real-valued and square-integrable functionals of X.
Isometry, chaoses and multiple integrals. For every d ≥ 1 we will denote by I_d the isometry between H^{⊙d}, equipped with the norm √(d!) ‖·‖_{H^{⊗d}}, and the dth Wiener chaos of X. In the particular case where H = L²(A, A, µ), where (A, A) is a measurable space and µ is a σ-finite and non-atomic measure, one has that H^{⊙d} = L²_s(A^d, A^{⊗d}, µ^{⊗d}) is the space of symmetric and square-integrable functions on A^d and, for every f ∈ H^{⊙d}, I_d(f) is the multiple Wiener-Itô integral (of order d) of f with respect to X, as defined e.g. in [12, Ch. 1]. It is well known that a random variable of the type I_d(f), where d ≥ 2 and f ≠ 0, cannot be Gaussian. Moreover, every F ∈ L²(X) admits a unique Wiener chaotic decomposition F = E[F] + Σ_{d≥1} I_d(f_d), where f_d ∈ H^{⊙d}, and the convergence of the series is in L²(X).
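The non-Gaussianity of higher chaoses can be seen in the simplest possible setting. The sketch below is a hypothetical one-dimensional illustration (not from the paper): taking a single unit vector e ∈ H, so that X(e) ~ N(0,1), the integral I_2(e ⊗ e) coincides with the Hermite polynomial X(e)² − 1, whose excess fourth moment reveals that it is not Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: H = R with unit vector e, X(e) ~ N(0, 1), and
# I_2(e ⊗ e) = X(e)**2 - 1 (the second Hermite polynomial of X(e)).
x = rng.standard_normal(1_000_000)
i2 = x**2 - 1.0

print(np.mean(i2))                      # ≈ 0  (the integral is centered)
print(np.var(i2))                       # ≈ 2 = 2! * ||e ⊗ e||^2 (the isometry)
print(np.mean(i2**4) / np.var(i2)**2)   # ≈ 15, far from the Gaussian value 3
```

The standardized fourth moment is 15 rather than 3, consistent with the fact that a nonzero element of a chaos of order d ≥ 2 cannot be Gaussian.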
Malliavin derivatives. We will use Malliavin derivatives in Section 3, where we generalize some of the results proved in [13]. The class S of smooth random variables is defined as the collection of all functionals of the type

F = f(X(h_1), ..., X(h_m)),   (3)

where h_1, ..., h_m ∈ H and f is bounded and has bounded derivatives of all orders. The operator D, called the Malliavin derivative operator, is defined on S by the relation

DF = Σ_{i=1}^m ∂_i f(X(h_1), ..., X(h_m)) h_i,

where F has the form (3). Note that DF is an element of L²(Ω; H). As usual, we define the domain of D, denoted D^{1,2}, to be the closure of S with respect to the norm ‖F‖_{1,2} = [E(F²) + E(‖DF‖²_H)]^{1/2}. When F ∈ D^{1,2}, we may sometimes write DF = D[F], depending on the notational convenience. Note that any finite sum of multiple Wiener-Itô integrals is an element of D^{1,2}.
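The finite-dimensional mechanics of D can be sketched numerically. The example below is hypothetical (it takes H = R² with orthonormal basis e_1, e_2, and the second-chaos functional F = X(e_1)X(e_2)) and checks by Monte Carlo the standard identity E‖DF‖²_H = d·E[F²], valid for F in the dth chaos.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical finite-dimensional sketch: H = R^2, X(h) = <h, xi> with
# xi ~ N(0, I_2).  For F = X(e1) * X(e2) (an element of the 2nd chaos),
# the derivative is DF = X(e2) e1 + X(e1) e2.
xi = rng.standard_normal((1_000_000, 2))
F = xi[:, 0] * xi[:, 1]
DF_sq_norm = xi[:, 1]**2 + xi[:, 0]**2   # ||DF||_H^2, sample by sample

# For F = I_d(f) one has E||DF||_H^2 = d * E[F^2]; here d = 2, E[F^2] = 1.
print(np.mean(F**2))        # ≈ 1
print(np.mean(DF_sq_norm))  # ≈ 2
```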
Contractions. Let {e_i : i ≥ 1} be a complete orthonormal system of H. For any fixed f ∈ H^{⊙n}, g ∈ H^{⊙m} and p ∈ {0, ..., n ∧ m}, we define the pth contraction of f and g to be the element of H^{⊗(n+m−2p)} given by

f ⊗_p g = Σ_{i_1,...,i_p ≥ 1} ⟨f, e_{i_1} ⊗ ··· ⊗ e_{i_p}⟩_{H^{⊗p}} ⊗ ⟨g, e_{i_1} ⊗ ··· ⊗ e_{i_p}⟩_{H^{⊗p}}.

We stress that f ⊗_p g need not be an element of H^{⊙(n+m−2p)}. We denote by f ⊗̃_p g the symmetrization of f ⊗_p g. Note that f ⊗_0 g is just the tensor product f ⊗ g of f and g. If n = m, then f ⊗_n g = ⟨f, g⟩_{H^{⊗n}}.
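In a finite-dimensional toy model (not from the paper: H = R^5, so kernels in H^{⊗2} are 5×5 arrays), the contractions reduce to index contractions of arrays, and for symmetric kernels numpy's `tensordot` realizes ⊗_p directly. The kernels f and g below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical finite-dimensional model: H = R^5; a kernel in H^{(x)2} is a
# 5x5 array, and symmetric kernels are symmetric matrices.
a = rng.standard_normal((5, 5))
b = rng.standard_normal((5, 5))
f = (a + a.T) / 2          # f in H^{o2}
g = (b + b.T) / 2          # g in H^{o2}

f_c0_g = np.tensordot(f, g, axes=0)   # p = 0: plain tensor product, in H^{(x)4}
f_c1_g = np.tensordot(f, g, axes=1)   # p = 1: contract one index pair (= f @ g)
f_c2_g = np.tensordot(f, g, axes=2)   # p = 2 = n = m: the scalar <f, g>

print(np.allclose(f_c1_g, f @ g))          # the 1-contraction is a matrix product
print(np.isclose(f_c2_g, np.sum(f * g)))   # the full contraction is <f, g>
print(np.allclose(f_c1_g, f_c1_g.T))       # typically False: f (x)_1 g need
                                           # not be symmetric even if f, g are
```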
Metrics on probabilities. For k ≥ 1 we define P(R^k) to be the class of all probability measures on R^k. Given a metric γ(·, ·) on P(R^k), we say that γ metrizes the weak convergence on P(R^k) whenever the following double implication holds for every Q ∈ P(R^k) and every {Q_l : l ≥ 1} ⊂ P(R^k) (as l → +∞): γ(Q_l, Q) → 0 if, and only if, Q_l converges weakly to Q. Two examples of metrizing γ are the Prokhorov metric (usually denoted ρ) and the Fortet-Mourier metric (usually denoted β). Recall that

ρ(P, Q) = inf{ε > 0 : P(A) ≤ Q(A^ε) + ε for every Borel set A ⊂ R^k},

where A^ε = {x : ‖x − y‖ < ε for some y ∈ A}, and ‖·‖ is the Euclidean norm. Also,

β(P, Q) = sup{ |∫ f dP − ∫ f dQ| : ‖f‖_BL ≤ 1 },

where ‖·‖_BL = ‖·‖_L + ‖·‖_∞, and ‖·‖_L is the usual Lipschitz seminorm (see [4, p. 394] for further details). The fact that we focus on the Prokhorov and the Fortet-Mourier metrics is due to the fact, proved e.g. in [4, Ch. 11], that both ρ and β metrize the weak convergence on P(R^k).
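For intuition on what such a metric measures, the sketch below approximates the Lévy metric on P(R), a one-dimensional relative of the Prokhorov metric that also metrizes weak convergence. The grid-search implementation and all parameters are illustrative choices, not from the paper.

```python
import math

def norm_cdf(x, sigma=1.0):
    # CDF of N(0, sigma^2)
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def levy_distance(F, G, lo=-10.0, hi=10.0, n=2001):
    """Grid approximation of the Levy metric
        inf{eps > 0 : F(x - eps) - eps <= G(x) <= F(x + eps) + eps for all x},
    which metrizes weak convergence on P(R)."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    for k in range(1, 1001):
        eps = k / 1000.0
        if all(F(x - eps) - eps <= G(x) <= F(x + eps) + eps for x in xs):
            return eps
    return float('inf')

# Closer laws are closer in the metric: N(0,1) vs N(0, 1.5^2) and N(0, 1.05^2).
d1 = levy_distance(lambda x: norm_cdf(x, 1.0), lambda x: norm_cdf(x, 1.5))
d2 = levy_distance(lambda x: norm_cdf(x, 1.0), lambda x: norm_cdf(x, 1.05))
print(d1, d2)   # d2 is much smaller than d1
```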

Main results
Fix integers k ≥ 1 and d_1, ..., d_k ≥ 1, and consider a sequence of k-dimensional random vectors of the type

I(l) = (I_{d_1}(f_l^{(1)}), ..., I_{d_k}(f_l^{(k)})),  l ≥ 1,

where, for each l ≥ 1 and every j = 1, ..., k, f_l^{(j)} is an element of H^{⊙d_j}. We will suppose the following: the sequence of variances

E[I_{d_j}(f_l^{(j)})²] = d_j! ‖f_l^{(j)}‖²_{H^{⊗d_j}},  l ≥ 1,   (8)

is bounded, and there exists a constant η > 0 such that ‖f_l^{(j)}‖_{H^{⊗d_j}} ≥ η, for every j = 1, ..., k and every l ≥ 1.
We denote by N(l) = (N_1(l), ..., N_k(l)), l ≥ 1, a centered k-dimensional Gaussian vector with the same covariance matrix as I(l), that is, such that

E[N_i(l) N_j(l)] = E[I_{d_i}(f_l^{(i)}) I_{d_j}(f_l^{(j)})]

for every 1 ≤ i, j ≤ k. For every λ = (λ_1, ..., λ_k) ∈ R^k, we also use the compact notation ⟨λ, I(l)⟩_k = Σ_{j=1}^k λ_j I_{d_j}(f_l^{(j)}) and ⟨λ, N(l)⟩_k = Σ_{j=1}^k λ_j N_j(l). The next result is one of the main contributions of this paper. Its proof is deferred to Section 4.
Theorem 1 Let the above notation and assumptions prevail, and suppose that, for every j = 1, ..., k, the following asymptotic condition holds: for every p = 1, ..., d_j − 1,

‖f_l^{(j)} ⊗_p f_l^{(j)}‖_{H^{⊗2(d_j−p)}} → 0.   (10)

Then, as l → +∞ and for every compact set M ⊂ R^k,

sup_{λ∈M} |E exp(i⟨λ, I(l)⟩_k) − E exp(i⟨λ, N(l)⟩_k)| → 0.   (11)

We now state two crucial consequences of Theorem 1. The first one (Proposition 2) provides a formal meaning to the intuitive fact that, since (11) holds and since the variances of I(l) do not explode, the laws of I(l) and N(l) are "asymptotically close". The second one (Theorem 3) combines Theorem 1 and Proposition 2 to obtain an exhaustive generalization "without covariance conditions" of Theorem 0 (see the Introduction). Note that Malliavin operators also appear in the statement of Theorem 3, so that our results are a genuine extension of the main findings by Nualart and Ortiz-Latorre in [13].

Proposition 2 Let the assumptions of Theorem 1 prevail (in particular, (10) holds), and denote by L(I(l)) and L(N(l)), respectively, the law of I(l) and N(l), l ≥ 1. Then, the two collections {L(N(l)) : l ≥ 1} and {L(I(l)) : l ≥ 1} are tight. Moreover, if γ(·, ·) metrizes the weak convergence on P(R^k), then

lim_{l→+∞} γ(L(I(l)), L(N(l))) = 0.   (12)
Proof. The fact that {L(N(l)) : l ≥ 1} and {L(I(l)) : l ≥ 1} are tight is a consequence of the boundedness of the sequence (8) and of the relation E[N_j(l)²] = E[I_{d_j}(f_l^{(j)})²]. The rest of the proof is standard, and is provided for the sake of completeness. We shall prove (12) by contradiction. Suppose there exist ε > 0 and a subsequence {l_n} such that γ(L(I(l_n)), L(N(l_n))) > ε for every n. Tightness implies that {l_n} must contain a further subsequence {l_{n'}} such that L(I(l_{n'})) and L(N(l_{n'})) are both weakly convergent. Since (11) holds, we deduce that L(I(l_{n'})) and L(N(l_{n'})) must necessarily converge to the same weak limit, say Q. The fact that γ metrizes the weak convergence finally implies that

γ(L(I(l_{n'})), L(N(l_{n'}))) ≤ γ(L(I(l_{n'})), Q) + γ(Q, L(N(l_{n'}))) → 0,   (13)

thus contradicting the former assumptions on {l_n} (note that the inequality in (13) is just the triangle inequality). This shows that (12) must necessarily take place.
Remarks. (i) A result analogous to the arguments used in the proof of Proposition 2 is stated in [4, Exercise 3, p. 419]. Note also that, without tightness, a condition such as (11) does not allow one to deduce the asymptotic relation (12). See for instance [4, Proposition 11.7.6] for a counterexample involving the Prokhorov metric on P(R).
Proof. The implication 1. ⇒ 2. is a consequence of Theorem 1 and Proposition 2. Now suppose that (15) holds, and observe that the sequence {A*_l : l ≥ 1} is uniformly integrable. To see why {A*_l} is uniformly integrable, one can use the fact that, since each I_l^{*,(j)} has the same law as an element of the d_jth chaos of X and each N_l^{*,(j)} is Gaussian, then (see e.g. [6, Ch. VI]) for every p ≥ 2 there exists a universal positive constant C_{p,j} (independent of l) such that the pth absolute moments of I_l^{*,(j)} and N_l^{*,(j)} are bounded by C_{p,j}. The equivalence 1. ⇔ 4. is an immediate consequence of the previous discussion.
To conclude the proof, we shall now show the double implication 1. ⇔ 5. To do this, we first observe that, by performing the same calculations as in [13, Proof of Lemma 2] (which are based on an application of the multiplication formulae for multiple integrals, see [12, Proposition 1.1.3]), one obtains an expression of the quantity appearing in condition 5. in terms of the contractions f_l^{(j)} ⊗_p f_l^{(j)}, p = 1, ..., d_j − 1; combined with condition 1., the last relation implies immediately that 1. ⇒ 5. To prove the opposite implication, first observe that, due to the boundedness of (8) and the Cauchy-Schwarz inequality, there exists a finite constant M (independent of j and l) such that ‖f_l^{(j)} ⊗_p f_l^{(j)}‖_{H^{⊗2(d_j−p)}} ≤ M. This implies that, for every sequence {l_n}, there exists a subsequence {l_{n'}} such that the sequences

‖f_{l_{n'}}^{(j)} ⊗_p f_{l_{n'}}^{(j)}‖_{H^{⊗2(d_j−p)}},  n' ≥ 1,   (17)

are convergent for every j = 1, ..., k and every p = 1, ..., d_j − 1 (recall that, by assumption, there exists a constant η > 0 such that ‖f_{l_{n'}}^{(j)}‖_{H^{⊗d_j}} ≥ η for every j and l). We shall now prove that, whenever (17) is verified, the limits are necessarily zero. Indeed, Theorem 4 in [13] implies that, if (17) takes place and 5. holds, then

I_{d_j}(f_{l_{n'}}^{(j)}) converges in law to N(0, c),   (19)

where N(0, c) stands for a centered Gaussian random variable with variance c. But Theorem 1 in [14] implies that, if (19) is verified, then ‖f_{l_{n'}}^{(j)} ⊗_p f_{l_{n'}}^{(j)}‖ → 0, thus proving our claim. This shows that 5. ⇒ 1.

The next result says that, under the additional assumption that the variances of the elements of I(l) converge to one, the asymptotic approximation (15) is equivalent to the fact that each component of I(l) verifies a CLT. The proof is elementary, and therefore omitted.
Corollary 4 Fix k ≥ 2, and suppose that the sequence I(l), l ≥ 1, is such that, for every j = 1, ..., k, the sequence of variances appearing in (8) converges to 1 as l → +∞. Then, each of the conditions appearing in the statement of Theorem 3 is equivalent to the following: for every j = 1, ..., k, as l → +∞,

I_{d_j}(f_l^{(j)}) converges in law to N(0, 1),

where N(0, 1) is a centered Gaussian random variable with unit variance.
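Corollary 4 can be probed numerically on a classical second-chaos example (not from the paper): the normalized sum F_n = (2n)^{−1/2} Σ_{i≤n} (ξ_i² − 1), with ξ_i i.i.d. standard normal, has the law of a normalized chi-square and satisfies E[F_n⁴] = 3 + 12/n. The fourth moment thus approaches the Gaussian value 3 exactly as the CLT takes hold.

```python
import numpy as np

rng = np.random.default_rng(4)

def fourth_moment(n, n_samples=200_000):
    # F_n = (chi2_n - n) / sqrt(2n) has the law of the second-chaos sum
    # (1/sqrt(2n)) * sum_{i<=n} (xi_i**2 - 1); exactly, E[F_n**4] = 3 + 12/n.
    f = (rng.chisquare(df=n, size=n_samples) - n) / np.sqrt(2.0 * n)
    return np.mean(f**4)

moments = {n: fourth_moment(n) for n in (5, 50, 500)}
print(moments)   # decreases toward 3, the fourth moment of N(0, 1)
```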
Remark. The results of this section can be suitably extended to deal with the Gaussian approximations of random vectors of the type (F_l^{(1)}(X), ..., F_l^{(k)}(X)), where each F_l^{(j)}(X), j = 1, ..., k, is a general square-integrable functional of the isonormal process X, not necessarily having the form of a multiple integral. See [10, Th. 6] for a statement containing an extension of this type.
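The Malliavin-operator condition alluded to above can also be observed by simulation. In [13], the relevant requirement for a dth-chaos sequence F_l is that ‖DF_l‖²_H converges in L² to a constant; the sketch below checks this for the same hypothetical second-chaos sequence F_n = (2n)^{−1/2} Σ_{i≤n} (ξ_i² − 1) used earlier, for which ‖DF_n‖²_H = (2/n) Σ_i ξ_i².

```python
import numpy as np

rng = np.random.default_rng(3)

def malliavin_norm_sq_samples(n, n_samples=5_000):
    # For F_n = (1/sqrt(2n)) * sum_i (xi_i**2 - 1) one has
    # DF_n = sqrt(2/n) * sum_i xi_i e_i, hence ||DF_n||_H^2 = (2/n) * sum_i xi_i**2.
    xi = rng.standard_normal((n_samples, n))
    return (2.0 / n) * np.sum(xi**2, axis=1)

for n in (10, 100, 1000):
    s = malliavin_norm_sq_samples(n)
    print(n, np.mean(s), np.var(s))
# The mean stays ≈ 2 = d * E[F_n^2] while the variance shrinks like 8/n,
# i.e. ||DF_n||_H^2 → 2 in L^2, in accordance with the Gaussian limit of F_n.
```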

Proof of Theorem 1
We provide the proof in the case where H = L²([0, 1], dx),  (21)  where dx stands for the Lebesgue measure. The extension to a general H is obtained by using the same arguments outlined in [14, Section 2.2]. If H is as in (21), then for every d ≥ 2 one has that H^{⊙d} = L²_s([0, 1]^d), the class of symmetric, real-valued and square-integrable functions (with respect to the Lebesgue measure) on [0, 1]^d. Also, the isonormal process X coincides with the Gaussian space generated by the standard Brownian motion W_t = X(1_{[0,t]}), t ∈ [0, 1]. This implies in particular that, for every d ≥ 2, the Wiener-Itô integral I_d(f), f ∈ L²_s([0, 1]^d), can be rewritten in terms of an iterated stochastic integral with respect to W, that is:

I_d(f) = d! ∫_0^1 ∫_0^{t_1} ··· ∫_0^{t_{d−1}} f(t_1, ..., t_d) dW_{t_d} ··· dW_{t_1}.   (22)

Note that the RHS of (22) is just an iterated adapted stochastic integral of the Itô type (see again [12, Ch. 1]). Finally, for every f ∈ L²_s([0, 1]^d), every g ∈ L²_s([0, 1]^{d'}) and every p = 0, ..., d ∧ d', we observe that the contraction f ⊗_p g is the (not necessarily symmetric) element of L²([0, 1]^{d+d'−2p}) given by:

f ⊗_p g (y_1, ..., y_{d+d'−2p}) = ∫_{[0,1]^p} f(y_1, ..., y_{d−p}, a_1, ..., a_p) × g(y_{d−p+1}, ..., y_{d+d'−2p}, a_1, ..., a_p) da_1 ··· da_p.
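A discretized sketch of the representation (22), assuming the constant kernel f ≡ 1 on [0,1]² (not from the paper): in that case I_2(f) = W_1² − 1 by Itô's formula, while the RHS of (22) reads 2! ∫_0^1 ∫_0^{t_1} dW_{t_2} dW_{t_1} = 2 ∫_0^1 W_t dW_t.

```python
import numpy as np

rng = np.random.default_rng(5)

# Left-point (Ito) Riemann sums for 2 * ∫_0^1 W_t dW_t on a grid t_i = i/n;
# the result should match I_2(1) = W_1**2 - 1 up to discretization error.
n = 100_000
dw = rng.standard_normal(n) / np.sqrt(n)      # Brownian increments
w = np.concatenate(([0.0], np.cumsum(dw)))    # W sampled on the grid
iterated = 2.0 * np.sum(w[:-1] * dw)          # the iterated (adapted) integral
print(iterated, w[-1]**2 - 1.0)               # the two values nearly coincide
```

The pathwise discrepancy equals 1 − Σ(ΔW)², which vanishes as the mesh goes to zero (the quadratic variation of W on [0,1] is 1).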
In the framework of (21), the proof of Theorem 1 relies on some computations contained in [15], as well as on an appropriate use of the Burkholder-Davis-Gundy inequalities (see for instance [16, Ch. IV §4]). Fix λ = (λ_1, ..., λ_k) ∈ R^k, and consider the random variable ⟨λ, I(l)⟩_k = Σ_{j=1}^k λ_j I_{d_j}(f_l^{(j)}), where, for every d ≥ 1, every t ∈ [0, 1] and every f ∈ L²_s([0, 1]^d), we write J_t^d(f) for the iterated integral of f with respect to W up to time t (we also use the conventional notation J_t^0(c) = c). We start by recalling some preliminary results involving Brownian martingales. Start by setting, for u ∈ [0, 1],

φ_{λ,l}(u) = Σ_{j=1}^k λ_j d_j! J_u^{d_j−1}(f_l^{(j)}(u, ·) 1_{[0,u]}^{d_j−1}),

and observe that the random application t ↦ ∫_0^t φ_{λ,l}(u) dW_u defines a (continuous) square-integrable martingale started from zero, with respect to the canonical filtration of W, denoted {F_t^W : t ∈ [0, 1]}, whose value at t = 1 is ⟨λ, I(l)⟩_k. The quadratic variation of this martingale is classically given by t ↦ ∫_0^t φ_{λ,l}(u)² du, and a standard application of the Dambis, Dubins and Schwarz Theorem (see [16, Ch. V §1]) yields that, for every l ≥ 1, there exists a standard Brownian motion (initialized at zero) W^{(λ,l)} such that ⟨λ, I(l)⟩_k = W^{(λ,l)}_{T(λ,l)}, where T(λ, l) = ∫_0^1 φ_{λ,l}(u)² du. Note that, in general, the definition of W^{(λ,l)} strongly depends on λ and l, and that W^{(λ,l)} is not an F_t^W-Brownian motion. However, the following relation links the two Brownian motions W^{(λ,l)} and W: there exists a (continuous) filtration {G_t^{(λ,l)}} such that W^{(λ,l)} is a G_t^{(λ,l)}-Brownian motion and T(λ, l) is a G_t^{(λ,l)}-stopping time. Moreover, setting q(λ, l) = E[T(λ, l)],

E|W^{(λ,l)}_{T(λ,l)} − W^{(λ,l)}_{q(λ,l)}| ≤ C (E|T(λ, l) − q(λ, l)|)^{1/2},   (24)

where C is some universal constant independent of λ and l. To see how to obtain the inequality (24), recall that T(λ, l) is a G_t^{(λ,l)}-stopping time, and apply the Burkholder-Davis-Gundy inequalities. In particular, relation (24) yields that the proof of Theorem 1 is concluded, once the following two facts are proved: (A) W^{(λ,l)}_{q(λ,l)} has the same law as ⟨λ, N(l)⟩_k, for every λ ∈ R^k and every l ≥ 1; (B) the sequence E|T(λ, l) − q(λ, l)|, l ≥ 1, converges to zero, uniformly in λ, on every compact set of the type M = [−T, T]^k, where T ∈ (0, +∞).
The proof of (A) is immediate: indeed, W^{(λ,l)} is a standard Brownian motion and, by using the isometric properties of stochastic integrals and the fact that the covariance structures of N(l) and I(l) coincide, q(λ, l) = E[T(λ, l)] = E[⟨λ, I(l)⟩_k²] = E[⟨λ, N(l)⟩_k²], so that W^{(λ,l)}_{q(λ,l)} is a centered Gaussian random variable with the same variance as ⟨λ, N(l)⟩_k. To prove (B), use a standard version of the multiplication formula between multiple stochastic integrals (see for instance [12, Proposition 1.5.1]):

I_{d_i}(f_l^{(i)}) I_{d_j}(f_l^{(j)}) = Σ_{p=0}^{D(i,j)} p! (d_i choose p)(d_j choose p) I_{d_i+d_j−2p}(f_l^{(i)} ⊗̃_p f_l^{(j)}),   (25)

where the index D(i, j) is defined as D(i, j) = d_i ∧ d_j. Formula (25) implies that, for every λ ∈ [−T, T]^k (T > 0), the quantity E|T(λ, l) − q(λ, l)| is bounded by a finite linear combination, with coefficients depending only on T, of terms indexed by i, j and p   (26)   (note that the RHS of (26) does not depend on λ). Finally, a direct application of the calculations contained in [15, pp. 253-255] yields that, for every i, j = 1, ..., k and every p = 0, ..., D(i, j), each of these terms converges to zero   (27)   as l → +∞. This concludes the proof of Theorem 1.
Remark. By inspection of the calculations contained in [15, pp. 253-255], it is easily seen that, in order to deduce (27) from (10), the boundedness of the sequence of variances (8) is needed.

Concluding remarks on applications
Theorem 1 and Theorem 3 are used in [10] to deduce high-frequency asymptotic results for subordinated spherical random fields. This study is strongly motivated by the probabilistic modelling and statistical analysis of the Cosmic Microwave Background radiation (see [7], [8], [9] and [10] for a detailed discussion of these applications). In what follows, we provide a brief presentation of some of the results obtained in [10].
Let S² = {x ∈ R³ : ‖x‖ = 1} be the unit sphere, and let T = {T(x) : x ∈ S²} be a real-valued (centered) Gaussian field which is also isotropic, in the sense that T(x) and T(Rx) have the same law (in the sense of stochastic processes) for every rotation R ∈ SO(3). The following facts are well known: (1) the trajectories of T admit the harmonic expansion T(x) = Σ_{l=0}^∞ Σ_{m=−l}^{l} a_{lm} Y_{lm}(x), where {Y_{lm} : l ≥ 0, m = −l, ..., l} is the class of spherical harmonics (defined e.g. in [17, Ch. 5]); (2) the complex-valued array of harmonic coefficients {a_{lm} : l ≥ 0, m = −l, ..., l} is composed of centered Gaussian random variables such that the variances E|a_{lm}|² = C_l depend exclusively on l (see for instance [1]); (3) the law of T is completely determined by the power spectrum {C_l : l ≥ 0} defined at the previous point.
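The rotational invariance behind fact (2) can be probed numerically through the addition theorem Σ_{m=−l}^{l} |Y_{lm}(x)|² = (2l+1)/(4π), whose independence of x reflects isotropy. The sketch below hard-codes the degree-one harmonics in closed form (the normalization convention is an assumption; θ is the polar and φ the azimuthal angle).

```python
import numpy as np

# Degree-1 spherical harmonics in closed form (unit-norm convention):
#   Y_{1,0}  = sqrt(3/(4*pi)) * cos(theta)
#   Y_{1,±1} = ∓ sqrt(3/(8*pi)) * sin(theta) * exp(±i*phi)
def Y1(m, theta, phi):
    if m == 0:
        return complex(np.sqrt(3.0 / (4.0 * np.pi)) * np.cos(theta))
    val = np.sqrt(3.0 / (8.0 * np.pi)) * np.sin(theta) * np.exp(1j * m * phi)
    return -val if m == 1 else val

# Addition theorem at an arbitrary point: the sum is (2*1+1)/(4*pi),
# independently of (theta, phi).
theta, phi = 0.7, 2.1
total = sum(abs(Y1(m, theta, phi))**2 for m in (-1, 0, 1))
print(total, 3.0 / (4.0 * np.pi))
```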
Proof. Since T_l^{(q)}(x) is a linear functional involving uniquely Hermite polynomials of order q (evaluated at the Gaussian field T), one deduces that there exists a real Hilbert space H such that, in the sense of stochastic processes,

T_l^{(q)}(x) = I_q(f^{(q,l,x)}),

where the class of symmetric kernels {f^{(q,l,x)} : l ≥ 0, x ∈ S²} is a subset of H^{⊙q}, and I_q(f^{(q,l,x)}) stands for the qth Wiener-Itô integral of f^{(q,l,x)} with respect to an isonormal Gaussian process over H, as defined in Section 2. Since the variances of the components of the vector (T_l^{(q)}(x_1), ..., T_l^{(q)}(x_k)) are all equal to 1 by construction, we can apply Theorem 3 and Proposition 2. Indeed, by Theorem 3 we know that (29) implies that, for every p = 1, ..., q − 1 and every j = 1, ..., k, f^{(q,l,x_j)} ⊗_p f^{(q,l,x_j)} → 0 in H^{⊗2(q−p)}.
The derivation of sufficient conditions to have (29) is the main object of [10]. In particular, it is proved there that sufficient (and sometimes also necessary) conditions for (29) can be neatly expressed in terms of the so-called Clebsch-Gordan coefficients (see again [17]), which are the entries of unitary matrices connecting reducible representations of SO(3).