Fine Gaussian fluctuations on the Poisson space, I: contractions, cumulants and geometric random graphs

We study the normal approximation of functionals of Poisson measures having the form of a finite sum of multiple integrals. When the integrands are nonnegative, our results yield necessary and sufficient conditions for central limit theorems. These conditions can always be expressed in terms of contraction operators or, equivalently, fourth cumulants. Our findings are specifically tailored to deal with the normal approximation of the geometric $U$-statistics introduced by Reitzner and Schulte (2011). In particular, we shall provide a new analytic characterization of geometric random graphs whose edge-counting statistics exhibit asymptotic Gaussian fluctuations, and describe a new form of Poisson convergence for stationary random graphs with sparse connections. In a companion paper, the above analysis is extended to general $U$-statistics of marked point processes with possibly rescaled kernels.


Introduction
This paper concerns the normal approximation of random variables living inside a fixed sum of Wiener chaoses associated with a Poisson measure over a Borel measure space. Our main theoretical tools come from the two papers [26,27], respectively by Peccati et al. and Peccati and Zheng, where the normal approximation of functionals of Poisson measures is studied by combining two probabilistic techniques, namely Stein's method and the Malliavin calculus of variations.
We shall focus on conditions implying that a given sequence of random variables satisfies a central limit theorem (CLT), where the convergence in distribution takes place in the sense of the Wasserstein distance (see Section 1.1 for definitions). Our main concern is to provide analytic conditions for asymptotic normality, that is, conditions only involving expressions related to the kernels in the chaotic expansion of a given random variable. In particular, our approach does not involve computations based on the method of moments and cumulants (with the exception of Theorem 4.14, where we deal with Poisson approximations).
The main contributions of our paper are the following:
- In Theorem 3.5, we shall prove that conditions for asymptotic normality can be expressed in terms of norms of contraction operators (see Section 2.2). These analytic objects already appear in CLTs living inside a fixed Wiener chaos (see [26,27]), and are a crucial tool in order to effectively assess bounds based on Malliavin operators. One further important point is that the use of contraction operators allows one to neatly distinguish the contribution of each chaotic projection to the CLT, as well as to deduce joint CLTs for these projections starting from the asymptotic normality of their sum (see Proposition 3.14).
- In Theorem 3.12 we shall prove that, when specialized to random variables such that each kernel in the Wiener-Itô representation has a constant sign, our results yield necessary and sufficient conditions for asymptotic normality. The main tools in order to show such a result are two new analytic bounds, stated in Proposition 3.8 and Proposition 3.9. These findings extend to the Poisson framework the 'fourth moment theorem' proved by Nualart and Peccati (in a Gaussian setting) in [21], a result that has been the starting point of a new line of research in stochastic analysis (see the book [19], as well as the constantly updated webpage http://www.iecn.u-nancy.fr/nourdin/steinmalliavin.htm).
- As discussed below, random variables having Wiener-Itô kernels with constant sign appear quite naturally in problems arising in stochastic geometry. In particular, we shall use our results in order to provide an exhaustive characterization of stationary geometric random graphs whose edge counting statistics exhibit asymptotic Gaussian fluctuations (see Theorem 4.11). This family of geometric graphs contains e.g. interval graphs and disk graphs (see e.g. [6,7,8,15,17,29]). Our characterization of geometric random graphs involves 'diagonal subsets' of Cartesian products, that are reminiscent of the combinatorial conditions for CLTs used by Blei and Janson in [4], in the framework of CLTs for finite Rademacher sums (see also [20, Section 6]). As a by-product of our analysis (see Theorem 4.14), we shall illustrate a new form of Poisson convergence for random graphs with sparse connections.
arXiv:1111.7312v3 [math.PR] 23 Jun 2012
We stress that one of our main motivations comes from a remarkable paper by Reitzner and Schulte [31], laying the foundations of a general theory for CLTs involving $U$-statistics based on Poisson point processes. In particular, one of the crucial insights of [31] concerns the use of a formula by Last and Penrose (see [12]), providing explicit expressions for Wiener-Itô chaotic decompositions in terms of difference operators (see Theorem 2.9). It is interesting to note that Last and Penrose's formula is the Poisson analogue of the so-called 'Stroock formula' of Malliavin calculus, which is in turn an important tool for proving CLTs involving non-linear functionals of Gaussian measures (see e.g. [19, Corollary 2.7.8] for a discussion of this point). We shall see that our findings complement and extend the results proved in [31] in several directions. See also Decreusefond et al. [7], Ferraz and Vergne [8], Last et al. [13], Minh [16], Schulte [33], Schulte and Thaele [34,35], for several new findings pertaining to this line of research.
In order to keep the length of this paper within bounds, in Section 4 we will present applications that are related to a very specific setting, namely edge-counting in random geometric graphs with possibly large connections. The power and flexibility of the results proved in the present work are further illustrated in the companion paper [11], where the following applications are developed in full detail: (i) analytic bounds for the normal approximation of $U$-statistics based on marked point processes, in particular $U$-statistics with rescaled kernels; (ii) bounds for general subgraph counting in the disk graph model under any regime; (iii) an exhaustive characterization of the asymptotic behavior of geometric $U$-statistics; (iv) applications to the Boolean model, and to subgraph counting in disk graph models with random radius.
The rest of this section is devoted to the formal presentation of the main problems that are addressed in this paper.

Poisson measures
Throughout the paper, $(Z, \mathcal{Z}, \mu)$ is a measure space such that $Z$ is a Borel space, $\mathcal{Z}$ is the associated Borel $\sigma$-field, and $\mu$ is a $\sigma$-finite Borel measure with no atoms. We write $\mathcal{Z}_\mu = \{B \in \mathcal{Z} : \mu(B) < \infty\}$ to denote the subclass of $\mathcal{Z}$ composed of sets with finite measure. Also, we shall write $\eta = \{\eta(B) : B \in \mathcal{Z}_\mu\}$ to indicate a Poisson measure on $(Z, \mathcal{Z})$ with control $\mu$. In other words, $\eta$ is a collection of random variables defined on some probability space $(\Omega, \mathcal{F}, P)$, indexed by the elements of $\mathcal{Z}_\mu$ and such that: (i) for every $B, C \in \mathcal{Z}_\mu$ such that $B \cap C = \emptyset$, the random variables $\eta(B)$ and $\eta(C)$ are independent; (ii) for every $B \in \mathcal{Z}_\mu$, $\eta(B)$ has a Poisson distribution with mean $\mu(B)$. We shall also write $\hat{\eta}(B) = \eta(B) - \mu(B)$, $B \in \mathcal{Z}_\mu$, and $\hat{\eta} = \{\hat{\eta}(B) : B \in \mathcal{Z}_\mu\}$. A random measure verifying property (i) is usually called "completely random" or "independently scattered" (see e.g. [24] for a general introduction to these concepts).
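The defining properties (i)-(ii) can be explored numerically. The following is a minimal sketch, not taken from the paper: it assumes that $Z$ is the unit square and that $\mu$ is a constant multiple of the Lebesgue measure, and the helper names `sample_poisson`, `sample_poisson_points` and `eta` are hypothetical.

```python
import random

def sample_poisson(mean, rng):
    """Poisson(mean) variate: number of unit-rate exponential arrivals before time `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_poisson_points(intensity, rng):
    """Points charged by a Poisson measure on [0,1]^2 with control intensity * Lebesgue (assumed setting)."""
    n = sample_poisson(intensity, rng)
    return [(rng.random(), rng.random()) for _ in range(n)]

def eta(points, box):
    """eta(B) for a rectangle B = (x0, x1, y0, y1): number of charged points falling in B."""
    x0, x1, y0, y1 = box
    return sum(1 for (x, y) in points if x0 <= x < x1 and y0 <= y < y1)

rng = random.Random(0)
left, right = (0.0, 0.5, 0.0, 1.0), (0.5, 1.0, 0.0, 1.0)   # two disjoint sets B, C
samples = [sample_poisson_points(20.0, rng) for _ in range(2000)]
counts_left = [eta(p, left) for p in samples]
mean_left = sum(counts_left) / len(counts_left)            # empirical mean of eta(B); mu(B) = 10
compensated = [c - 10.0 for c in counts_left]              # realizations of eta(B) - mu(B)
```

Here `mean_left` should be close to $\mu(B) = 10$, and the entries of `compensated` are realizations of the centered quantity $\eta(B) - \mu(B)$.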
Remark 1.1 As is customary, by a slight abuse of notation, we shall often write $x \in \eta$ in order to indicate that the point $x \in Z$ is charged by the random measure $\eta(\cdot)$.
In this paper, we shall focus on sequences of random variables $\{F_n : n \geq 1\}$ having a finite Wiener-Itô chaotic decomposition, that is, such that
$$F_n = \sum_{i=1}^{k} I_{q_i}\big(f_i^{(n)}\big), \qquad n \geq 1, \qquad (1.1)$$
where the symbol $I_{q_i}$ indicates a multiple Wiener-Itô integral of order $q_i$ with respect to $\hat{\eta}$, the integer $k$ does not depend on $n$, and each $f_i^{(n)}$ is a non-zero symmetric kernel from $Z^{q_i}$ to $\mathbb{R}$ (see Section 2.1 below for details). We will be specifically concerned with the forthcoming Problem 1.2. Recall that, given random variables $U, Y \in L^1(P)$, the Wasserstein distance between the law of $U$ and the law of $Y$ is defined as the quantity
$$d_W(U, Y) = \sup_{h \in \mathrm{Lip}(1)} \big| E[h(U)] - E[h(Y)] \big|,$$
where $\mathrm{Lip}(1)$ indicates the class of Lipschitz real-valued functions with Lipschitz constant $\leq 1$. It is well-known that the topology induced by $d_W$, on the class of probability measures on the real line, is strictly stronger than the one induced by the convergence in distribution.
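For intuition about the Wasserstein distance, recall its standard dual form $d_W(U, Y) = \sup_{h \in \mathrm{Lip}(1)} |E[h(U)] - E[h(Y)]|$. The sketch below is an illustration under stated assumptions rather than part of the paper's argument: it compares two empirical samples of equal size on the real line, for which the distance reduces to the mean absolute difference of the order statistics. The function names `wasserstein_1d` and `lipschitz_gap` are hypothetical.

```python
import math

def wasserstein_1d(xs, ys):
    """Wasserstein distance between two empirical laws on R with the same sample size."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    # In dimension one, the optimal coupling pairs the order statistics.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def lipschitz_gap(xs, ys, h):
    """|E[h(U)] - E[h(Y)]| under the two empirical laws, for a test function h."""
    return abs(sum(map(h, xs)) / len(xs) - sum(map(h, ys)) / len(ys))

xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 3.0]
d = wasserstein_1d(xs, ys)                 # every order statistic is shifted by 1
gap_abs = lipschitz_gap(xs, ys, abs)       # h(x) = |x| is 1-Lipschitz
gap_sin = lipschitz_gap(xs, ys, math.sin)  # h(x) = sin(x) is 1-Lipschitz
```

Any function in $\mathrm{Lip}(1)$ yields a lower bound for the distance, so both `gap_abs` and `gap_sin` are at most `d`.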
Problem 1.2 Find analytic conditions on the kernels $\{f_i^{(n)}\}$ ensuring that the sequence
$$\widetilde{F}_n := \frac{F_n - E[F_n]}{\sqrt{\mathrm{Var}(F_n)}}, \qquad n \geq 1,$$
converges in distribution, as $n \to \infty$, to a standard Gaussian random variable $N \sim \mathcal{N}(0, 1)$, in the sense of the Wasserstein distance. Determine under which assumptions these conditions are also necessary, and find explicit upper bounds for the sequence $d_W(\widetilde{F}_n, N)$, $n \geq 1$.
We will deal with Problem 1.2 in Section 3, where it is shown that a convenient solution can be deduced by using contraction operators. Among other features, these operators provide a neat way to deal with products of multiple stochastic integrals, and virtually replace the use of diagram formulae (see e.g. [24]). As anticipated, we will see that, in the specific case of random variables as in (1.1) such that $f_i^{(n)} \geq 0$, our results lead to necessary and sufficient conditions that are analogous to the so-called 'fourth moment theorems' for sequences of multiple integrals in a Gaussian setting (see [21]).
Remark 1.3 Problem 1.2 is also explicitly studied in [31, Section 4]. In particular, Theorem 4.1 in [31] provides bounds in the Wasserstein distance for random variables having a finite chaotic decomposition, where the bounds are expressed in terms of expectations of inner products of multiple integral stochastic processes. On the other hand, Theorem 4.7 in [31] provides an analytic bound, involving sums over partitions, for the normal approximation of absolutely convergent $U$-statistics. Here, we call 'analytic bound' any upper bound only involving deterministic transformations of the kernel determining the $U$-statistic, without any additional probabilistic component.

Random graphs
As anticipated, we shall now apply our main theoretical results to the study of geometric random graphs whose edge-counting statistics satisfy a CLT. The class of geometric random graphs considered below allows for long connections, in the sense that the geometric rule used to define edges is based on the use of arbitrarily large sets and therefore is not local. It is worth noting at this point that our setting represents a natural generalization of the so-called Gilbert graphs (see Example 1.5 below). Also, as explained in Remark 1.8 below, part of the models we consider cannot be dealt with by directly using the powerful theory of stabilization (see e.g. [14]). Now let the notation introduced in the previous section prevail. In what follows, we shall denote by $W$ (as in 'window') a measurable subset of $Z$ such that $\mu(W) < \infty$. We first introduce the notion of a geometric random graph based on the restriction of the Poisson measure $\eta$ to $W$, and on some symmetric set $H \subset Z \times Z$.
Definition 1.4 Let $H \subset Z \times Z$ be such that $H$ is symmetric (that is, for every $(x, y) \in H$, one also has $(y, x) \in H$) and $H$ is non-diagonal (that is, $H$ does not contain any pair of the type $(x, x)$).
(a) The random geometric graph based on $\eta$, $W$ and $H$ is the undirected random graph $G$ such that: (i) the vertices of $G$ are given by the class $V = \eta \cap W = \{x \in \eta : x \in W\}$, and (ii) a pair $\{x, y\}$ belongs to the set $E$ of the edges of $G$ if and only if $(x, y) \in H$. We observe that, since $H$ is non-diagonal, $G$ has no loops, that is: $G$ does not contain any edge of the type $\{x, x\}$.
(b) Assume in addition that $Z$ is a vector space. The random geometric graph at Point (a) is said to be stationary if there exists a set $\overline{H} \subset Z$ such that
$$H = \{(x, y) \in Z \times Z : x - y \in \overline{H}\}.$$
Note that, since $H$ is symmetric, one has necessarily that $\overline{H} = -\overline{H}$; moreover, since $H$ has no diagonal components, $0 \notin \overline{H}$.

Example 1.5 (i) The class of random geometric graphs introduced above generalizes the notion of a Gilbert graph, obtained by taking $Z$ equal to some metric space (endowed with a distance $d$) and $H = \{(x, y) \in Z \times Z : 0 < d(x, y) < \delta\}$ for some $\delta > 0$.
(ii) If, in addition, $Z$ is a vector space and $d$ is translation invariant, then the corresponding geometric random graph is stationary with $\overline{H} = B(0, \delta) \setminus \{0\}$, where $B(0, \delta) \subset Z$ stands for the open ball of radius $\delta$ centered at the origin. Graphs of this type are customarily called interval graphs when $Z = \mathbb{R}$, and disk graphs when $Z = \mathbb{R}^2$ (see e.g. [6,7,8,15,17] for recent developments on the subject).
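The connection rule of a Gilbert (disk) graph can be sketched in a few lines. This is only an illustration, assuming $Z = \mathbb{R}^2$ with the Euclidean distance and a finite configuration of points standing in for $\eta \cap W$; the names `in_h_bar` and `gilbert_edges` are hypothetical.

```python
import math
from itertools import combinations

def in_h_bar(v, delta):
    """Membership of the vector v in the open ball B(0, delta) minus the origin."""
    return 0.0 < math.hypot(v[0], v[1]) < delta

def gilbert_edges(points, delta):
    """Edges (i, j), i < j, such that x_i - x_j belongs to B(0, delta) minus the origin."""
    edges = []
    for (i, x), (j, y) in combinations(enumerate(points), 2):
        difference = (x[0] - y[0], x[1] - y[1])
        # The rule depends on the pair (x, y) only through x - y: the graph is stationary.
        if in_h_bar(difference, delta):
            edges.append((i, j))
    return edges

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.95, 1.0)]
edges = gilbert_edges(points, delta=0.2)   # two short edges: (0, 1) and (2, 3)
```

Since the ball is symmetric about the origin, testing $x_i - x_j$ or $x_j - x_i$ gives the same answer, and excluding the origin rules out loops.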
In Section 4, we shall use our general results in order to deal with the following problem.
Remark 1.7 (i) At this stage, the role of the window $W$ might seem immaterial, and indeed the substance of Problem 1.6 does not change if one takes $W = Z$. However, the above formulation allows for the more general case of a window $W = W_\lambda$ possibly depending on $\lambda$. Moving windows of this type appear in Section 4, as well as in the paper [11].
(ii) In many examples and applications, one considers sets $H_\lambda$ such that $\alpha(H_\lambda \cap (W \times W)) \downarrow 0$, as $\lambda \to \infty$, for some fixed measure $\alpha$ on $W \times W$. Heuristically, the fact that $\mu_\lambda(W) \uparrow \infty$ and $\alpha(H_\lambda \cap (W \times W)) \downarrow 0$ ensures that the following phenomenon takes place: as $\lambda$ grows, more and more vertices and edges are added to the geometric graph, whereas old edges are deleted as a consequence of the asymptotic negligibility of $H_\lambda \cap (W \times W)$. Solving Problem 1.6 in this framework is equivalent to characterizing all sequences of random geometric graphs such that the addition of vertices and the cancellation of edges compensate, thus generating asymptotic Gaussian fluctuations.
When specialized to the case of Gilbert graphs on $Z = \mathbb{R}^d$, Problem 1.6 is tackled in the classic reference [29, Chapter 3] as a special case of general subgraph counting. A comparison with the results of [29, Chapter 3] is provided in Section 4.3.1 below. A complete solution of Problem 1.6 for general subgraph counting in Gilbert graphs, based on the techniques developed in this paper, is presented in [11, Section 3]. See also [31, Section 6.2].
Remark 1.8 Assume that, for every x ∈ η, there exists a random radius R x such that all the y connected to x in the random graph lie in the ball with center x and radius R x .Then, the variable F = F (1, W ; η λ , H λ ) in (1.2) is stabilizing, meaning that F can be written in the form where ξ is such that ξ(x, η) is not modified by adding or removing a finite number of points to η outside the ball with center x and radius R x (see [14] for more details on this topic).In our case, to fit the framework of formula (1.2) in the case g = 1, ξ(x, η) should be defined as where #A indicates the cardinality of A. The CLTs presented for instance in [1,30] cover well this case.Remark that in this particular framework of a deterministic connection rule, stabilization theory only allows for a bounded length, while we consider here models where points can have arbitrarily long connections.
The rest of the paper is organized as follows. In Section 2, we discuss several background results concerning Poisson measures, Wiener chaos and $U$-statistics. Section 3 contains our main abstract results concerning the normal approximation of random variables having a finite chaotic decomposition. Section 4 focuses on random graphs and on several analytical characterizations of associated CLTs. An Appendix (see Section 5) provides some basic definitions and results of Malliavin calculus.

Multiple integrals and chaos
As before, $(Z, \mathcal{Z}, \mu)$ is a non-atomic Borel measure space, and $\eta$ is a Poisson measure on $Z$ with control $\mu$.
Remark 2.1 By virtue of the assumptions on the space (Z, Z , µ), and to simplify the discussion, we will assume throughout the paper that (Ω, F , P ) and η are such that where δ z denotes the Dirac mass at z, and η is defined as the canonical mapping Also, the σ-field F will be always supposed to be the P -completion of the σ-field generated by η.
Throughout the paper, for $p \in [1, \infty)$, the symbol $L^p(\mu)$ is shorthand for $L^p(Z, \mathcal{Z}, \mu)$. For an integer $q \geq 2$, we shall write $L^p(\mu^q) := L^p(Z^q, \mathcal{Z}^{\otimes q}, \mu^q)$, whereas $L^p_s(\mu^q)$ stands for the subspace of $L^p(\mu^q)$ composed of functions that are $\mu^q$-almost everywhere symmetric. Also, we adopt the convention $L^p(\mu) = L^p_s(\mu) = L^p(\mu^1) = L^p_s(\mu^1)$ and use the following standard notation: for every $q \geq 1$ and every $f, g \in L^2(\mu^q)$,
$$\langle f, g \rangle_{L^2(\mu^q)} = \int_{Z^q} f(z_1, \dots, z_q)\, g(z_1, \dots, z_q)\, \mu^q(dz_1, \dots, dz_q), \qquad \|f\|_{L^2(\mu^q)} = \langle f, f \rangle^{1/2}_{L^2(\mu^q)}.$$
For every $f \in L^2(\mu^q)$, we denote by $\tilde{f}$ the canonical symmetrization of $f$, that is,
$$\tilde{f}(z_1, \dots, z_q) = \frac{1}{q!} \sum_{\sigma} f(z_{\sigma(1)}, \dots, z_{\sigma(q)}),$$
where $\sigma$ runs over the $q!$ permutations of the set $\{1, \dots, q\}$. Note that $\|\tilde{f}\|_{L^2(\mu^q)} \leq \|f\|_{L^2(\mu^q)}$ (to see this, use for instance the triangle inequality).
Definition 2.2 For every deterministic function $h \in L^2(\mu)$, we write
$$I_1(h) = \hat{\eta}(h) = \int_Z h(z)\, \hat{\eta}(dz)$$
to indicate the Wiener-Itô integral of $h$ with respect to $\hat{\eta}$. For every $q \geq 2$ and every $f \in L^2_s(\mu^q)$, we denote by $I_q(f)$ the multiple Wiener-Itô integral, of order $q$, of $f$ with respect to $\hat{\eta}$. We also set $I_q(f) = I_q(\tilde{f})$, for every $f \in L^2(\mu^q)$ (not necessarily symmetric), and $I_0(b) = b$ for every real constant $b$.
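The canonical symmetrization and the fact that it does not increase the $L^2$ norm can be checked on a toy example. The sketch below is purely illustrative and departs from the paper's setting in one explicit way: it replaces the non-atomic space $(Z, \mu)$ by a finite set with point masses, so that integrals become finite sums. The names `symmetrize` and `norm_sq` are hypothetical.

```python
from itertools import permutations, product
from math import factorial

def symmetrize(f, q):
    """Canonical symmetrization: average of f over the q! permutations of its arguments."""
    perms = list(permutations(range(q)))
    def f_sym(*z):
        return sum(f(*(z[s] for s in sigma)) for sigma in perms) / factorial(q)
    return f_sym

def norm_sq(f, q, space, weight):
    """Squared L^2(mu^q) norm on a finite space, with mu({z}) = weight[z] (toy assumption)."""
    total = 0.0
    for z in product(space, repeat=q):
        mass = 1.0
        for zi in z:
            mass *= weight[zi]
        total += f(*z) ** 2 * mass
    return total

space = [0, 1, 2]
weight = {0: 1.0, 1: 1.0, 2: 1.0}
f = lambda z1, z2: float(z1)          # a non-symmetric kernel of order q = 2
f_sym = symmetrize(f, 2)              # (z1, z2) -> (z1 + z2) / 2
norm_f = norm_sq(f, 2, space, weight)
norm_f_sym = norm_sq(f_sym, 2, space, weight)
```

In this example the squared norm drops from `norm_f` to `norm_f_sym`, in agreement with the inequality between the norm of a kernel and that of its symmetrization.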
The reader is referred for instance to the monograph [24], by Peccati and Taqqu, for a complete discussion of multiple Wiener-Itô integrals and their properties (including the forthcoming Proposition 2.3 and Proposition 2.4).

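Proposition 2.3 The following equalities hold for every $q, m \geq 1$, every $f \in L^2_s(\mu^q)$ and every $g \in L^2_s(\mu^m)$:
$$E[I_q(f)] = 0, \qquad E[I_q(f) I_m(g)] = q!\, \langle f, g \rangle_{L^2(\mu^q)}\, \mathbf{1}_{(q = m)}.$$
The Hilbert space composed of the random variables with the form $I_q(f)$, where $q \geq 1$ and $f \in L^2_s(\mu^q)$, is called the $q$th Wiener chaos associated with the Poisson measure $\eta$. The following well-known chaotic representation property is an essential feature of Poisson random measures. Recall that $\mathcal{F}$ is assumed to be generated by $\eta$.
Proposition 2.4 (Wiener-Itô chaotic decomposition) Every random variable $F \in L^2(P)$ admits a (unique) chaotic decomposition of the type
$$F = E[F] + \sum_{i=1}^{\infty} I_i(f_i), \qquad (2.4)$$
where the series converges in $L^2(P)$ and, for each $i \geq 1$, the kernel $f_i$ is an element of $L^2_s(\mu^i)$.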

Star contractions and multiplication formulae
We shall now introduce contraction operators, and succinctly discuss some of their properties.
As anticipated in the Introduction, these objects are at the core of our main results.

About the Malliavin formalism
For the rest of the paper, we shall use definitions and results related to Malliavin-type operators defined on the space of functionals of the Poisson measure $\eta$. Our formalism coincides with the one introduced by Nualart and Vives in [22]. In particular, we shall denote by $D$, $\delta$, $L$ and $L^{-1}$, respectively, the Malliavin derivative, the divergence operator, the Ornstein-Uhlenbeck generator and its pseudo-inverse. The domains of $D$, $\delta$ and $L$ are written $\mathrm{dom}\,D$, $\mathrm{dom}\,\delta$ and $\mathrm{dom}\,L$. The domain of $L^{-1}$ is given by the subclass of $L^2(P)$ composed of centered random variables. For the convenience of the reader we have collected some crucial definitions and results in the Appendix (see Section 5). Here, we just recall that, since the underlying probability space $\Omega$ is assumed to be the collection of discrete measures described in Remark 2.1, one can meaningfully define the random variable $\omega \mapsto F_z(\omega) = F(\omega + \delta_z)$, $\omega \in \Omega$, for every given random variable $F$ and every $z \in Z$, where $\delta_z$ is the Dirac mass at $z$. One can then prove the following neat representation of $D$ as a difference operator.

U -statistics
Following [31, Section 3.1], we now introduce the concept of a U -statistic associated with the Poisson measure η.
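Definition 2.8 ($U$-statistics) Fix $k \geq 1$. A random variable $F$ is called a $U$-statistic of order $k$, based on the Poisson measure $\eta$, if there exists a kernel $f \in L^1_s(\mu^k)$ such that
$$F = \sum_{(x_1, \dots, x_k) \in \eta^k_{\neq}} f(x_1, \dots, x_k), \qquad (2.8)$$
where the symbol $\eta^k_{\neq}$ indicates the class of all $k$-dimensional vectors $(x_1, \dots, x_k)$ such that $x_i \in \eta$ and $x_i \neq x_j$ for every $1 \leq i \neq j \leq k$. As made clear in [31, Definition 3.1], the possibly infinite sum appearing in (2.8) must be regarded as an $L^1(P)$ limit. Plainly, a $U$-statistic of order one is just a linear functional of $\eta$, with the form
$$\sum_{x \in \eta} f(x) = \int_Z f(x)\, \eta(dx),$$
for some $f \in L^1(\mu)$. The following statement, based on the results proved by Reitzner and Schulte in [31], collects two crucial properties of $U$-statistics.
Theorem 2.9 (See [31]) Let $F \in L^1(P)$ be a $U$-statistic as in (2.8). Then, the following two properties hold.
(a) The expectation of $F$ is given by
$$E[F] = \int_{Z^k} f(z_1, \dots, z_k)\, \mu^k(dz_1, \dots, dz_k). \qquad (2.9)$$
(b) If $F$ is also square-integrable, then necessarily $f \in L^2_s(\mu^k)$, and the Wiener-Itô representation (2.4) of $F$ is such that $f_i = 0$, for $i \geq k + 1$, and
$$f_i(x_1, \dots, x_i) = \binom{k}{i} \int_{Z^{k-i}} f(x_1, \dots, x_i, y_1, \dots, y_{k-i})\, \mu^{k-i}(dy_1, \dots, dy_{k-i}), \qquad 1 \leq i \leq k, \qquad (2.10)$$
with the convention that $f_k = f$. One should note that formula (2.10) follows from an application of the results proved by Last and Penrose in [12].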

U -statistics and random graphs
In this paper, we will be interested in characterizing the Gaussian fluctuations of $U$-statistics having a specific support. In particular this allows one to deal with the set of 'local $U$-statistics' introduced by Reitzner and Schulte in [31, Section 6]. Recall that a measurable set $H \subset Z^k$ is called symmetric if the following implication holds: if $(x_1, \dots, x_k) \in H$, then $(x_{\sigma(1)}, \dots, x_{\sigma(k)}) \in H$ for every permutation $\sigma$ of $\{1, \dots, k\}$.
Definition 2.10 (Support of a $U$-statistic) Let $k \geq 2$, and let $H \subset Z^k$ be a measurable symmetric set. A $U$-statistic $F$ as in (2.8) is said to have support in $H$ if the function $f$ is such that
$$f(x_1, \dots, x_k) = 0 \quad \text{for every } (x_1, \dots, x_k) \notin H.$$
Example 2.11 (Local $U$-statistics) Let $Z$ be a metric space. Then, the class of local $U$-statistics, as defined in [31, Section 6], coincides with the family of $U$-statistics having support in a set of the type $H = \{(x_1, \dots, x_k) : \mathrm{diam}(\{x_1, \dots, x_k\}) < \delta\}$ for some $\delta > 0$. Here, the symbol $\mathrm{diam}(B)$ is shorthand for the diameter of $B$.
We shall now point out a well-known connection between $U$-statistics and hypergraphs. Recall that a hypergraph of order $k \geq 2$ is a pair $(V, E)$, where $V = (v_1, \dots, v_m)$ is a set of vertices, and $E = (E_1, \dots, E_s)$ is a collection of (possibly non-disjoint) subsets of $V$ (called edges), such that each $E_i$ contains exactly $k$ elements; in particular a hypergraph of order 2 is an undirected graph.
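Remark 2.12 ($U$-statistics as graph statistics) (i) Let $k \geq 2$, let $F$ be a $U$-statistic as in (2.8), and assume that $f = \mathbf{1}_{W^k} \times \mathbf{1}_H$, where $W \subset Z$ is some set (usually called a 'window') such that $\mu(W) < \infty$, and $H \subset Z^k$ is a symmetric set. Then, the random variable $\frac{1}{k!} F$ counts the number of edges in the random hypergraph $(V, E)$, obtained as follows: $V = \eta \cap W$, and the class of edges $E$ is composed of all subsets $\{x_1, \dots, x_k\} \subset V$ such that $(x_1, \dots, x_k) \in H$.
(ii) If $k = 2$, $Z$ is a metric space with distance $d$, and $H = \{(x_1, x_2) : 0 < d(x_1, x_2) < \delta\}$ for some $\delta > 0$, then the random variable $\frac{1}{2} F$ counts the number of edges in the undirected graph whose vertices $V$ are given by the points of $W$ charged by $\eta$ and such that two vertices $v_1, v_2$ are connected by an edge if and only if $0 < d(v_1, v_2) < \delta$. These are the 'Gilbert random graphs' discussed in Example 1.5(i).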
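The factor $\frac{1}{k!}$ relating a $U$-statistic with an indicator kernel to the number of (hyper)edges can be verified on a small deterministic configuration. The sketch below is an illustration under stated assumptions: the points are taken on the real line and are assumed to lie already in the window $W$, the kernel is the indicator of the local set $\{\mathrm{diam} < \delta\}$, and the names `diam`, `u_statistic` and `count_hyperedges` are hypothetical.

```python
from itertools import combinations, permutations
from math import factorial

def diam(xs):
    """Diameter of a finite set of real numbers."""
    return max((abs(a - b) for a, b in combinations(xs, 2)), default=0.0)

def u_statistic(points, kernel, k):
    """Sum of the kernel over all ordered k-tuples of pairwise distinct points."""
    return sum(kernel(*t) for t in permutations(points, k))

def count_hyperedges(points, kernel, k):
    """Number of k-element subsets on which the (symmetric, 0/1-valued) kernel equals 1."""
    return sum(1 for s in combinations(points, k) if kernel(*s) == 1)

delta = 0.2
kernel = lambda *xs: 1 if diam(xs) < delta else 0   # indicator of a local symmetric set
points = [0.0, 0.05, 0.1, 1.0]                      # distinct points, assumed to lie in W

results = {
    k: (u_statistic(points, kernel, k), count_hyperedges(points, kernel, k))
    for k in (2, 3)
}
```

Each unordered subset of $k$ distinct points appears $k!$ times among the ordered tuples, which is why the first entry of each pair in `results` equals $k!$ times the second.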
To conclude, we present the notion of a stationary U -statistic.It will play an important role in Section 4. Definition 2.13 (Stationary U -statistics) Fix k 2, assume that Z is a vector space, and let F be a U -statistic of the type (2.8), having support in a symmetric set H).We shall say that F is stationary if there exists H ⊂ Z k−1 such that (2.11) Then, the corresponding U -statistic F is stationary, with H = B(0, δ), where B(0, δ) ⊂ Z stands for the open ball of radius δ centered at the origin.See Example 1.5(ii).
3 Normal approximations for finite chaotic expansions

Framework
We shall tackle Problem 1.2 by focussing on the normal approximation of random variables $F$ having the form
$$F = E[F] + \sum_{i=1}^{k} I_{q_i}(f_i), \qquad (3.12)$$
where:
- $k \geq 1$ is an integer;
- the integers $q_i$, $i = 1, \dots, k$, are such that $1 \leq q_1 < q_2 < \cdots < q_k$;
- the symbol $I_q$ indicates a multiple Wiener-Itô integral of order $q$, with respect to a centered Poisson measure $\hat{\eta} = \eta - \mu$, where $\eta$ is a Poisson measure on the Borel measurable space $(Z, \mathcal{Z})$, with deterministic and $\sigma$-finite control measure $\mu$;
- each kernel $f_i$ is a nonzero element of $L^2_s(\mu^{q_i})$, and the class $\{f_i : i = 1, \dots, k\}$ verifies in addition the forthcoming Assumption 3.1.
Assumption 3.1 (Technical assumptions on integrands) Let the notation of Section 2.2 prevail. Every random variable of the type (3.12) considered in the sequel of this paper is such that the following properties (i)-(iii) are verified.
Remark 3.3 Assumption 3.1 implies that Assumptions A-B-C in [27] are verified, so that the computations therein can be directly applied in our framework.

A general bound
Let F be a random variable as in (3.12) such that E[F 2 ] = σ 2 > 0 (σ > 0) and E[F ] = m ∈ R, and consider a Gaussian random variable N ∼ N (m, σ 2 ) with the same mean and variance.Then, a slight modification of [26, Theorem 3.1] (the modification resides in the fact that we consider an arbitrary variance σ 2 ) yields the following estimates: where The next statement shows that B 2 (F ; σ 2 ) can be further bounded in terms of the contractions introduced in Section 2.2.Theorem 3.5 Let F and N be the random variables appearing in (3.13)- (3.15).Then, there exists a universal constant C 0 = C 0 (q 1 , ..., q k ) ∈ (0, ∞), depending uniquely on q 1 , ..., q k , such that where In the previous expression, max 1 ranges over all 1 i k such that q i > 1, and all pairs (r, l) such that r ∈ {1, ..., q i } and 1 l r ∧ (q i − 1), whereas max 2 ranges over all 1 i < j d and all pairs (r, l) such that r ∈ {1, ..., q i } and l ∈ {1, ..., r}.When in the previous bound by the smaller quantity

.18)
Proof of Theorem 3.5.Without loss of generality, we can assume that m = 0. Also, throughout this proof, we write so that one can directly apply [27, Proposition 5.5] and deduce that there exists a constant a = a(q 1 , ..., q k ) such that To conclude, observe that so that [27, Proposition 5.6] implies that there exists a constant b = b(q 1 , ..., q k ) such that Taking C 0 = a + b yields the desired conclusion.The last assertion in the statement comes from the fact that, when Remark 3.6 According to [27, Lemma 2.9], for every quadruple (i, j, r, l) entering the expression of max 2 in (3.17), the following estimate holds:

Estimates for positive kernels
We shall now specialize Theorem 3.5 to the case of random variables having the form (3.12) and such that $f_i \geq 0$. In particular, we shall unveil some useful connections between the quantity $B_3(F; \sigma)$ and the fourth cumulant of $F$.
Remark 3.7 Random variables admitting a Wiener-Itô chaotic expansion with positive kernels appear rather naturally in stochastic geometry. For instance, an application of (2.10) shows that any $U$-statistic with a positive kernel admits a Wiener-Itô chaotic expansion of this type. Note that many counting statistics have the form of $U$-statistics with an integer-valued (and therefore nonnegative) kernel, such as for instance the subgraph-counting statistics in random geometric graphs (see e.g. [29, Chapter 3] and the references therein), or the statistics associated with hyperplane tessellations considered in [9].
The following statement concerns random variables of the form (3.12), in the special case where E[F ] = 0, k = 1 and the multiple stochastic integral has a nonnegative kernel.Proposition 3.8 (Fourth moment bound, I) Consider a random variable F as in (3.12), with the special form F = I q (f ), q 1, where f 0 and and there exist universal constants c 1 = c 1 (q) < C 1 = C 1 (q), depending uniquely on q, such that Proof.Using the multiplication formula (2.7), together with (2.6) and the positivity assumptions on f , one sees that F 2 can be written in the following form where R is a random variable orthogonal to the constants and to I 2q (f 0 0 f ) and such that for some universal positive constants J, K depending uniquely on q.The conclusion is obtained by using the relation where we have used [24, formula (11.6.30)].
The following general bound deals with random variables of the form (3.12) and with positive kernels.Proposition 3.9 (Fourth moment bound, II) Let F be as in (3.12), with k 1, and assume moreover that E[F ] = 0, and E[F 2 ] = σ 2 > 0, and f i 0 for every i.Then, there exists a universal constant C 2 = C 2 (q 1 , ..., q k ), depending uniquely on q 1 , ..., q k , such that max Proof.Write as before F i = I q i (f i ), and where V d stands for the collection of those (i 1 , i 2 , i 3 , i 4 ) ∈ {1, ..., k} 4 , such that one of the following conditions is satisfied: (a) the elements of (i 1 , i 2 , i 3 , i 4 ) are all distinct.Applying the multiplication formula (2.7) and exploiting the fact that each f i is nonnegative, we immediately deduce that Y 0 and Z 0, so that the desired conclusion follows from Proposition 3.8.

Conditions for asymptotic Gaussianity
This section contains a general statement (Theorem 3.12) about the normal approximation of random variables admitting a finite chaotic decomposition. The first part of such a result provides sufficient conditions for Central Limit Theorems, that are directly based on Theorem 3.5. As indicated in the subsequent parts, these conditions become necessary whenever the involved kernels are nonnegative. Theorem 3.12 is one of the main results of the paper, and is the main tool used to deduce the CLTs appearing in Section 4 and in [11].
More precisely, in what follows we shall fix integers k 1 and 1 q 1 < q 2 < ... < q k (not depending on n), and consider a sequence {F (n) : n 1} of random variables with the form each verifying the same assumptions as the random variable F appearing in (3.12) (in particular, Assumption 3.1 is satisfied for each n).We also use the following additional notation: (i) Remark 3.10 One of the main achievements of the forthcoming Theorem 3.12 is the 'fourth moment theorem' appearing at Point 3 in the statement, which only holds for random variables such that the kernels in the chaotic decomposition are nonnegative.As first proved in [21] (see also [19,Chapter 5]) an analogous result holds for general sequences of multiple Wiener-Itô integrals with respect to a Gaussian process.In particular, in the Gaussian framework one does not need to assume that the integrands have a constant sign.Proving analogous statements in a Poisson setting is quite a demanding task, one of main reasons being the rather intricate multiplication formula (2.7).Some previous partial findings in the Poisson case can be found in Peccati and Taqqu [23] (for sequences of double integrals) and in Peccati and Zheng [28] (for sequences of multiple integrals having the form of homogeneous sums).

Remark 3.11
In the statement of Theorem 3.12, we implicitly allow that the underlying Poisson measure $\eta$ also changes with $n$. In particular, one can assume that the associated control measure $\mu = \mu_n$ explicitly depends on $n$. This general framework is needed for the geometric applications developed in Section 4.

Assume that f
(n) i 0 for every i, n.Then, a sufficient condition in order to have that

Assume that f
(n) i 0 for every i, n, and also that the sequence (F (n) ) 4 , n 1, is uniformly integrable.Then, the following conditions (a)-(c) are equivalent, as n → ∞: (a) Then, one has that (see e.g.[19, Proposition 3.6.1]) so that the desired conclusion follows from Theorem 3.5 as well as the inequality Using (3.19), we see that the last relation implies that B 3 (F (n) ; σ(n)) → 0, so that the desired conclusion follows from Point 1 in the statement.
3. In view of Point 1 and Point 2 in the statement, we shall only prove that (a) ⇒ (c).To prove this implication, just observe that 4 , so that the conclusion follows from the fact that σ 2 (n) → σ 2 .Remark 3.13 A sufficient condition in order to have that the sequence {(F (n) ) 4 } is uniformly integrable is the following: there exists some > 0 such that We shall use some estimates taken from [27] (see, in particular, Table 2, p. 1505, therein).Given a thrice differentiable function ϕ : R k → R, we set Proposition 3.14 Let the assumptions and notation of Theorem 3.12 prevail, and suppose that B 3 (F (n) ; σ) → 0, as n → ∞.Let N n,i ∼ N (0, σ 2 i (n)), i = 1, ..., k, Then, for every thrice differentiable function ϕ : R k → R, such that ϕ ∞ , ϕ ∞ < ∞, one has that Proof.According to [27], the following estimate takes place: so that the conclusion follows from (3.16).
4 Edge-counting in random geometric graphs: from Gaussian fluctuations to clustering

Framework
Our aim is now to tackle Problem 1.6.Throughout this section we shall work under the following slightly more restrictive setting (we use the notation of Problem 1.6).
- The symmetric function $g : W \times W \to \mathbb{R}$ is bounded (this assumption can be relaxed; see the discussion below).
In the forthcoming Section 4.2, we will show that the normal approximation of the random variables F (g, W ; η λ , H λ ) (as defined in (1.2)) can be completely characterized in terms of some diagonal restrictions of Cartesian products of the sets H λ .Among several consequences, this remarkable phenomenon implicitly provides a new geometric interpretation of contraction operators.
The following expressions are obtained by setting g = 1: λ );

General conditions and bounds
We start with a general estimate.
Theorem 4.2 (General bound for geometric graphs) Let the previous assumptions and notation prevail, and let $N \sim N(0, 1)$. Then, there exists a universal constant $C$, not depending on $\lambda$, such that, for every $\lambda > 0$,
If the class $\{\tilde F(g, W; \eta_\lambda, H_\lambda)^4 : \lambda > 0\}$ is uniformly integrable, then the RHS of (4.24) converges to zero, as $\lambda \to \infty$, if and only if the CLT (1.3) takes place.
Proof. In what follows, we write $F_\lambda = F(g, W; \eta_\lambda, H_\lambda)$ and $\tilde F_\lambda = \tilde F(g, W; \eta_\lambda, H_\lambda)$ to simplify the notation. Last and Penrose's formula (2.10) implies that the random variable $F_\lambda$ admits the following chaotic decomposition where Routine computations imply then that It follows that The upper bound (4.24) is now obtained by using (3.17), as well as the following relations, that can be proved by a standard use of the Fubini Theorem: The last assertion in the statement follows from a direct application of Theorem 3.12.
The next two statements provide simplified bounds in case one of the two elements of the chaotic decomposition of $F(g, W; \eta_\lambda, H_\lambda)$ converges to zero, as $\lambda \to \infty$. The proof (which is standard and left to the reader) uses (4.26) as well as the following basic estimate: if $Q, R, S$ are three random variables in $L^1(P)$, then
then there exists a constant $C_1$, independent of $\lambda$, such that, for $\lambda$ large enough, × max{A λ (g)} . (4.28)
If the class $\{\tilde F(g, W; \eta_\lambda, H_\lambda)^4 : \lambda > 0\}$ is uniformly integrable, then the RHS of (4.28) converges to zero, as $\lambda \to \infty$, if and only if the CLT (1.3) takes place.

Edge counting in stationary graphs
For the rest of the section, we fix an integer $d \geq 1$. For every $\lambda > 0$, we define the set . We now specialize the framework of the previous two sections to the following setting where $\ell$ is the Lebesgue measure on $\mathbb{R}^d$. We shall assume that, for every $\lambda > 0$, the symmetric non-diagonal set $H_\lambda$ has the form for some set $\overline{H}_\lambda$ verifying
Remark 4.5 We insist that the novelty here (with respect to the usual setting of disk graphs; see e.g. [29, Chapter 3] and the references therein) is that $\overline{H}_\lambda$ need not be bounded, allowing for arbitrarily distant points to be connected. This is especially relevant whenever where $\alpha_\lambda$ is a scaling factor and $\overline{H}_1$ is a fixed unbounded geometric connection rule. Unlike in the classical literature of stochastic geometry, e.g. in stabilization theory, this allows for models with unbounded interactions, such as between distant particles. As already recalled, our approach is further applied in [11], where $U$-statistics with general stationary kernels (not only taking values 0 or 1), and general order $k \geq 2$, are considered.
For every $\lambda > 0$, we shall write where we used the notation introduced in (1.2)-(1.3). With this notation, each $\frac{1}{2} F_\lambda$ is a stationary $U$-statistic (see Definition 2.13), counting the number of edges in the stationary random graph based on $H_\lambda$ (see Definition 1.4). The chaotic decomposition of $F_\lambda$ is written where we have adopted the same notation as in (4.25). Since $g = 1$, Problem 1.6 becomes the following: characterize all collections of sets $\{H_\lambda\}$ such that the CLT (1.3) takes place, and assess the rate of convergence in the Wasserstein distance.
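Remark 4.6 For every $\lambda > 0$, one has the equality in law where $\eta$ is a random Poisson measure with Lebesgue intensity, and $G_\lambda$ is a measurable subset of $\mathbb{R}^d$ defined by the relation so that
Remark 4.7 (Asymptotic equivalence notation) Given two mappings $\lambda \mapsto \gamma_\lambda$, $\lambda \mapsto \delta_\lambda$, we write $\gamma_\lambda \asymp \delta_\lambda$ if there are two positive constants $C, C' > 0$ such that $C\gamma_\lambda \leq \delta_\lambda \leq C'\gamma_\lambda$ for $\lambda$ sufficiently large. We write $\gamma_\lambda \sim \delta_\lambda$ if $\delta_\lambda > 0$ for $\lambda$ sufficiently large and $\gamma_\lambda/\delta_\lambda \to 1$.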
One of the main points developed in the present section is that the asymptotic Gaussianity of the class $\{\tilde F_\lambda\}$ can be effectively studied by using the occupation coefficient of $H_\lambda$, defined as We also write In order to obtain necessary and sufficient conditions for asymptotic normality, we will often work under the additional assumption that In this case, one has trivially that ψ(λ) ψ(λ) ψ(λ), and the value of $\psi$ is only relevant up to a fixed multiplicative constant.
Remark 4.8 (O-regularity) Assume that the geometric rule defined by $G_\lambda$ does not depend on $\lambda$, i.e.: $G_\lambda = G$ for some fixed measurable set $G$, in such a way that each set $H_\lambda$ is obtained by rescaling $G$ by a factor $\lambda^{-1/d}$. Then, condition (4.40) is implied by the following stronger assumption: $\psi(a\lambda) \asymp \psi(\lambda)$ for every $a > 0$. In the terminology of [3, Section 2.2], this is equivalent to saying that $\psi$ is O-regular.
In view of using the bounds appearing in Theorem 4.2, we have the following crucial estimates: Theorem 4.9 Let the previous notation and assumption prevail, set The following estimates are in order for every fixed λ > 0: Proof.We introduce the changes of variables denoted by ϕ (i) , i = 0, 1, 2, 3, 4, where Using the notation introduced in Definition 4.1 we have 1 u∈H λ dx 1 du, Using the inclusions 4   and the result follows.
The next statement provides one of the main results of this section: it gives an exhaustive characterization of the asymptotic behavior of $F_\lambda$, whenever (4.40) is in order. In order to allow for a comparison with the existing literature, we classify the asymptotic behavior of $F_\lambda$ according to four regimes, denoted by (R1)-(R4). Such a classification is based on the proportion $\psi(\lambda)$ of space occupied by $H_\lambda$ in the observation window, determining the influence area of a given point of the Poisson measure. This coefficient has to be compared with $\lambda^{-1}$, which corresponds to the total window measure divided by the mean number of points. The four regimes are the following:
(R4) The mapping $\lambda \mapsto \lambda^2\psi(\lambda)$ is bounded.
The thermodynamic regime corresponds (after rescaling) to the usual models where the geometry of the interactions does not change as the window of observation grows to the whole space (see Remark 4.6). We will see in Section 4.3.1 that, when specialized to Poissonized disk graphs, our asymptotic approximations and variance estimates concur with those obtained in [29, Chapter 3]. Under regimes (R2) and (R3), there is asymptotic normality with convergence at speed $\lambda^{-1/2}$ in the Wasserstein distance. Under (R1) the convergence to the normal law is slower, and under (R4) the asymptotic normality is lost: for any converging subsequence the limit is either Poisson or zero.
Remark 4.10 One interesting contribution of Theorem 4.11 appears at the end of Point (ii), where it is stated that, under the thermodynamic regime, both chaotic projections of the random variable $\tilde F_\lambda$ contribute to the limit and satisfy a joint CLT. This kind of phenomenon is an example of the "fine Gaussian fluctuations" appearing in the title of the paper.
Theorem 4.11 Let $\{H_\lambda : \lambda > 0\}$ be a family of subsets of $\mathbb{R}^d$ satisfying (4.32) and let ψ, ψ be defined according to (4.38)-(4.39). Assume in addition that (4.40) is satisfied, and consider a random variable $N \sim N(0, 1)$. The quantities introduced in Section 4.1 satisfy the following relations: there exist constants $0 < k < K < \infty$, independent of $\lambda$, such that A λ λ −1/2 , and Furthermore, one can choose $K$ in such a way that the following properties (i)-(iii) are verified.
(i) (Regime (R2)) If λψ(λ) → ∞, the first chaos projection F 1,λ dominates and In this case one has also that, as λ → ∞, the pair converges in distribution to a two-dimensional Gaussian vector (N 1 , N 2 ), such that N i ∼ N (0, 1) and N 1 , N 2 are independent.
(ii) Applying again (4.24), the conclusion is deduced from Point (i), because $\lambda\psi(\lambda) \geq c$ for some constant $c > 0$ and for $\lambda$ large enough. The last statement at Point (ii) follows from an application of Proposition 3.14.
(iii) Using (4.24) again yields for $\lambda$ large enough, because $\lambda\psi(\lambda) \to 0$. To conclude the proof, we have to show that, if $\lambda^2\psi(\lambda)$ does not diverge to infinity, then $\tilde F_\lambda$ does not converge in distribution to $N$.
To prove this negative result, one could apply the product formula (2.7) to prove that, whenever $\lambda^2\psi(\lambda)$ is not diverging to infinity and is bounded away from zero, there exists a sequence $\lambda_n$, $n \geq 1$, such that $\lambda_n \to \infty$ and $\sup_n E[\tilde F_{\lambda_n}^6] < \infty$, so that the desired conclusion is deduced from the last part of Theorem 4.2 (the case when $\lambda^2\psi(\lambda)$ is not bounded away from zero can be dealt with by a direct argument). However, the statement of the forthcoming Theorem 4.14 is much stronger, and it is therefore not necessary to spell out the details of these computations.
Corollary 4.12 Assume that the geometric rule defined by $G_\lambda$ (see (4.35)) does not depend on $\lambda$, in such a way that $G_\lambda = G$ for some fixed measurable set $G$. Assuming (4.40) (see Remark 4.8), one has that $\tilde F_\lambda$ converges in distribution to $N \sim N(0, 1)$, with a rate at most of the order $\lambda^{-1/2}$ with respect to $d_W$.
Proof. We are in one of the following situations: It corresponds to the case (ii) in Theorem 4.11, meaning the two chaoses codominate. It follows that $\tilde F_\lambda$ converges to the normal law with a rate at most of the order of $\lambda^{-1/2}$ with respect to $d_W$.
2. If $G$ does not have finite measure, $\lambda\psi(\lambda) \asymp \ell(G \cap Q_\lambda) \to \infty$ and we are in the situation of Point (i) of Theorem 4.11, that is: the first chaos dominates. We therefore deduce that $d_W(\tilde F_\lambda, N) \leq K\lambda^{-1/2}$, for some $K > 0$, and the conclusion follows.
As announced, we shall now deal more thoroughly with the case where $\lambda^2\psi(\lambda)$ does not diverge to infinity. In the proof of the next statement we shall use the following notation: if $X$ is a random variable with finite moments of every order, then we write $\{\chi_m(X) : m \geq 1\}$ to indicate the sequence of its cumulants (see [24, Chapter 3] for an introduction to this concept). For instance, $\chi_1(X) = E[X]$, $\chi_2(X) = \mathrm{Var}(X)$, and so on.
Remark 4.13 The proof of Theorem 4.14 provided below is based on diagram formulae and the method of moments and cumulants. An alternate proof could be deduced from the classic results by Silverman and Brown [36], combined with a Poissonization argument. Another proof of this result, complete with explicit bounds in the total variation distance, appears in [25]. The proof provided below has the merit of illustrating an application of diagram formulae (that are typically used to deduce CLTs) to a non-central result.
In particular, the family $\{\tilde F_\lambda\}$ does not verify a CLT as $\lambda \to \infty$.
Proof. Since $\mathrm{Var}(F_\lambda) \sim \mathrm{Var}(F_{2,\lambda}) \asymp \lambda^2\psi(\lambda)$, Point (i) is immediately deduced from the Bolzano-Weierstrass theorem. Point (ii) follows from a direct application of Campbell's Theorem (see [32, Theorem 3.1.3]), yielding that, as $n \to \infty$, We shall prove Point (iii) by using the method of cumulants. First of all, we observe that since $\lambda_n^2\psi(\lambda_n)$ is bounded and bounded away from zero, one has that $V_{1,\lambda_n}^2 \asymp \lambda_n^3\psi(\lambda_n)^2 \to 0$, that is: as $n \to \infty$, the limits of $F_{2,\lambda_n}$ and $F_{\lambda_n} - E[F_{\lambda_n}]$ coincide. We recall that the law of the random variable $X = 2P(c/2)$ is determined by its moments or, equivalently, by its cumulants (see e.g. [24, pp. 42-43]). Standard computations imply that $\chi_1(X) = 0$ and, for every $m \geq 2$, $\chi_m(X) = 2^{m-1}c$. We are therefore left to show that, for every $m \geq 3$, given by the juxtaposition of $m$ copies of $f_{2,\lambda}$, and (2) identify two variables $x_i, x_j$ in the argument of $\Phi$ if and only if $i$ and $j$ are in the same block of $\pi$. According to [24, Corollary 7.4.1], one has therefore that where the symbol $M_m$ stands for the class of those partitions $\pi$ of On the other hand, if $\pi \in M_m$ and $|\pi| \geq 3$, a change of variables analogous to the ones defined in the proof of Theorem 4.9 yields that, for some constant $C$ independent of $n$, thus concluding the proof.
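The cumulant identity quoted in the proof can be checked by a short computation. The sketch below assumes that $P(c/2)$ stands for a centered Poisson random variable with parameter $c/2$, that is, $P(c/2) = Y - \theta$ with $Y$ Poisson of mean $\theta = c/2$ (an assumption, consistent with $\chi_1(X) = 0$); the symbols $Y$, $\theta$, $a$ and $b$ are introduced only for this computation.

```latex
\begin{align*}
\log E\big[e^{tY}\big] &= \theta\,(e^{t}-1) = \sum_{m\geq 1} \theta\,\frac{t^{m}}{m!}
  \quad\Longrightarrow\quad \chi_m(Y) = \theta \ \text{ for every } m \geq 1,\\
\chi_m(aY+b) &= a^{m}\,\chi_m(Y) \quad \text{for every } m \geq 2 \text{ and } a, b \in \mathbb{R},\\
\chi_1(X) &= E\big[2(Y-\theta)\big] = 0, \qquad
\chi_m(X) = 2^{m}\,\theta = 2^{m}\cdot\frac{c}{2} = 2^{m-1}c \quad (m \geq 2).
\end{align*}
```

The second line uses the fact that cumulants of order at least two are invariant under translations and homogeneous of degree $m$ under scaling.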

Two examples
We now present some explicit examples. The notation of Section 4.3 will prevail throughout this section.
for some $r_\lambda > 0$, meaning that two points of $\eta$ in $Q_\lambda$ are connected whenever their distance is smaller than $r_\lambda$. It yields $\psi(\lambda) \asymp r_\lambda^d/\lambda$ (it is easy to verify that (4.40) is satisfied). Then $\tilde F_\lambda$ is asymptotically normal iff $\lambda r_\lambda^d \to \infty$, and ).
According to the classification based on the four regimes (R1)-(R4), the above result yields the following exhaustive description of the asymptotic behavior of $F_\lambda$ (note how we are able to distinguish the contribution of each chaotic projection):
(R1) If $r_\lambda \to 0$ and $\lambda r_\lambda^d \to \infty$, then $\mathrm{Var}(F_\lambda) \asymp \lambda r_\lambda^d$, $\tilde F_\lambda$ satisfies a CLT with an upper bound of the order of $(\lambda r_\lambda^d)^{-1/2}$ on the Wasserstein distance, and the projection of $\tilde F_\lambda$ on the second Wiener chaos dominates in the limit.
, $\tilde F_\lambda$ satisfies a CLT with an upper bound of the order of $\lambda^{-1/2}$ on the Wasserstein distance, and the projection of $\tilde F_\lambda$ on the first Wiener chaos dominates.
(R3) If $r_\lambda \asymp 1$, then $\mathrm{Var}(F_\lambda) \asymp \lambda$, $\tilde F_\lambda$ satisfies a CLT with an upper bound of the order of $\lambda^{-1/2}$ on the Wasserstein distance, and the projections of $\tilde F_\lambda$ on the first and second Wiener chaos both contribute to the limit and satisfy a joint CLT.
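The variance orders listed above for the disk graph can be explored numerically. The following is a minimal simulation sketch, not part of the original argument: it assumes that $Q_\lambda$ is the centered cube of volume $\lambda$ in $\mathbb{R}^d$ carrying a Poisson measure with Lebesgue intensity, and that $F_\lambda$ equals twice the number of edges. The function names `poisson_sample`, `count_edges`, `sample_F` and `empirical_mean_var` are hypothetical helpers introduced for illustration.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson(mean) variate by counting the arrivals of a
    unit-rate Poisson process (i.i.d. Exp(1) gaps) before time `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def count_edges(points, radius):
    """Number of unordered pairs of distinct points at distance < radius."""
    edges = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < radius:
                edges += 1
    return edges


def sample_F(lam, radius, d, rng):
    """One draw of F_lambda = 2 * (number of edges) for the disk graph.

    Assumption: Q_lambda is the centered cube of volume lam, and the
    Poisson measure has Lebesgue (unit) intensity on it.
    """
    side = lam ** (1.0 / d)
    n = poisson_sample(lam, rng)
    points = [
        tuple(rng.uniform(-side / 2, side / 2) for _ in range(d))
        for _ in range(n)
    ]
    return 2 * count_edges(points, radius)


def empirical_mean_var(lam, radius, d=2, reps=200, seed=0):
    """Sample mean and unbiased sample variance of F_lambda over `reps` draws."""
    rng = random.Random(seed)
    draws = [sample_F(lam, radius, d, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((x - mean) ** 2 for x in draws) / (reps - 1)
    return mean, var
```

For instance, calling `empirical_mean_var(lam, radius=1.0)` for increasing values of `lam` gives a rough empirical view of how the variance grows when $r_\lambda \asymp 1$. The pairwise loop is quadratic in the number of points, so the sketch is only meant for moderate values of `lam`.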
Remark 4.17 For every fixed $\lambda$, the $U$-statistic $\frac{1}{2}F_\lambda$ has the same law as the random variable counting the number of edges in a disk graph, with radius $\delta_\lambda = \lambda^{-1/d} r_\lambda$, based on random points of the form $\{Y_1, ..., Y_{N(\lambda)}\}$, where $\{Y_i\}$ indicates a collection of i.i.d. random variables uniformly distributed on $Q_1 = [-\frac{1}{2}, \frac{1}{2}]^d$, and $N(\lambda)$ is an independent Poisson random variable with parameter $\lambda$. As such, each $\frac{1}{2}F_\lambda$ is just a subgraph counting statistic based on a Poissonized random geometric graph, and enters the general framework of [29, Section 3.4], where general $m$-dimensional CLTs are obtained for these objects. It is immediately checked that our variance estimates coincide with those stated in [29, p. 56] (for the case $k = 2$), whereas our estimates in the Wasserstein distance refine the findings of [29, Theorems 3.9 and 3.10] (in the case $k = 2$ and $m = 1$), where no information on the rate of convergence is given. Previous references for CLTs for Poissonized disk graphs are [2, 10], where no explicit rates of convergence are provided either. A generalization of the previously described findings to general subgraph counting in a disk graph model can be found in [11, Section 3].
In all cases, the convergence to a normal law goes hand in hand with the almost sure convergence of the number of connections to infinity, and with the convergence of the variance to infinity. In the case (d), the convergence is very slow: the number of connections behaves asymptotically like a Poisson law with parameter $\log(\lambda)$, due to the long-range connections within the point process. In the case (e), the asymptotic properties of $G_\lambda$ do not yield long-range connections and the number of connections converges towards a Poisson-type limit.
We now define some Malliavin-type operators associated with a Poisson measure $\eta$, on the Borel space $(Z, \mathcal{Z})$, with non-atomic control measure $\mu$. We follow the work by Nualart and Vives [22].
The derivative operator D.
For every $F \in L^2(P)$, the derivative of $F$, $DF$, is defined as an element of $L^2(P; L^2(\mu))$, that is, of the space of the jointly measurable random functions $u : \Omega \times Z \to \mathbb{R}$ such that $E[\int_Z u_z^2\,\mu(dz)] < \infty$.
Definition 5.1 1. The domain of the derivative operator $D$, written dom$D$, is the set of all random variables $F \in L^2(P)$ admitting a chaotic decomposition (2.4) such that
2. For any $F \in$ dom$D$, the random function $z \mapsto D_zF$ is defined by
The divergence operator $\delta$.
Thanks to the chaotic representation property of $\eta$, every random function $u \in L^2(P, L^2(\mu))$ admits a unique representation of the type where the kernel $f_k$ is a function of $k+1$ variables, and $f_k(z, \cdot)$ is an element of $L_s^2(\mu^k)$. The divergence operator $\delta$ maps a random function $u$ in its domain to an element of $L^2(P)$.
Definition 5.2 1. The domain of the divergence operator, denoted by dom$\delta$, is the collection of all $u \in L^2(P, L^2(\mu))$ having the above chaotic expansion (5.42) and satisfying the condition:
2. For $u \in$ dom$\delta$, the random variable $\delta(u)$ is given by where $\tilde f_k$ is the canonical symmetrization of the $k + 1$ variables function $f_k$.
As made clear in the following statement, the operator δ is indeed the adjoint operator of D. The proof of Lemma 5.3 is detailed e.g. in [22].
The Ornstein-Uhlenbeck generator L.
Definition 5.4 1. The domain of the Ornstein-Uhlenbeck generator, denoted by dom$L$, is the collection of all $F \in L^2(P)$ whose chaotic representation verifies the condition:
2. The Ornstein-Uhlenbeck generator $L$ acts on a random variable $F \in$ dom$L$ as follows:
The pseudo-inverse of $L$.
Definition 5.5 1. The domain of the pseudo-inverse $L^{-1}$ of the Ornstein-Uhlenbeck generator is the space $L_0^2(P)$ of centered random variables in $L^2(P)$.
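For orientation, the action of the four operators on chaotic expansions in the standard framework of [22] can be summarized as follows. This recap is a sketch under the assumption that (2.4) is the decomposition $F = E[F] + \sum_{n \geq 1} I_n(f_n)$ with symmetric kernels $f_n \in L_s^2(\mu^n)$, and that (5.42) reads $u_z = \sum_{k \geq 0} I_k(f_k(z, \cdot))$; these normalizations should be checked against (2.4) and (5.42).

```latex
\begin{align*}
D_z F &= \sum_{n\geq 1} n\, I_{n-1}\big(f_n(z,\cdot)\big),
  & F \in \mathrm{dom}D &\iff \sum_{n\geq 1} n\, n!\, \|f_n\|^2_{L^2(\mu^n)} < \infty,\\
\delta(u) &= \sum_{k\geq 0} I_{k+1}\big(\tilde f_k\big),
  & u \in \mathrm{dom}\delta &\iff \sum_{k\geq 0} (k+1)!\, \|\tilde f_k\|^2_{L^2(\mu^{k+1})} < \infty,\\
L F &= -\sum_{n\geq 1} n\, I_n(f_n),
  & F \in \mathrm{dom}L &\iff \sum_{n\geq 1} n^2\, n!\, \|f_n\|^2_{L^2(\mu^n)} < \infty,\\
L^{-1} F &= -\sum_{n\geq 1} \frac{1}{n}\, I_n(f_n),
  & F &\in L_0^2(P).
\end{align*}
```

With these conventions one has $L = -\delta D$ on dom$L$, and $L L^{-1} F = F$ for every centered $F \in L^2(P)$.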

Remark 3.4
For instance, Assumption 3.1 is verified whenever each $f_i$ is a bounded function with support in a rectangle of the type $B \times \cdots \times B$, where $\mu(B) < \infty$.
$[2m]$ satisfying the following properties: (a) every block of $\pi$ contains at least two elements, (b) given any two blocks $b_0 \in \pi_0$ and $b_1 \in \pi$, the intersection $b_0 \cap b_1$ contains at most one element, and (c) the diagram $\Gamma(\pi_0, \pi)$, as defined in [24, Section 4.1], is connected in the sense of [24, p. 47]. There are exactly $2^{m-1}$ partitions $\pi \in M_m$ such that $|\pi| = 2$, and for any such partition one has that