
We give new exponential inequalities and Gaussian approximation results for sums of weakly dependent variables. These results lead to generalizations of Bernstein and Hoeffding inequalities, where an extra control term is added; this term contains conditional moments of the variables.


Introduction
In the whole paper $(X_i)_{1\le i\le n}$ is a sequence of centered random variables. Our objective is to give new exponential inequalities and Gaussian approximation results for the sum $S = X_1 + \cdots + X_n$ (and other functions of $(X_1,\dots,X_n)$) in the case where first and second order mixing conditions are assumed (first order mixing conditions involve conditional means and second order ones involve conditional covariances).
The essential application of exponential inequalities is to bound small event probabilities; typically, we would like here to extend the Hoeffding and Bernstein inequalities to mixing processes, in a form where $\rho = 0$ if the variables are independent. We shall obtain here a value of $\rho$ which depends on conditional moments of the variables. We also want to provide inequalities which generalize what is already known for martingales. Actually Equation (2) is not satisfied for a martingale ($E[S^2]$ has to be replaced with a bound on the total variation); this will lead us to two different alternatives: either $v$ is a bound on some kind of quadratic variation and $\rho = 0$ in the case of a martingale (cf. Theorem 4 for a precise statement), or $\rho_1$ is a third order quantity (i.e. if $S$ is replaced with $tS$, $\rho_1$ becomes $t^3\rho_1$) which involves conditional moments and in the independent case is $\rho_1 = \mathrm{Var}(X_k)\,\|X_k\|_\infty + \|X_k^3\|_\infty/2$ (Theorem 7 and Theorem 9). For instance, if we are close to the independent case and $S$ is a normalized sum, that is $X_k = U_k/\sqrt{n}$ where $(U_k)_k$ is a sequence of weakly dependent bounded random variables, $\rho_1$ has order $1/\sqrt{n}$ and the second term in the denominator is residual as long as $A \ll n$.
Bounds like Equation (3) will be obtained through what we call here the first order approach, whereas (4) will require the second order approach. We present the main ideas below.
First order approach. Considering the sequence $X_1,\dots,X_n$ as a time series, for instance a martingale, it is natural to introduce the σ-fields $\mathcal F_k = \sigma(X_1,\dots,X_k)$ (5). It will appear that the remainder $\rho$ will involve essentially the $L^\infty$ norm of the conditional expectations $E[X_j\,|\,\mathcal F_k]$, $k < j$. If the sequence is a random field, this filtration will generally not be very useful because of the arbitrariness of the order on the variables, and we shall need to proceed as follows: to each index $k$ we associate a reordering of the sequence which corresponds (hopefully) to increasing dependence with $X_k$; this brings in a new sequence, depending on $k$: $X^k_j$. More precisely: for any $1 \le k \le n$ we are given a sequence $X^k_j$, $j = 1,\dots,k$, which is a reordering of $(X_j, j = 1,\dots,k)$ with $X^k_k = X_k$, and we attach to each $k$ a family of σ-algebras $(\mathcal F^k_j)_{j\le k}$ satisfying condition (6). If $(X_i)$ is a time series, it is natural to set $X^k_i = X_i$ and $\mathcal F^k_j = \mathcal F_j = \sigma(X_i, i \le j)$: the superscripts can be dropped. Later on, the term "time series" will refer to this situation, whereas the general case will rather be referred to as "random fields".
When dealing with mixing random fields over $\mathbb{Z}^d$, each index $k$ corresponds to some point $P_k$ of the space where $X_k$ sits; for each $k$, the sequence $(X^k_j)$ will typically be obtained by sorting the original sequence $(X_j)_{j\le k}$ in decreasing order of the distance $d(P_j, P_k)$. A simple example is the case of $m$-dependent fields indexed by $\mathbb{Z}^d$, that is, a process $X_a$, $a \in \mathbb{Z}^d$, such that the set of variables $X_A = \{X_a : a \in A\}$ is independent of $X_B$ if the sets $A$ and $B$ are at distance at least $m$; these are typically fields of the form $X_a = h(Y_{a+C})$, where $Y_a$ is an i.i.d. random field, $C$ a finite neighborhood of $0$ in $\mathbb{Z}^d$, and $h$ a measurable function of $|C|$ variables.
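As a concrete toy instance of this construction, here is a minimal sketch of an $m$-dependent field of the form $X_a = h(Y_{a+C})$; the choices $d = 1$, $C = \{0, 1\}$ and $h(u, v) = uv$ are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Toy m-dependent field on Z (d = 1): X_a = h(Y_{a+C}) with C = {0, 1},
# so X_a and X_b are independent as soon as |a - b| >= 2.
rng = np.random.default_rng(0)
n = 1000
Y = rng.standard_normal(n + 1)      # i.i.d. driving field
h = lambda u, v: u * v              # measurable function of |C| = 2 variables
X = h(Y[:-1], Y[1:])                # X_a depends only on (Y_a, Y_{a+1})

# X is centered since E[Y_a Y_{a+1}] = 0; its empirical mean should be small.
print(abs(X.mean()))
```

The natural reordering for such a field sorts the $X_j$, $j \le k$, by decreasing distance $|j - k|$, so the variables genuinely dependent on $X_k$ (here at most two neighbors) come last.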
We would like to point out that this framework covers quite different situations. For instance, in the Erdős–Rényi model of an unoriented random graph with $n$ vertices, edges are represented by i.i.d. Bernoulli variables $Y_{ab}$, $1 \le a < b \le n$, with the convention $Y_{ab} = Y_{ba}$ and $Y_{aa} = 0$ (see [2] and references therein). The number of triangles (for instance) in such a model is $Z = \sum_{\{a,b,c\}} Y_{ab}Y_{bc}Y_{ac}$. The process $X$ is here an $m$-dependent process on the set of three-element subsets of $\{1,\dots,n\}$. We shall treat this example in Section 3.5. As pointed out by Lumley [14], a similar situation appears in the case of U-statistics; we shall however treat them with a martingale method.
Within this framework, we are able to control exponential moments of $S$ with the help of formulas which generalize the Hoeffding and Bernstein inequalities for independent variables (Theorem 2 and Theorem 3). The bound of Theorem 3 involves $v$. This is slightly unsatisfactory, since it is known that the key quantity in the case of a martingale is the quadratic variation $\langle X\rangle$, and in most cases effective bounds will actually involve $\|\langle X\rangle\|_\infty$, which is smaller than $v$. This is corrected in Theorem 1, where we give a result which generalizes what is known from martingale theory and improves on classical papers concerned with mixing [6]. However, inspection of the bounds shows that this improvement is really effective only if the conditional expectations $|E[X_k|\mathcal F^k_i]|$ are significantly smaller than $|X_k|$; if not, the only way to improve accuracy is to use the second order approach of Section 3, briefly discussed below.
Second order results. By this terminology, we mean the following fact: the Hoeffding inequality (Equation (1) with $\rho = 0$), for instance, is obtained from the exponential inequality $E[e^{tS}] \le \exp\big(t^2\sum_i (b_i - a_i)^2/8\big)$. One obvious drawback of this upper bound is that when $t$ tends to 0, it does not look like $1 + t^2 E[S^2]$.
One would rather expect a bound which, as $t \to 0$, behaves like $1 + t^2 E[S^2]$ and thus has more interesting scaling properties; this approach would hopefully lead to significant improvements in a moderate deviation domain. This is what has been done in [5], but there $S$ is an arbitrary function of independent variables, or of a Markov chain. In order to get closer, as in Equations (34) and (41) below, we have to pay with higher order extra terms: the remainder terms will not only contain conditional expectations $E[X_k|\mathcal F^k_i]$ but also conditional covariances; this will force us to consider, for each pair of indices $(i,j)$, another reordering of the sequence which corresponds to increasing dependence with the pair $(X_i, X_j)$, and to introduce σ-fields $\mathcal H^{ij}_k$; we postpone details to Section 3.
In this context we shall give exponential inequalities and Gaussian approximation results; we give in particular a bound for $|E[h(X)] - E[h(N)]|$, where $h$ is any function of $n$ variables with all third derivatives bounded and $N$ is a Gaussian vector with the same covariance matrix as $X$.
The paper is organized as follows. The two forthcoming sections deal with first order and second order exponential inequalities. A classical use of the exponential inequalities leads to Theorem 4, which generalizes the Bernstein and Hoeffding inequalities. An application to concentration inequalities and triangle counts is given in Section 2.2. Section 3 is concerned with the second order approach, with applications to bounded difference inequalities and triangle counts.
In Section 4 we give some estimates under mixing assumptions.

First order approach

General bounds
This section is devoted to bounds for the Laplace transform of S. The corresponding deviation probabilities will be obtained in Section 2.2.1 through classical arguments.
In Theorem 1 we present bounds which generalize known results concerning martingales. Since they only use the linear sequence of σ-fields $\mathcal F_k$, they are essentially interesting for time series.
In Theorem 2 we give a Hoeffding bound which is valid in both cases (time series and random fields), and Theorem 3 gives a Bennett bound for random fields which does not exactly generalize (9), because the quadratic variation $\langle X\rangle$ is changed into the more drastic upper bound $v$.
The applicability of the following theorem depends on the way one can bound the quadratic variations involved. In the forthcoming examples, we shall only consider Equation (9), through a bound on $\|\langle X\rangle\|_\infty$; however Equations (10) and (11) have the advantage of not involving $m$. Theorem 1. We are in the setting described in the introduction, with the filtration defined by (5). The variables $X_k$ are centered. We define $q$, $v$ and $m$, where the notation $x^2_+$ (resp. $x^2_-$) stands for $x^2\,\mathbf 1_{x>0}$ (resp. $x^2\,\mathbf 1_{x<0}$). Then (9), (10) and (11) hold. REMARK. In the martingale case $q = 0$. We recommend [1] for an account of recent work concerning exponential inequalities for martingales.
Proof. Consider a pair of functions $\theta(x)$ and $\psi(x)$ satisfying (12). These functions are meant to be $O(x^2)$ in a neighborhood of 0. Inequality (12) for the three example pairs of functions is proved in Proposition 12 of the Appendix. In the martingale case, the first two terms are zero. The above defined function $\psi$ is convex with $\psi(0) = 0$, $\psi(-1) \le 1$ and $\psi(1) \le 1$; hence $|\psi(x)| \wedge 1 \le |x|$, and therefore the second remainder is bounded accordingly. We then get (13) by induction. This leads to the three bounds by using the three pairs of functions and by noticing that the relevant inequality holds for $m \ge 0$ and $x \le m$, which is a consequence of L'Hospital's rule for monotonicity [?], together with a companion inequality for $x \ge 0$. Theorem 2. Assume that we are in the setting described in the introduction, with a family of σ-fields satisfying (6). The variables $X_k$ are centered. We define now $q$ (this is consistent with the definition in Theorem 1). If the variables are lower and upper bounded with probability one, $a_i \le X_i \le b_i$, then inequality (15) holds. In the martingale case ($\mathcal F^k_i = \mathcal F_i$ and $E[X_i|\mathcal F_{i-1}] = 0$ for all $i$ and $k$), this inequality remains true if we allow $a_i$ and $b_i$ to be $\mathcal F_{i-1}$-measurable random variables.
Proof. We assume first that $a_i$ and $b_i$ are deterministic. We start, as in the proof of the Hoeffding inequality, from the inequality based on bounding the exponential function by the chord over the curve on $[a, a+b]$. It is well known that the first term of the right hand side, $e^c$ on the figure, is smaller than $\exp(b^2/8)$ independently of $a$ (this is a key step in the proof of the Hoeffding inequality, see for instance Appendix B of [?]). On the other hand, it is clear that $c \le a+b$ (see the figure, or bound $e^a$ by $e^{a+b}$ in the expression of $e^c$). Hence, defining $c_i$ and $d_i$ through the corresponding equations, and letting the random variables $T_j$ and $T^k_j$ be defined accordingly, where $(c^k_i)_{i\le k}$ is the corresponding reordering of the sequence $(c_i)_{i\le k}$, we obtain (16). In the martingale case, the term involving $d_n$ vanishes and this equation immediately gives the result. We assume now that we are not necessarily in this case, but that $a_i$ and $b_i$ are deterministic. We can assume in addition, without loss of generality, that $a_i$ and $b_i$ are chosen so that (14) is tight; notice that in this case ($b^n_i$ being the difference between the essential supremum and the essential infimum) we get the corresponding identity. On the other hand, since $a_i \le 0$, Equation (55) in Proposition 13 of the Appendix leads to $d_n e^{-c_n} \le 4$, and (16) finally becomes the desired bound. For any sequence $\alpha \in \{0,1\}^n$, the bound we obtained is obviously still valid for $T_n(\alpha)$, since replacing $X_i$ with $\alpha_i X_i$ does not increase $\rho_n$; we obtain (15) by using that $c_i \le b_i^2/8$ in the expression of $T_n$.
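For orientation, the classical independent case that Theorem 2 generalizes can be checked numerically; the following Monte Carlo sketch (sample sizes, distribution and threshold are arbitrary illustrative choices) compares the empirical tail of a sum of independent bounded centered variables with the Hoeffding bound $\exp(-2A^2/\sum_i(b_i-a_i)^2)$:

```python
import numpy as np

# Monte Carlo check of the classical Hoeffding bound
# P(S >= A) <= exp(-2 A^2 / sum_i (b_i - a_i)^2)
# for independent centered X_i uniform on [-1/2, 1/2] (so b_i - a_i = 1);
# the theorems above add the correction q for dependent variables.
rng = np.random.default_rng(1)
n, trials, A = 50, 20000, 5.0
X = rng.uniform(-0.5, 0.5, size=(trials, n))
S = X.sum(axis=1)
empirical = (S >= A).mean()
bound = np.exp(-2 * A**2 / n)   # here sum_i (b_i - a_i)^2 = n
print(empirical, bound)
```

The empirical tail is far below the bound here, as expected for a sum whose standard deviation ($\sqrt{n/12} \approx 2$) is small compared to $A$.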
Theorem 3. Assume that we are in the setting described in the introduction, with a family of σ-fields satisfying (6). The variables $X_k$ are centered. We define $m$ as in Theorem 1, and $q$ as in Theorem 2. Then the bound holds for any $t > 0$. Proof. We set $S_0 = 0$. Equation (13) implies the corresponding recursion, where $q_n$ and $v_n$ are the terms corresponding to $k = n$ in the definitions of $q$ and $v$. This proves the result by induction.

Deviation bounds
In this section we give the deviation inequalities that can be deduced from the preceding exponential inequalities. We generalize the Bernstein inequality in Equations (18) and (22), and the Hoeffding inequality in Equation (21); one could get Bennett inequalities through a similar process, we refer to Appendix B of [?]. In the martingale case, Equations (19) and (20) do not assume that the variables are bounded, but sums of squares are involved.
Theorem 4. With the notations of Theorem 1 we have (18), (19) and (20) for any $A, y > 0$. With the notations of Theorems 2 and 3 we have (21) and (22) for any $A, y > 0$. In the martingale case, (21) remains true if we allow $a_i$ and $b_i$ to be $\mathcal F_{i-1}$-measurable random variables.
Proof. Applying the bound (9) to the variables $tX_i$ for some $t > 0$, we get the corresponding Laplace bound. The optimization of this expression w.r.t. $t \ge 0$ is classical in the theory of Bennett and Bernstein inequalities and delivers (18); see for instance Appendix B of [?]. The second inequality is deduced from (10) with the same method, taking $t = A/(y + 8q)$.

Bounded difference inequalities
The above results lead straightforwardly to bounded difference inequalities by using a classical martingale argument of Maurey. Theorem 5. Let $Y'_k$ be an independent copy of $Y_k$; we assume the measurability of $\Phi_k$. Then (24) holds for any $A, y > 0$, and in particular (25). REMARK. Let us mention that if $f$ has the form $f(Y) = \sup_{g\in\Gamma} g(Y)$ for some finite class of functions $\Gamma$, then, with obvious notations, an analogous bound holds. This is a classical argument in the theory of concentration inequalities.
Proof. We shall utilize (21) and (18). We have already pointed out that $q = 0$ since $X_k$ is a martingale difference. Let us define the random variables $U_k$ and $L_k$; since $U_k - L_k = \Phi_k$, we can apply (21) with $b_k = \Phi_k$ and get (24).
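In the independent case the argument above reduces to the classical bounded difference (McDiarmid) inequality: if changing $Y_k$ moves $f$ by at most $\delta_k$, then $P(f(Y) - Ef(Y) \ge A) \le \exp(-2A^2/\sum_k \delta_k^2)$. A quick numerical sanity check, with arbitrary illustrative choices of $f$ and the distribution:

```python
import numpy as np

# McDiarmid's inequality for f(Y) = sum of n i.i.d. Bernoulli(1/2) bits:
# flipping one bit changes f by delta_k = 1, and E[f] = n/2, so
# P(f - n/2 >= A) <= exp(-2 A^2 / n).
rng = np.random.default_rng(2)
n, trials, A = 100, 10000, 8.0
Y = rng.integers(0, 2, size=(trials, n))
f = Y.sum(axis=1)
empirical = (f - n / 2 >= A).mean()
bound = np.exp(-2 * A**2 / n)
print(empirical, bound)
```

The theorems of this section replace the independence assumption with the control terms $q$ and $\rho$ built from conditional moments.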

Inequalities for suprema of U-statistics
For some problems of adaptive estimation and testing, it is very important to be able to control the supremum of U-statistics [9]. We give here a bound in this direction.
Consider a sequence of i.i.d. random variables $Y_1,\dots,Y_n$ with values in some measurable space $E$ and a finite family $\mathcal H$ of measurable symmetric functions on $E^d$, and set for $h \in \mathcal H$ the corresponding statistic $Z_h$, where the sum is restricted to the subsets of cardinality $d$; since the kernel $h$ is symmetric, there is no ambiguity in the notation $h(Y_A)$. We assume that $h$ is centered; then the variance of $Z_h$ has order $n^{-1}$, cf. [13] p. 12. We give in the following corollary a deviation bound for $S$, valid for any $A > 0$, which corresponds to a Gaussian approximation with variance of the same order of magnitude. Proof. With the notation of Theorem 5, the result is a direct consequence of Theorem 5.
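To see the variance scaling mentioned above in a toy case, the following sketch computes a U-statistic with $d = 2$ and the arbitrarily chosen symmetric centered kernel $h(x,y) = (x+y)/2$ on standard normal data; for this particular kernel $Z_h$ collapses to the sample mean, so $\mathrm{Var}(Z_h) = 1/n$ exactly:

```python
import numpy as np
from itertools import combinations

# Toy U-statistic with d = 2: Z_h = average of h(Y_i, Y_j) over all C(n,2)
# pairs, with the centered symmetric kernel h(x, y) = (x + y) / 2.
rng = np.random.default_rng(4)
n, trials = 20, 2000
Y = rng.standard_normal((trials, n))
pairs = list(combinations(range(n), 2))
# For this kernel Z_h equals the sample mean, hence Var(Z_h) = 1/n.
Z = np.mean([(Y[:, i] + Y[:, j]) / 2 for i, j in pairs], axis=0)
print(Z.var(), 1 / n)
```

The empirical variance matches the $n^{-1}$ order claimed for centered non-degenerate kernels.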

Second order approach
As mentioned in the introduction, in order to get better results we shall have to control conditional expectations of products $X_kX_j$; actually, our procedure will rather lead to products like $X_kX^k_j$; hence we are led to introduce, for any pair $(k,j)$, a sequence $X^{k,j}_i$ corresponding to increasing dependence with $(X_k, X^k_j)$. More precisely: INHOMOGENEOUS SETTING. (i) For any $1 \le k \le n$ we are given a sequence $X^k_j$, $j = 1,\dots,k$, which is a reordering of $(X_j, j = 1,\dots,k)$ with $X^k_k = X_k$, and we attach to each $k$ the corresponding σ-algebras. (ii) The sequences $(X^{k,j}_i)_{i<j}$ and the σ-algebras $(\mathcal H^{k,j}_i)_{i<j}$ are defined as above from the sequence $(X^k_i)_{i<j}$.
In other words, the σ-field $\mathcal F^k_j$ is obtained by taking off from $\mathcal F^k_{j+1}$ the "closest" variable to $X_k$. The σ-field $\mathcal H^{k,j}_i$ is obtained by continuing this process after $\mathcal F^k_j$ (hence $i < j$), in a way which may depend on $k$ and $j$.
This setup is essentially, in a somewhat more general context, the one considered in [6]. For time series, we have $X^k_i = X_i$. It happens commonly in the theory of random fields that there is no natural order on the variables (this is however typically not the case if there is a time variable). In those cases, instead of building the $\mathcal F^k_i$ by reordering the variables which are "before" $X_k$, it is not more restrictive to assume that for any $k$ there exists a reordering of all variables for which the conditional expectations decrease rapidly enough. In this case the $\mathcal F^k_j$ will be replaced by the $\mathcal G^k_j$ defined as follows. HOMOGENEOUS SETTING. (i) For any $1 \le k \le n$ we are given a sequence $X^k_j$, $j = 1,\dots,n$, which is a reordering of $(X_j, j = 1,\dots,n)$ with $X^k_n = X_k$, and we attach to each $k$ the σ-algebras $\mathcal G^k_j$. (ii) For any $1 \le j, k \le n$ we are given a sequence $(X^{k,j}_i)_{i<j}$, which is a reordering of $(X^k_i)_{i<j}$, and we attach to each such pair $(k,j)$ the σ-algebras $\mathcal H^{k,j}_i$. This setting is adequate for dealing with mixing random fields, in which case each index $k$ corresponds to some point $P_k$ of the space; for each $k$, the sequence $(X^k_j)_j$ will be obtained by sorting the original sequence $(X_j)$ in decreasing order of the distance $d(P_j, P_k)$, and $\mathcal G^k_j$ will be the σ-field generated by the random variables sitting on the $j$ most distant points from $P_k$, say $P_{k_1},\dots,P_{k_j}$; it is natural to define $X^{k,j}_i$ as $X_l$ where $P_l$ is the $i$-th most distant point from $\{P_k, P_{k_j}\}$, but the choice $\mathcal H^{k,j}_i = \mathcal G^k_i$ is typically good enough for random fields over the Euclidean space (see Section 4).
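The distance based reordering described above can be sketched in a few lines; the point configuration and the Euclidean metric here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# For each site P_k, sort all sites in decreasing order of distance to P_k,
# so that X^k_j with small j are the "most independent" variables and
# X^k_n = X_k (P_k is at distance 0 from itself, hence comes last).
rng = np.random.default_rng(6)
P = rng.random((8, 2))                    # 8 sites in the unit square
k = 3
dist = np.linalg.norm(P - P[k], axis=1)   # distances d(P_j, P_k)
order = np.argsort(-dist)                 # indices in decreasing distance
print(order)
```

The σ-field $\mathcal G^k_j$ then corresponds to the first $j$ entries of `order`, i.e. the $j$ most distant sites.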

The homogeneous setting
The following theorem states a Gaussian approximation result (33) and an exponential bound (34). Equation (33) may seem rather overtechnical; the reader can think first of the case $p = 1$, $q = \infty$, which is natural if $h$ has bounded third derivatives; however the proof of (34) uses the case $p = \infty$, $q = 1$.
Theorem 7. Let $h$ be a three times differentiable function of $n$ variables, and let $N$ be a centered Gaussian vector, independent of $X = (X_1,\dots,X_n)$, with the same covariance matrix. Then (33) holds for any $p$, where $q = p/(p-1)$ and $\mathcal X$ is the set of processes $Y_i = \alpha_i X_i$ where $\alpha$ is allowed to be any element of $[0,1]^n$.
The sum $S = \sum_k X_k$ satisfies (34) for any $t > 0$. Hence, under strong enough mixing assumptions, $r_p/\sigma^2_*$ is expected to tend to 0 as the number of variables $n$ tends to infinity; in this case, (33) leads in particular to the central limit theorem for $S/\sigma_*$.
For the sake of clarity, we shall start the proof with a preparatory lemma. We know that if $(Y, Z)$ is a Gaussian vector and $Y$ is scalar and centered, then for any differentiable function $g$ one has, under appropriate integrability conditions, $E[Yg(Z)] = \sum_j \mathrm{Cov}(Y, Z_j)\,E[\partial_j g(Z)]$. We shall see that, when the variables are not Gaussian, bounds on the difference between both sides when $Y$ varies among the coordinates of $Z$ provide an efficient measure, in some sense, of how far $Z$ is from a Gaussian vector with the same covariance (this will appear in Equation (37) below).
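The Gaussian integration by parts identity invoked here can be checked numerically in dimension one, where it reads $E[Yg(Y)] = \mathrm{Var}(Y)\,E[g'(Y)]$; the following Monte Carlo sketch uses the arbitrary choices $g = \sin$ and $\sigma = 1.5$:

```python
import numpy as np

# Stein / integration-by-parts identity for a centered Gaussian Y:
# E[Y g(Y)] = Var(Y) E[g'(Y)].  With g = sin, g' = cos, and Y ~ N(0, sigma^2),
# both sides equal sigma^2 exp(-sigma^2 / 2).
rng = np.random.default_rng(3)
sigma = 1.5
Y = sigma * rng.standard_normal(1_000_000)
lhs = np.mean(Y * np.sin(Y))
rhs = sigma**2 * np.mean(np.cos(Y))
print(lhs, rhs)
```

For a non-Gaussian $Y$, the gap between the two sides is exactly the kind of quantity that Lemma 8 below is designed to bound.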
The next lemma provides a way to estimate such a bound. Lemma 8. Let $Y$ be a centered random variable and $Z$ a random vector on $\mathbb R^n$. For any $1 \le j \le n$ we are given a sequence $(Z^j_i)_{i<j}$ which is a reordering of $(Z_i)_{i<j}$. Let $\mathcal A_j$ and $\mathcal B^j_i$, $i < j$, be σ-algebras such that $\mathcal A_j \supset \sigma(Z_1,\dots,Z_j)$. Then bound (35) holds for any function $g$ twice differentiable on $\mathbb R^n$, with first (resp. second) order derivatives denoted by $g'_j$ (resp. $g''_{ij}$), where the supremum over $\alpha$ is taken over all sequences of $[0,1]^n$ having no more than one term different from 0 or 1.
Proof. We set $\bar Z_j = (Z_1,\dots,Z_j,0,\dots,0)$, and $\bar Z^j_i$ is defined by setting to zero all entries of $\bar Z_j$ which are not a $Z^j_l$ for some $l \le i$; hence $\bar Z^j_{j-1} = \bar Z_{j-1}$ and $\bar Z^j_0 = 0$. The key observation is that, if all terms of the sums below are integrable, both sides coincide. Indeed, for any $j$, the factors of $E[YZ_j]$ are the same on both sides, except a residual $E[YZ_j]\,g'_j(0)$ on the right hand side, which compensates for a remaining term from the otherwise vanishing sum of the $E[YZ_j g'_j(\bar Z^j_i)]$ (namely the term $i = 0$); finally the terms $Yg(\bar Z_j)$ of the right hand side sum up as required. The general identities imply ($e_i$ being the $i$th vector of the canonical basis of $\mathbb R^n$) the corresponding representation; hence, by the Minkowski inequality, (35) is now just a consequence of (36) and the Hölder inequality.
Proof of Theorem 7. Set, for $0 \le t \le 1$, the interpolation between $X$ and $N$. Using that for any differentiable function $f$ with bounded derivatives the corresponding identity holds, and applying Lemma 8, we obtain the bound where $\mathcal X$ is the family of processes of the form $Y_i = \alpha_i X_i$, $\alpha \in [0,1]^n$. This leads to (33).
Concerning (34), notice first that an elementary scaling argument reduces the proof to the case $t = 1$. We consider now the function $h(x) = \exp(\sum_i x_i)$; Equation (37) then rewrites, by application of Lemma 8, since $h'''_{ijk} = h$. Hence, if we set the relevant quantities (the reason for the choice of $\sigma_*$ rather than $\sigma$ will appear soon), Equation (38) implies (39). The variance of any $Y \in \mathcal X$ is smaller than $\sigma^2_*$, due to the fact that a convex function on $[0,1]^n$ necessarily attains its maximum at an extremal point. Hence (39) remains valid if instead of $X$ we put any $Y \in \mathcal X$ in the definition of $\varphi$, since $r_\infty$ and $\sigma_*$ would be smaller; this implies, via the Gronwall lemma, the desired estimate. If we set $t = 1$ in this equation, we get (34).

The inhomogeneous setting
In this section we give a companion to Theorem 7 in the inhomogeneous case. Theorem 9. With the quantities defined below, (40) and (41) hold. REMARK. In the case of a martingale, the expression simplifies. Proof. Let $\lambda \in \mathbb R$. Set $S(t) = S_{n-1} + tX_n$ and $\varphi(t) = E[\exp(\lambda S(t))]$.
The derivative of $\varphi$ satisfies (42), where, thanks to Lemma 8, $\mathcal X$ is the family of processes of the form $Y_i = \alpha_i X_i$ with $\alpha$ any decreasing sequence of $[0,1]^n$ having no more than one term different from 0 or 1, and $w^n_q$ is the term corresponding to $k = n$ in the expression of $w_q$. Integrating (42), we get a bound which implies (40) by induction on $n$. Now, for a fixed real $\lambda$, define $\varphi^*$ accordingly. Equations (43) and (44) with $p = 1$ imply, for $t \ge s$,
$$\varphi(t) \le \varphi^*(0)\,e^{\lambda^2(u(t)-u(0))} + |\lambda|^3 w^n_\infty \int_0^t e^{\lambda^2(u(t)-u(s))}\,\varphi^*(s)\,ds.$$
For any $Y \in \mathcal X$, the same bound holds if $X$ is replaced by $Y$ in the definition of $\varphi$, since the corresponding values of $w^n_\infty$ and $u(t) - u(s)$ will be smaller (either $\alpha_n = 0$ and the corresponding value of $u(t) - u(s)$ decreases), and by Gronwall's Lemma this proves (41) by induction on $n$.

Applications to deviation bounds
In this section we give the deviation inequalities that can be deduced from the preceding exponential inequalities. The bound (46) is analogous to the bound of Corollary 3(b) in [6], p. 85, but there the variance $v$ is amplified by an extra factor $2e^2$, and $w_\infty$ takes a slightly different value.
Theorem 10. With the notations of Theorem 7 and Theorem 9, we have (45) and (46) for any $A > 0$. Proof. The first bound follows from (34). Equation (46) is obtained similarly from (41).

More on bounded difference inequalities
We give a second order variant of Equation (26).
Theorem 11. With the notation of Theorem 5, the following inequality holds true. Proof. As in the proof of Theorem 5, we notice that $S$ is the sum of martingale increments, but now we use Equation (46). As pointed out right after Theorem 9, the expression simplifies in the martingale case, and we conclude by noticing that $\|X_k^3\|_\infty \le \delta_k$.

Triangle counts
We shall show that the standard Gaussian approximation is asymptotically valid for triangle counts in the moderate deviation domain.
In the Erdős–Rényi model of an unoriented random graph with $n$ vertices, edges are represented by $\binom n2$ i.i.d. Bernoulli variables $Y_{ab}$, $1 \le a < b \le n$, with the convention $Y_{ab} = Y_{ba}$ and $Y_{aa} = 0$. We set $p = E[Y_{12}]$, $X_{abc} = Y_{ab}Y_{bc}Y_{ac} - p^3$ and $S = \sum_{\{a,b,c\}\in T} X_{abc}$, where $T$ is the set of subsets of $\{1,\dots,n\}$ with three elements, $|T| = \binom n3$. Recall that $r_\infty$ rewrites with these notations. For any $\tau = \{a,b,c\}$, define $A_\tau$ as the set of elements of $T$ having at least two points in common with $\tau$ (namely $\tau$, $\{a,b,d\}$, $\{a,d,c\}$, ...); this makes $n_1 = 1 + 3(n-3)$ elements, and $(X_\sigma)_{\sigma\notin A_\tau}$ is independent of $X_\tau$. We define the ordering $X^\tau_j$ by taking first ($j$ large) those $X_\sigma$ for which $\sigma \in A_\tau$, and define $\mathcal G^\tau_j$ according to (31). Since $\mathcal G^\tau_j$ is independent of $X_\tau$ for $j \le |T| - n_1$, the first term in $r_p$ will be 0 unless $j > |T| - n_1$; for these $j$, the second term vanishes, independently of the construction of $\mathcal H^{\tau,j}_i$. We consider now the case $j > |T| - n_1$. In this case, and if $j \neq |T|$, $X^\tau_j = X_{\tau_j}$ for some $\tau_j$ which has two points in common with $\tau$, for instance $\{a,b,d\}$. We define the σ-fields $\mathcal H^{\tau,j}_i$ by excluding first the $\sigma$ which have at least two points in common with $\tau$ or $\tau_j$; this contributes at most $n_1$ non-zero terms in the middle sum, and Equation (45) rewrites accordingly. In this case it is easily verified that $\sigma^2_* = \mathrm{Var}(S)$, since the covariances are non-negative. Let us recall (see [2]) that $\mathrm{Var}(S)$ has order $n^4$ due to the covariance terms. Let us briefly compare with the bound of [2]; that paper delivers a bound for $P(S \ge A)$ which is slightly larger than the exponential bound (49) (the actual formula is much more complicated; we have used that $\min(a^{-1}, b^{-1}) \le 2/(a+b)$ to obtain this from Theorem 18 of [2]). One has $6nE[Z] \ge \frac{2\sigma^2_*}{p^2 - p^3}$. For $p$ fixed and $n$ large, the square root term in (48) is residual if $A \ll n^3$; this is the moderate deviation case, since the centering term in (47) has order $n^3$ (notice that $S \le n^3$ w.p. 1), and we get the right variance. In (49), a change occurs when $n^{5/2} \ll A$, and if we set $A = Bn^{5/2}$ with $B$ large, (49) leads to $\exp(-cnB)$ while (47) behaves like $\exp(-cnB^2)$.
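A Monte Carlo sketch of this triangle count model (with $n$, $p$ and the number of trials chosen arbitrarily for illustration) confirms the elementary identity $E[Z] = \binom n3 p^3$:

```python
import numpy as np
from itertools import combinations

# Triangle count Z = sum_{{a,b,c}} Y_ab Y_bc Y_ac in G(n, p);
# its mean is C(n, 3) p^3.
rng = np.random.default_rng(5)
n, p, trials = 10, 0.3, 2000
counts = []
for _ in range(trials):
    A = rng.random((n, n)) < p
    A = np.triu(A, 1)
    A = A | A.T                  # symmetric adjacency with zero diagonal
    counts.append(sum(A[a, b] and A[b, c] and A[a, c]
                      for a, b, c in combinations(range(n), 3)))
expected = (n * (n - 1) * (n - 2) / 6) * p**3   # C(10, 3) * 0.3^3 = 3.24
print(np.mean(counts), expected)
```

The fluctuations of $Z$ around this mean are what the deviation bounds of this section control, with the covariance between triangles sharing an edge driving the $n^4$ order of the variance.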

Evaluation of constants under mixing assumptions
We give here, informally, some arguments to convince the reader that under standard ϕ-mixing assumptions the constant $q$ has the same order as the variance of the sum, and that $w_\infty$ and $r_\infty$ will be small; under β-mixing assumptions one can control only $r_s$ for $s < \infty$, and $w_1$.

ϕ-mixing
The ϕ-mixing coefficient between two σ-fields $\mathcal A$ and $\mathcal B$ is defined as usual. It is well known, see reference [?] p. 27 or [12] p. 278, that this implies the corresponding covariance bound when $Z$ is a zero-mean $\mathcal B$-measurable random variable. Assume that $X$ is a field over a part of $\mathbb Z^d$: each variable $X_k$ of the field sits on some $P_k \in \mathbb Z^d$. $\mathcal G^k_j$ is the σ-field generated by the $j$ most distant points from $P_k$, and we take for simplicity $\mathcal H^{k,j}_i = \mathcal G^k_i$. Notice that the distance of $P_k$ to $P_{k_j}$, the $j$th closest point, is at least $cj^{1/d}$ for some constant $c$. This implies that standard ϕ-mixing assumptions between $\sigma(X_k)$ and $\mathcal G^k_j$ rewrite in terms of some decreasing function $\varphi_{\infty,1}$ (for example, exponential decay holds for finite range shift-invariant Gibbs random fields [10] pp. 158–159; this covers a lot of examples). The subscripts $\infty$ and 1 on $\varphi$ mean that there is no restriction on the number of random variables contained in the first σ-field, $\mathcal G^k_j$, while there is only one variable in the second, $\sigma(X_k)$; we use this traditional notation, in particular for compatibility with [6].
On the other hand, for $i < j$, $j - i$ is smaller than the number of points in the annulus $\{x : \|P_{k_j} - P_k\| \le \|x - P_k\| \le \|P_{k_i} - P_k\|\}$; in particular, for some $c$, $j - i \le c\,\|P_{k_i} - P_k\|^{d-1}\,\|P_{k_j} - P_{k_i}\|$.
Hence $\|P_{k_j} - P_{k_i}\|$ is at least $c(j-i)(n-i)^{1/d-1}$ for some $c$. This implies that standard ϕ-mixing assumptions between $\sigma(X_k, X_{k_j})$ and $\mathcal G^k_i$ rewrite in terms of some decreasing function $\varphi_{\infty,2}$.
Equations (50) and (51) will imply, for $i \le j \le k$ and any measurable bounded functions $f$ and $g$, the corresponding covariance bounds. The first equation leads to a bound with $m = \sup_i \|X_i\|_\infty$, and the second to a bound involving $\varphi_{\infty,1}((n-j)^{1/d})$.
We get, with $m_1 = \sup_i \|X_i\|_\infty$ and setting $\varphi(x) = \varphi([x])$, the corresponding estimate. The integral appearing there is essentially the quantity $B(\varphi)$ of [6], and Equations (21) and (22) may be seen as improvements over (b)(i) and (ii) of Corollary 4 of [6]. We get an analogous estimate for $r_\infty$, and the same estimates can be obtained for $w_\infty$.

β-mixing
ϕ-mixing is not always a realistic assumption: for a Markov chain, ϕ-mixing typically implies a Doeblin condition; it is satisfied for an ergodic finite state Markov chain, but on the other hand, a non-trivial Gaussian autoregressive process is not ϕ-mixing.
β-mixing is a much more satisfactory measure of dependence, cf. [3]. The β-mixing coefficient between two σ-fields $\mathcal A$ and $\mathcal B$ represents the total variation distance between the actual measure on $\mathcal A\otimes\mathcal B$ and the independent one (product of the marginal measures). A non-singular autoregressive process, as most Markov chains, is β-mixing with exponentially decreasing coefficients [3]. If $X$ is $\mathcal A$-measurable and $\beta(\mathcal A,\mathcal B)$ is the β-mixing coefficient of these σ-fields, one has a covariance bound involving the norm $\|X\|_s$ (elementary consequence of Theorem 1.4(a) of [?]). This implies that $w_1$, as well as $r_s$ for $s < \infty$, can be bounded as above using the β-mixing coefficients.
