Fluctuations of the Gromov-Prohorov sample model

In this paper, we study the fluctuations of observables of metric measure spaces which are random discrete approximations $X_n$ of a fixed arbitrary (complete, separable) metric measure space $X=(\mathcal{X},d,\mu)$. These observables $\Phi(X_n)$ are polynomials in the sense of Greven-Pfaffelhuber-Winter, and we show that for a generic model space $X$, they yield asymptotically normal random variables. However, if $X$ is a compact homogeneous space, then the fluctuations of the observables are much smaller, and after an adequate rescaling, they converge towards probability distributions which are not Gaussian. Conversely, we prove that if all the fluctuations of the observables $\Phi(X_n)$ are smaller than in the generic case, then the metric measure space $X$ is compact homogeneous. The proofs of these results rely on the Gromov reconstruction principle, and on an adaptation of the method of cumulants and mod-Gaussian convergence developed by F\'eray-M\'eliot-Nikeghbali.


Introduction
Let $X = (\mathcal X, d, \mu)$ be a metric space which we assume to be complete, separable and equipped with a probability measure $\mu$ over the Borel $\sigma$-algebra of $\mathcal X$; and let $(X_n)_{n\in\mathbb N}$ be a sequence of independent random variables with the same law $\mu$. We study here the approximation of $X = (\mathcal X, d, \mu)$ by the random discrete metric space $X_n = \big(\{X_1,\ldots,X_n\},\, d,\, \frac{1}{n}\sum_{i=1}^n \delta_{X_i}\big)$ for the Gromov-weak topology; we call this discrete approximation the Gromov-Prohorov random sample model. The Gromov-weak topology is based on the idea that a sequence of metric measure spaces converges if and only if all finite subspaces sampled from these spaces converge. This is formalized by using real-valued observables called polynomials, introduced by Greven, Pfaffelhuber and Winter in [GPW09]: they are the functions $\Phi$ defined by
$$\Phi((\mathcal X, d, \mu)) = \int_{\mathcal X^p} \varphi\big((d(x_i,x_j))_{1\le i<j\le p}\big)\, \mu(dx_1)\cdots\mu(dx_p),$$
where $\varphi : \mathbb R^{\binom{p}{2}} \to \mathbb R$ is an arbitrary continuous bounded function. By using the theorem of convergence of empirical measures (see [Var58, Theorem 3]), one readily proves the almost sure convergence of $X_n$ toward $X$ (see Theorem 2.6). In this paper, we will study the fluctuations of the polynomials $\Phi(X_n)$ with respect to their limits $\Phi(X)$. The evaluation of a polynomial $\Phi$ on the space $X_n$ is a sum of dependent random variables:
$$\Phi(X_n) = \frac{1}{n^p} \sum_{\bar\imath \in [[1,n]]^p} \varphi(d(X_{\bar\imath})),$$
where we abbreviate $\varphi(d(X_{\bar\imath})) := \varphi\big((d(X_{i_a}, X_{i_b}))_{1\le a<b\le p}\big)$ for a sequence of indices $\bar\imath = (i_1, \ldots, i_p)$. This dependency between the random variables is sparse: if we associate to these variables a graph describing the dependency between those variables, then when $n$ goes to infinity the maximal degree of a vertex of this graph becomes negligible against the number of vertices (variables). This sparse dependency leads to central limit theorems, but the limiting distribution is not necessarily Gaussian, and it depends on the size of the variance of $\Phi(X_n)$, for which there are two cases.
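As an illustration (not taken from the paper), the evaluation of a polynomial observable on a finite sample can be sketched in a few lines of Python; the names `polynomial_observable`, `dist` and `phi` are ours.

```python
import itertools

def polynomial_observable(points, dist, p, phi):
    # Phi_{p,phi}(X_n): average of phi over all p-tuples of sample points,
    # where phi receives the vector of pairwise distances (d(x_{i_a}, x_{i_b}))_{a<b}.
    n = len(points)
    total = 0.0
    for tup in itertools.product(range(n), repeat=p):
        dists = [dist(points[tup[a]], points[tup[b]])
                 for a in range(p) for b in range(a + 1, p)]
        total += phi(dists)
    return total / n ** p

# Mean pairwise distance (p = 2, phi = first coordinate) of two points at distance 1:
val = polynomial_observable([0.0, 1.0], lambda x, y: abs(x - y), 2, lambda d: d[0])
# -> 0.5, since half of the ordered pairs (i, j) have i = j
```

The cost is $O(n^p)$, which is only meant for tiny experiments; the paper works with the exact integral $\Phi(X)$ and with the rescaled sum $S_n(\varphi, X) = n^p\,\Phi(X_n)$.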
We shall see that the variance $\mathrm{var}(S_n(\varphi, X))$, with $S_n(\varphi, X) = n^p\,\Phi(X_n)$, is a polynomial in the variable $n$ with coefficients depending on the function $\varphi$ and the space $X$; this variance is at most of order $n^{2p-1}$, and therefore $\mathrm{var}(\Phi(X_n))$ is of order at most $1/n$.
• In a first part, we study the case where the variance of $\Phi(X_n)$ is of order exactly $1/n$. We call this setting the generic case, and it corresponds to fluctuations which are asymptotically normal. We study the combinatorics of the cumulants of the variable $S_n(\varphi, X)$ by using the theory of dependency graphs and mod-Gaussian convergence developed recently by Féray, Méliot and Nikeghbali (see [FMN16]); and we prove the mod-Gaussian convergence of the sequence $S_n(\varphi, X)$ adequately renormalized. This leads to a central limit theorem for the renormalised variables $Y_n(\varphi, X)$; the limiting distribution is the standard Gaussian distribution, and we also obtain the normality zone of this approximation, moderate deviation estimates and a Berry-Esseen inequality (Theorem 4.4). In [FMN17], similar techniques were used in the study of the fluctuations of observables of random graph, random permutation and random integer partition models, parametrised respectively by the space of graphons, the space of permutons and the Thoma simplex.
• In a second part, we study the case where the variance of $\Phi(X_n)$ is at most of order $1/n^2$ for any polynomial $\Phi$. We call this setting a globally singular point $X$ of the Gromov-Prohorov sample model. It corresponds to the following condition: for any $p \ge 1$ and any $\varphi \in C_b(\mathbb R^{\binom p2})$,
$$\sum_{1\le i,j\le p} \mathrm{cov}\Big(\varphi\big(d(X_1, \ldots, X_i, \ldots, X_p)\big),\; \varphi\big(d(X'_1, \ldots, X'_{j-1}, X_i, X'_{j+1}, \ldots, X'_p)\big)\Big) = 0,$$
where $(X'_n)_{n\in\mathbb N}$ is an independent copy of $(X_n)_{n\in\mathbb N}$, and where in each summand the second vector contains all the variables $X'_1, \ldots, X'_p$, except $X'_j$ which is replaced by $X_i$. This identity is difficult to analyse: therefore, we shall study the simpler case where each of the covariances in the sum vanishes. In particular,
$$\mathrm{cov}\Big(\varphi\big(d(X_1, X_2, \ldots, X_p)\big),\; \varphi\big(d(X_1, X'_2, \ldots, X'_p)\big)\Big) = 0.$$
It turns out that this second identity is equivalent to $X$ being a compact homogeneous space (in the space of metric measure spaces); see Theorem 5.1. We are thus able to relate a probabilistic condition to a geometric condition on the space; this result is a bit surprising, and for instance it ensures that when approximating an ellipse and a circle by the Gromov-Prohorov sample model, the convergence is much faster for the circle and does not have the same kind of asymptotic fluctuations. The proof of the equivalence relies notably on Gromov's reconstruction theorem [Gro07]. Now, in this situation, we cannot directly use the theory of mod-Gaussian convergence and dependency graphs in order to prove all the probabilistic results that we obtained in the generic case. However, by using the symmetry of the space, we are able to obtain for this singular case a better upper bound on the cumulants. It allows us to prove a central limit theorem for the random variables $Y_n(\varphi, X)$, but the limit is not necessarily the Gaussian distribution; see Theorem 5.7.
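The circle-versus-ellipse phenomenon mentioned above can be observed numerically. The following Monte Carlo sketch is our own illustration (the uniform-angle parametrisation of the ellipse and the choice $p = 2$, $\varphi = \mathrm{id}$ are arbitrary); already for moderate sample sizes, the variance of the mean pairwise distance is markedly smaller for the circle (homogeneous, order $1/n^2$) than for the ellipse (generic, order $1/n$).

```python
import math
import random

def sample_space(n, curve):
    # curve maps an angle to a point of R^2; uniform angles give the sample model
    return [curve(random.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]

def mean_pairwise_distance(pts):
    # Phi_{2,phi}(X_n) with phi = identity: (1/n^2) * sum_{i,j} |x_i - x_j|
    n = len(pts)
    return sum(math.dist(a, b) for a in pts for b in pts) / n ** 2

def var_estimate(curve, n, trials):
    vals = [mean_pairwise_distance(sample_space(n, curve)) for _ in range(trials)]
    m = sum(vals) / trials
    return sum((v - m) ** 2 for v in vals) / trials

random.seed(0)
circle = lambda t: (math.cos(t), math.sin(t))
ellipse = lambda t: (2.0 * math.cos(t), math.sin(t))
v_circle = var_estimate(circle, 40, 500)
v_ellipse = var_estimate(ellipse, 40, 500)
```

With these parameters, `v_circle` comes out roughly an order of magnitude below `v_ellipse`, in line with the $1/n^2$ versus $1/n$ dichotomy.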
The paper is organized as follows. In Section 2, we recall some definitions and facts about metric measure spaces. Section 3 introduces the method of cumulants, the theory of dependency graphs and all the probabilistic results that one can obtain from this method. In Section 4, we apply this theory to the generic case of the random sample model to get several probabilistic results about the model, including a central limit theorem, the normality zone, moderate deviations and a Berry-Esseen bound for the random variables $Y_n(\varphi, X)$.
Section 5 details the singular case, and we prove the equivalence between having a small variance for the model and $X$ being a compact homogeneous space. We also obtain in this case a finer bound on the cumulants, a non-Gaussian central limit theorem for the observables $\Phi(X_n)$, and concentration inequalities for these random variables. Finally, we provide an explicit counterexample to the asymptotic normality of observables of the sample model of a homogeneous space in Section 6.

Metric measure spaces
In this section, we recall the theory of metric measure spaces and of the Gromov-Prohorov topology, following very closely [GPW09, Section 2].
2.1. Definitions. For any topological space $\mathcal X$, we denote $C_b(\mathcal X)$ the set of continuous bounded functions $\mathcal X \to \mathbb R$; $C(\mathcal X)$ the set of continuous functions $\mathcal X \to \mathbb R$; $\mathcal B(\mathcal X)$ the set of Borel subsets of $\mathcal X$; and $\mathcal M^1(\mathcal X)$ the set of Borel probability measures on $\mathcal X$. A measurable map $f : \mathcal X \to \mathcal Y$ between two topological spaces induces a push-forward map $f_* : \mathcal M^1(\mathcal X) \to \mathcal M^1(\mathcal Y)$, defined by $f_*\mu(B) = \mu(f^{-1}(B))$ for $B \in \mathcal B(\mathcal Y)$.

Definition 2.1. A metric measure space is a complete and separable metric space $(\mathcal X, d)$ which is endowed with a probability measure $\mu \in \mathcal M^1(\mathcal X)$. We say that two metric measure spaces $(\mathcal X, d, \mu)$ and $(\mathcal X', d', \mu')$ are measure-preserving isometric if there exists an isometry $\psi$ between the supports of $\mu$ on $(\mathcal X, d)$ and of $\mu'$ on $(\mathcal X', d')$, such that $\mu' = \psi_* \mu$.
We denote $\mathbb M$ the space of metric measure spaces (in short, mm-spaces) modulo measure-preserving isometries. In the sequel, unless explicitly stated, given a mm-space $(\mathcal X, d, \mu)$, we will always suppose that the space $\mathcal X$ is exactly the support of the measure $\mu$. Let $X = (\mathcal X, d, \mu) \in \mathbb M$, and let $\mathbb R^{\mathrm{met}}$ be the space of infinite pseudo-distance matrices. We introduce the maps $\iota_X : \mathcal X^{\mathbb N} \to \mathbb R^{\mathrm{met}}$, $(x_n)_{n\ge 1} \mapsto (d(x_i, x_j))_{i<j}$, and, for $x \in \mathcal X$, $\iota_X^x : \mathcal X^{\mathbb N} \to \mathbb R^{\mathrm{met}}$, defined in the same way but with the point $x_0 = x$ prepended to the sequence.

Definition 2.2. We define the distance matrix distribution of $X$ by $\nu_X := (\iota_X)_* \mu^{\otimes\mathbb N}$, and the pointed distance matrix distribution at $x$ by $\nu_x := (\iota_X^x)_* \mu^{\otimes\mathbb N}$.

The distance matrix distribution characterizes the metric measure space in $\mathbb M$: if $\nu_{X_1} = \nu_{X_2}$, then $X_1$ is measure-preserving isometric to $X_2$. This follows from Gromov's reconstruction theorem for metric measure spaces [Gro07, Paragraph 3½.5].

2.2. Polynomials and the Gromov-Prohorov distance. We associate to any bounded continuous map $\varphi \in C_b(\mathbb R^{\binom p2})$ a map $\Phi = \Phi_{p,\varphi} : \mathbb M \to \mathbb R$, called a polynomial on $\mathbb M$ and defined by
$$\Phi_{p,\varphi}(X) = \int_{\mathcal X^p} \varphi\big((d(x_i, x_j))_{1\le i<j\le p}\big)\, \mu(dx_1)\cdots\mu(dx_p).$$
We denote $\Pi$ the real algebra of polynomials on $\mathbb M$. Applying the definition of the distance-matrix distribution as a push-forward measure, we have $\Phi_{p,\varphi}(X) = \int_{\mathbb R^{\mathrm{met}}} \varphi\big((d_{ij})_{1\le i<j\le p}\big)\, \nu_X(dD)$.

Definition 2.3. The Gromov-weak topology is the initial topology on $\mathbb M$ associated to the family of polynomials $(\Phi_{p,\varphi})_{p,\varphi}$. In the sequel we endow $\mathbb M$ with this topology.
Remark 2.4. We can restrict the family of polynomials $(\Phi_{p,\varphi})_{p,\varphi}$ to polynomials associated to functions $\varphi : \mathbb R^{\binom p2} \to \mathbb R$ with compact support, and still get the Gromov-weak topology. As a consequence, there exists a countable family $H$ of polynomials which defines the Gromov-weak topology.
The Gromov-weak topology can be metrized by the Gromov-Prohorov distance, where we optimally embed the two metric measure spaces into a common mm-space and then take the Prohorov distance between the image measures. Given two probability measures $\mu$ and $\nu$ on a metric space $(Z, d_Z)$, their Prohorov distance is
$$d_{\mathrm{Pr}}(\mu, \nu) = \inf\big\{\epsilon > 0 \;\big|\; \forall A \in \mathcal B(Z),\ \mu(A) \le \nu(A^\epsilon) + \epsilon\big\}, \qquad A^\epsilon = \{z \in Z \mid d_Z(z, A) < \epsilon\}.$$

Definition 2.5. The Gromov-Prohorov distance between two mm-spaces $X = (\mathcal X, d_{\mathcal X}, \mu_{\mathcal X})$ and $Y = (\mathcal Y, d_{\mathcal Y}, \mu_{\mathcal Y})$ is
$$d_{\mathrm{GPr}}(X, Y) = \inf_{(\psi_{\mathcal X}, \psi_{\mathcal Y}, Z)} d_{\mathrm{Pr}}\big((\psi_{\mathcal X})_* \mu_{\mathcal X},\, (\psi_{\mathcal Y})_* \mu_{\mathcal Y}\big),$$
where the infimum is taken over all pairs of isometric embeddings $\psi_{\mathcal X}$ and $\psi_{\mathcal Y}$ from $(\mathcal X, d_{\mathcal X})$ and $(\mathcal Y, d_{\mathcal Y})$ into a common metric space $(Z, d_Z)$.

This is a distance on $\mathbb M$ that induces the Gromov-weak topology. Furthermore, the metric space $(\mathbb M, d_{\mathrm{GPr}})$ is complete and separable, so the space $\mathbb M$ is Polish; see [GPW09, Theorem 1]. Further details on the Gromov-Prohorov metric are provided by [Lö13].
2.3. Almost sure convergence of the sample model. Let $X = (\mathcal X, d, \mu)$ in $\mathbb M$ and $(X_n)_{n\in\mathbb N}$ be a sequence of independent random variables with the same law $\mu$. We define $\mu_n := \frac1n \sum_{i=1}^n \delta_{X_i}$ and $X_n := (\{X_1, \ldots, X_n\}, d, \mu_n)$. Then, taking $\Phi \in H$ (see Remark 2.4), we have $\Phi(X_n) \to \Phi(X)$ almost surely. Indeed, $\mu_n$ converges almost surely to $\mu$ for the weak topology of probability measures (see for instance [Var58]), so the same is true for $\mu_n^{\otimes p}$ toward $\mu^{\otimes p}$ (see [Bil99, Chapter 1, Example 3.2]). This implies the following theorem:

Theorem 2.6. We have the almost sure convergence $X_n \to_{\mathrm{a.s.}} X$ in the space $\mathbb M$ of mm-spaces.
We can also prove the theorem by using the Gromov-Prohorov distance; indeed, by choosing $Z = \mathcal X$ as the common metric space in which one embeds $X_n$ and $X$, and the identity maps for the isometric embeddings, we see that $d_{\mathrm{GPr}}(X_n, X) \le d_{\mathrm{Pr}}(\mu_n, \mu)$, and the convergence to $0$ of the right-hand side is the Glivenko-Cantelli convergence of empirical measures. Estimates on the speed of convergence of $E[d_{\mathrm{Pr}}(\mu_n, \mu)]$ are given in [Dud69], but they depend strongly on the space $\mathcal X$: if $k$ denotes the entropic dimension of $X = (\mathcal X, d, \mu)$, then in general one cannot prove a better bound than $E[d_{\mathrm{Pr}}(\mu_n, \mu)] = O(n^{-\frac{1}{k+2+\epsilon}})$; see Theorem 4.1 in loc. cit. However, if instead of the Gromov-Prohorov distance one uses polynomial observables $\Phi$ in order to control the speed of convergence, then the results of this paper will prove that essentially there are only two possible speeds of convergence:
• $|\Phi(X_n) - \Phi(X)| = O(n^{-1/2})$ in the generic case;
• $|\Phi(X_n) - \Phi(X)| = O(n^{-1})$ in the case of compact homogeneous spaces.

The method of cumulants
In this section, we recall the notion of (joint) cumulants of random variables and the results from [FMN16, FMN19], which relate the existence of a sparse dependency graph for a family of random variables to the size of the cumulants and to the fluctuations of their sum.

3.1. Joint cumulants. A set partition of $[[1, n]]$ is a collection of disjoint non-empty subsets of $[[1, n]]$ whose union is $[[1, n]]$; for instance, $\{\{1, 4\}, \{2, 3, 5\}, \{6\}, \{7, 8, 9\}\}$ is a set partition of $[[1, 9]]$. We denote $Q(n)$ the set of set partitions of $[[1, n]]$. It is endowed with the refinement order: a set partition $\pi$ is finer than another set partition $\pi'$ if every part of $\pi$ is included in a part of $\pi'$. Denote $\mu$ the Möbius function of the partially ordered set $(Q(n), \preceq)$ (see [Rot64]). One has
$$\mu(\pi, \hat 1) = (-1)^{\ell(\pi)-1}\,(\ell(\pi)-1)!,$$
where $\hat 1 = \{[[1, n]]\}$ is the maximal set partition and $\ell(\pi)$ is the number of parts of $\pi$; see [Sta97, Chapter 3, Equation (30), p. 128].
Given a probability space $(\Omega, \mathcal F, \mathbb P)$, we set $\mathcal A := \bigcap_{r\ge1} L^r(\Omega, \mathcal F, \mathbb P)$, the space of random variables with moments of all orders, which has a structure of real algebra. For any integer $r \ge 1$, we define a map $\kappa_r : \mathcal A^r \to \mathbb R$ by
$$\kappa_r(X_1, \ldots, X_r) := [t_1 \cdots t_r]\big(\log E[e^{t_1 X_1 + \cdots + t_r X_r}]\big),$$
where $[t_1 \cdots t_r](F)$ is the coefficient of the monomial $\prod_{i=1}^r t_i$ in the series expansion of $F$. Here, $\log(E[e^{t_1 X_1 + \cdots + t_r X_r}])$ is considered as a formal power series whose coefficients are polynomials in the joint moments of the $X_i$'s; we do not ask a priori for the convergence of the exponential generating function. We call the map $\kappa_r$ the $r$-th joint cumulant map, and we define the joint cumulant map $\kappa : \bigsqcup_{r\ge1} \mathcal A^r \to \mathbb R$ as the union of the maps $\kappa_r$; we call $\kappa(X_1, \ldots, X_r)$ the joint cumulant of $(X_i)_{i\in[[1,r]]} \in \mathcal A^r$. This notion of joint cumulant was introduced by Leonov and Shiryaev in [LS59], and it generalises the usual cumulants: for $X \in \mathcal A$, $\kappa^{(r)}(X) := \kappa_r(X, \ldots, X)$ is the usual $r$-th cumulant of $X$, that is $r!\,[t^r](\log E[e^{tX}])$. We summarise the properties of the map $\kappa$ in the following:

Proposition 3.1. (1) The map $\kappa$ is multilinear.
(2) The joint cumulants and the joint moments are related by the poset of set partitions, and the following formulas hold:
$$E[X_1 \cdots X_r] = \sum_{\pi\in Q(r)} \prod_{A\in\pi} \kappa\big((X_i)_{i\in A}\big), \qquad \kappa(X_1, \ldots, X_r) = \sum_{\pi\in Q(r)} (-1)^{\ell(\pi)-1}\,(\ell(\pi)-1)! \prod_{A\in\pi} E\Big[\prod_{i\in A} X_i\Big].$$
(3) If the variables $X_1, \ldots, X_r$ can be split into two non-empty sets of variables which are independent of each other, then $\kappa(X_1, \ldots, X_r)$ vanishes.
For example, the joint cumulants of one or two variables are respectively the expectation and the covariance:
$$\kappa(X) = E[X], \qquad \kappa(X_1, X_2) = E[X_1 X_2] - E[X_1]\,E[X_2] = \mathrm{cov}(X_1, X_2).$$
For the convenience of the reader, we also recall the value of the third cumulant:
$$\kappa(X_1, X_2, X_3) = E[X_1X_2X_3] - E[X_1X_2]\,E[X_3] - E[X_1X_3]\,E[X_2] - E[X_2X_3]\,E[X_1] + 2\,E[X_1]\,E[X_2]\,E[X_3].$$

3.2. Dependency graphs and bounds on cumulants. A real random variable $X$ is distributed according to the normal law $\mathcal N(m, \sigma^2)$ with mean $m$ and variance $\sigma^2$ if and only if $\kappa^{(1)}(X) = m$, $\kappa^{(2)}(X) = \sigma^2$ and $\kappa^{(r)}(X) = 0$ for $r \ge 3$. More generally, a sequence of random variables $(X_n)_{n\in\mathbb N}$ converges in distribution towards a normal law $\mathcal N(m, \sigma^2)$ if the two first cumulants $\kappa^{(1)}(X_n)$ and $\kappa^{(2)}(X_n)$ converge toward $m$ and $\sigma^2$ respectively, and if $\lim_{n\to\infty} \kappa^{(r)}(X_n) = 0$ for $r \ge 3$; see for instance [Jan88, Theorem 1]. In the series of papers [FMN16, FMN17, FMN19, DBMN19], a method of cumulants has been built in order to make this result of asymptotic normality more precise, assuming that one has good upper bounds on the size of the cumulants of the random variables $X_n$. This method falls in the framework of mod-Gaussian convergence also constructed in the aforementioned papers. We recall below the main results from this theory; see [FMN17, Definition 2 and Theorem 3].
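The second formula of Proposition 3.1 (joint cumulants as a signed sum over set partitions) is easy to check numerically on finite joint laws; the following Python sketch is our own illustration, with `dist` a finite joint distribution given as a list of (probability, outcome-tuple) pairs.

```python
from math import factorial, prod

def set_partitions(elems):
    # All set partitions of the list `elems`, as lists of blocks.
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def joint_cumulant(dist):
    # kappa(X_1,...,X_r) = sum_pi (-1)^{l-1} (l-1)! prod_{blocks A} E[prod_{i in A} X_i]
    r = len(dist[0][1])
    def moment(block):
        return sum(p * prod(x[i] for i in block) for p, x in dist)
    total = 0.0
    for part in set_partitions(list(range(r))):
        l = len(part)
        total += (-1) ** (l - 1) * factorial(l - 1) * prod(moment(b) for b in part)
    return total

# kappa(X, Y) = cov(X, Y): two perfectly correlated fair bits give 0.25.
corr = [(0.5, (0, 0)), (0.5, (1, 1))]
# -> 0.25
```

Property (3) of Proposition 3.1 can be observed as well: for two independent bits, the joint cumulant vanishes.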
Definition 3.2. Let $(S_n)_{n\in\mathbb N}$ be a sequence of real-valued random variables. We fix $A \ge 0$, and we consider two positive sequences $(D_n)_{n\in\mathbb N}$ and $(N_n)_{n\in\mathbb N}$ such that $\lim_{n\to\infty} N_n = +\infty$ and $\lim_{n\to\infty} \frac{D_n}{N_n} = 0$. The hypotheses of the method of cumulants with parameters $((D_n)_{n\in\mathbb N}, (N_n)_{n\in\mathbb N}, A)$ and with limits $(\sigma^2, L)$ for the sequence $(S_n)_{n\in\mathbb N}$ are the two following conditions:
• For any $r \ge 1$, we have: $|\kappa^{(r)}(S_n)| \le N_n\,(2D_n)^{r-1}\,r^{r-2}\,A^r$.
• There exist two real numbers $\sigma^2 \ge 0$ and $L$ such that: $\kappa^{(2)}(S_n) = \sigma^2\,N_n D_n\,(1 + o(1))$ and $\kappa^{(3)}(S_n) = L\,N_n (D_n)^2\,(1 + o(1))$.
In particular, the first estimate in the second item states that the variance of $S_n$ is equivalent to $\sigma^2 N_n D_n$.
Theorem 3.3. Let $(S_n)_{n\in\mathbb N}$ be a sequence of real-valued random variables that satisfies the hypotheses of the method of cumulants, with parameters $((D_n)_{n\in\mathbb N}, (N_n)_{n\in\mathbb N}, A)$ and with limits $(\sigma^2, L)$. Assuming that $\sigma^2 > 0$, we set:
$$Y_n := \frac{S_n - E[S_n]}{\sqrt{\mathrm{var}(S_n)}}.$$
(1) Central limit theorem with an extended zone of normality: we have $Y_n \rightharpoonup_{n\to\infty} \mathcal N_{\mathbb R}(0, 1)$, and more precisely, $\mathbb P[Y_n \ge y_n] = (1 + o(1))\,\mathbb P[\mathcal N_{\mathbb R}(0,1) \ge y_n]$ for any sequence $(y_n)_{n\in\mathbb N}$ with $|y_n| \ll \big(\frac{N_n}{D_n}\big)^{1/6}$.
(2) Berry-Esseen type bound: the Kolmogorov distance between $Y_n$ and the standard Gaussian distribution satisfies
$$d_{\mathrm{Kol}}\big(Y_n, \mathcal N_{\mathbb R}(0,1)\big) \le C\,\frac{A^3}{\sigma^3}\sqrt{\frac{D_n}{N_n}},$$
where $C = 76.36$ is a universal constant.
(4) Local limit theorem: for any $y \in \mathbb R$, any Jordan measurable set $B$ with positive Lebesgue measure $m(B) > 0$, and any real exponent $\delta \in (0, \frac12)$,
$$\lim_{n\to\infty} \Big(\frac{N_n}{D_n}\Big)^{\delta}\, \mathbb P\Big[Y_n - y \in \Big(\frac{D_n}{N_n}\Big)^{\delta} B\Big] = \frac{e^{-y^2/2}}{\sqrt{2\pi}}\, m(B).$$
(5) Concentration inequality: suppose that in addition to the hypotheses of the method of cumulants, we have almost surely $|S_n| \le N_n A$. Then, for any $x \ge 0$ and any $n \in \mathbb N$,
$$\mathbb P\big[|S_n - E[S_n]| \ge x\big] \le 2\exp\left(-\frac{x^2}{9\,N_n D_n A^2}\right).$$
This list of results corresponds to Theorem 9.5.1 in [FMN16] (CLT and moderate deviations), Corollary 30 in [FMN19] (Kolmogorov distance), Proposition 4.9 in [DBMN19] (local limit theorem), and Proposition 6 in [FMN17] (concentration inequality).
We shall use the method of dependency graphs in order to verify the hypotheses of the previous theorem. Let $S = \sum_{v\in V} A_v$ be a finite sum of real-valued random variables. We say that a graph $G = (V, E)$ is a dependency graph for the family of random variables $(A_v)_{v\in V}$ if, for any two disjoint subsets $V_1, V_2 \subseteq V$ which are not connected by an edge of $G$, the two families $(A_v)_{v\in V_1}$ and $(A_v)_{v\in V_2}$ are independent. We set $N = \mathrm{Card}(V)$ and $D = 1 + \max_{v\in V} \deg(v)$, and we also assume that $|A_v| \le A$ almost surely for any $v$ in $V$. Then, for any $r \ge 1$,
$$|\kappa_r(S, \ldots, S)| \le N\,(2D)^{r-1}\,r^{r-2}\,A^r. \qquad (1)$$
We refer to [FMN16, Theorem 9.1.7] for a proof of this result; later, we shall recall some of its arguments and adapt them in order to obtain adequate bounds on the cumulants of polynomials of the Gromov-Prohorov sample model of a compact homogeneous space.

Generic fluctuations of the sample model
Throughout this section, $X = (\mathcal X, d, \mu) \in \mathbb M$ is a fixed metric measure space and $\Phi_{p,\varphi} \in \Pi$ is a fixed polynomial. As in Section 2.3, we denote $X_n$ the sample model of $X$ with $n$ independent points $X_1, \ldots, X_n$, and we are going to study the convergence of $\Phi(X_n)$ toward $\Phi(X)$.

4.1. Dependency graphs for the sample model. For any sequence $X : \mathbb N \to E$ with values in a set $E$ and for any map $f : S \to \mathbb N$, we denote by $X_f$ the map $X \circ f$. For example, if we take $f = I \in [[1, n]]^5$, which is a $5$-tuple, we have $X_I = (X_{I_1}, X_{I_2}, X_{I_3}, X_{I_4}, X_{I_5})$. We see a $p$-tuple $\bar\imath$ as a map $\bar\imath : [[1, p]] \to [[1, n]]$, and we denote by $\mathrm{Im}(\bar\imath)$ the multiset-image of this map, taking as a multiplicity function the map $m : \mathrm{Im}(\bar\imath) \to \mathbb N$ defined for any $x \in \mathrm{Im}(\bar\imath)$ by $m(x) = \mathrm{Card}((\bar\imath)^{-1}(x))$. We have
$$S_n(\varphi, X) = n^p\,\Phi(X_n) = \sum_{\bar\imath\in[[1,n]]^p} \varphi(d(X_{\bar\imath})),$$
which is a sum of dependent random variables. We are going to use the method of cumulants in order to study the asymptotic probabilistic behavior of $S_n(\varphi, X)$. Placing ourselves in the framework of the previous section, we take $V = [[1, n]]^p$ as set of vertices, and we connect two $p$-tuples $\bar\imath$ and $\bar\jmath$ by an edge if $\mathrm{Im}(\bar\imath) \cap \mathrm{Im}(\bar\jmath) \neq \emptyset$.

Lemma 4.1. The condition written above defines a dependency graph for the family of random variables $(\varphi(d(X_{\bar\imath})))_{\bar\imath\in V}$.
Proof. Suppose that $\{\bar\imath_1, \ldots, \bar\imath_r\}$ and $\{\bar\jmath_1, \ldots, \bar\jmath_s\}$ are two sets of $p$-tuples which are not connected. Then, there is no index $i$ belonging to an intersection $\mathrm{Im}(\bar\imath_a) \cap \mathrm{Im}(\bar\jmath_b)$, so the two sets of variables $\{X_i \mid i \in \bigcup_{a=1}^r \mathrm{Im}(\bar\imath_a)\}$ and $\{X_j \mid j \in \bigcup_{b=1}^s \mathrm{Im}(\bar\jmath_b)\}$ are disjoint. As the two vectors $(\varphi(d(X_{\bar\imath_a})))_{1\le a\le r}$ and $(\varphi(d(X_{\bar\jmath_b})))_{1\le b\le s}$ are measurable functions of these two sets, they are independent.
In the dependency graph $G$ constructed above, we have $N = n^p$ and $D \le p^2 n^{p-1}$. Indeed, we can build a surjective map from $[[1, p]]^2 \times [[1, n]]^{p-1}$ to the set of adjacent vertices of a vertex $\bar\imath \in V$, by sending $(a, b, \bar\jmath)$ to the $p$-tuple whose $a$-th coordinate is $i_b$ and whose other coordinates are given by $\bar\jmath$. The bound (1) then yields $|\kappa^{(r)}(S_n(\varphi, X))| \le n^p\,(2p^2 n^{p-1})^{r-1}\,r^{r-2}\,\|\varphi\|_\infty^r$, which is an upper bound of order $n^{(p-1)r+1}$.
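The identities $N = n^p$ and $D \le p^2 n^{p-1}$ can be checked by brute force for small $n$ and $p$; the following sketch (our own, with 0-based indices) builds the dependency graph explicitly.

```python
import itertools

def dependency_graph_parameters(n, p):
    # Vertices: all p-tuples with entries in {0,...,n-1}; two tuples are adjacent
    # when their images intersect.  Returns (N, D) with D = 1 + max degree,
    # i.e. the maximal number of vertices adjacent to a vertex, itself included.
    V = list(itertools.product(range(n), repeat=p))
    D = max(sum(1 for w in V if set(v) & set(w)) for v in V)
    return len(V), D

# For n = 5, p = 2: N = 25 and D = 16, below the bound p^2 n^{p-1} = 20.
N, D = dependency_graph_parameters(5, 2)
# -> (25, 16)
```

The maximum degree is attained at tuples with all entries distinct, for which $D = n^p - (n-p)^p$.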
Given a family $x = (\bar\imath_1, \ldots, \bar\imath_r) \in V^r$, the concatenation of the $p$-tuples yields a map $[[1, pr]] \to [[1, n]]$, and we consider the equivalence relation $\pi_x$ on $[[1, pr]]$ which identifies two positions where this map takes the same value. We then denote $\mathrm{Sp}_n(x)$ the set-partition in $Q(pr)$ associated to the equivalence relation $\pi_x$.
Proposition 4.3. Let $\pi \in Q(pr)$ with $\ell(\pi) \ge (p-1)r + 2$, and let $\kappa(\pi, \varphi) := \kappa\big(\varphi(d(X_{\bar\imath_1})), \ldots, \varphi(d(X_{\bar\imath_r}))\big)$ for any family $I = (\bar\imath_1, \ldots, \bar\imath_r)$ with $\mathrm{Sp}_n(I) = \pi$ (this quantity depends only on $\pi$). Then $\kappa(\pi, \varphi) = 0$.

Proof. For the convenience of the reader, let us give an independent proof of this fact; this will also enable us to introduce combinatorial objects which will play a major role in Section 5. Given $\pi \in Q(pr)$, we construct a graph $G_\pi$ on the vertex set $V(G_\pi) = [[1, pr]]$ as follows. For any part $A$ of the set partition $\pi$, we associate a spanning tree $T_A$ of the set of vertices $A$, then we define $G_\pi$ as the disjoint union of those spanning trees. We have $\sum_{A\in\pi} (|E(T_A)| + 1) = pr$. This implies $|E(G_\pi)| = \sum_{A\in\pi} |E(T_A)| = pr - \ell(\pi) \le r - 2$ by the assumption on $\ell(\pi)$. We now construct a multigraph $H_\pi$ with vertex set $V(H_\pi) = [[1, r]]$, by contracting the vertices of the graph $G_\pi$ according to the map $c : [[1, pr]] \to [[1, r]]$, $c(x) = \lceil x/p \rceil$. The multigraph $H_\pi$ has the same number of edges as $G_\pi$, so $|E(H_\pi)| = |E(G_\pi)| \le r - 2$ and $H_\pi$ is not connected. As a consequence, if $[[1, r]] = A \sqcup B$ is a splitting in two non-empty parts which are not connected by an edge of $H_\pi$, and if $I = (\bar\imath_1, \ldots, \bar\imath_r)$ is a family of indices such that $\mathrm{Sp}_n(I) = \pi$, then the two families of indices $\bigcup_{a\in A} \mathrm{Im}(\bar\imath_a)$ and $\bigcup_{b\in B} \mathrm{Im}(\bar\imath_b)$ are disjoint. This implies that $\kappa(\pi, \varphi) = 0$, by using the third property in Proposition 3.1.

4.3. Limiting variance and asymptotics of the fluctuations. In order to apply Theorem 3.3, we also have to compute the limiting parameters $\sigma^2$ and $L$ involved in the method of cumulants. For $k, l \in [[1, p]]$, we define the partition $\pi_{k,l} \in Q(2p)$ whose only part of size two is $\{k, p+l\}$, all the other parts being singletons. Identifying the leading terms in Equation (2), we obtain:
$$\sigma^2(\varphi, X) = \sum_{1\le k,l\le p} \kappa(\pi_{k,l}, \varphi).$$
Similarly, we compute the limiting third cumulant $L$. For $i, j, k, l \in [[1, p]]$ with $j \ne k$, we define the partition $\pi_{i,j,k,l} \in Q(3p)$ whose parts of size two are $\{i, p+j\}$ and $\{p+k, 2p+l\}$; and if $j = k$, the partition $\pi_{i,j,j,l}$ whose only part of size three is $\{i, p+j, 2p+l\}$, all the other parts being singletons. These are the only possible forms for a set partition of $[[1, 3p]]$ with length $3p - 2$; for the $\pi_{i,j,k,l}$'s with $j \ne k$, we also need to take into account the set partitions where two elements of the top row or of the bottom row (instead of the middle row) are connected to elements of the other rows; this leads to a factor $3$ in the enumeration.
Thus, we have:
$$\kappa^{(2)}(S_n(\varphi, X)) = \sigma^2(\varphi, X)\, n^{2p-1}\,\big(1 + O(n^{-1})\big), \qquad \kappa^{(3)}(S_n(\varphi, X)) = L(\varphi, X)\, n^{3p-2}\,\big(1 + O(n^{-1})\big).$$
Similar formulas were obtained in [FMN17, Section 5] for the limiting behavior of the first cumulants of observables of random graphs associated to a graphon parameter. We have now established:

Theorem 4.4 (Fluctuations in the generic case). Let $X = (\mathcal X, d, \mu) \in \mathbb M$ be a metric measure space and $\Phi_{p,\varphi} \in \Pi$ a polynomial.
(1) The sequence $S_n(\varphi, X) = n^p\,\Phi_{p,\varphi}(X_n)$ satisfies the hypotheses of the method of cumulants, with parameters $N_n = n^p$, $D_n = p^2 n^{p-1}$ and $A = \|\varphi\|_\infty$.
(2) If $\sigma(\varphi, X) > 0$, then the random variables $Y_n(\varphi, X) := \frac{S_n(\varphi, X) - E[S_n(\varphi, X)]}{\sqrt{\mathrm{var}(S_n(\varphi, X))}}$ satisfy all the limiting results from Theorem 3.3. In particular, we have the convergence in law $Y_n(\varphi, X) \rightharpoonup_{n\to\infty} \mathcal N(0, 1)$, and the Kolmogorov distance between $Y_n(\varphi, X)$ and the standard Gaussian distribution is a $O(n^{-1/2})$.

With the terminology of [FMN17, Section 6, Definition 30], the theorem above ensures that the pair $(\mathbb M, \Pi)$ is a mod-Gaussian moduli space: generically (as soon as $\sigma(\varphi, X) > 0$), an observable of the Gromov-Prohorov sample model of a mm-space $X$ has normal fluctuations of size $O(n^{-1/2})$, and the limiting variance $\sigma^2(\varphi, X)$ can be written as an observable $\kappa_2(\varphi, \varphi) \in \Pi$ evaluated on the mm-space $X$. In this setting, a general problem is to identify the singular points of the space $\mathbb M$, that is to say the mm-spaces such that $\sigma^2(\varphi, X) = 0$ for any function $\varphi \in C_b(\mathbb R^{\binom p2})$, and thus such that the fluctuations of $\Phi_{p,\varphi}(X_n)$ are of order smaller than $n^{-1/2}$. The next sections of this paper are devoted to this topic.
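The contraction of $G_\pi$ into the multigraph $H_\pi$ used in the proof of Proposition 4.3 is also easy to experiment with. The following sketch (our own, with 0-based indices and chains between consecutive elements as spanning trees of the blocks) checks that every set partition of $[[1, pr]]$ with $\ell(\pi) \ge (p-1)r + 2$ yields a disconnected $H_\pi$, hence a vanishing cumulant.

```python
def set_partitions(elems):
    # All set partitions of the list `elems`, as lists of blocks.
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def H_connected(partition, p, r):
    # Contract a spanning tree of each block of a partition of {0,...,pr-1}
    # via c(x) = x // p, and test connectivity of the multigraph on {0,...,r-1}
    # with a union-find structure.
    parent = list(range(r))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for block in partition:
        b = sorted(block)
        for u, v in zip(b, b[1:]):          # chain = spanning tree of the block
            parent[find(u // p)] = find(v // p)
    return len({find(v) for v in range(r)}) == 1
```

For $p = r = 3$, the partitions of $[[1, 9]]$ with at least $(p-1)r + 2 = 8$ parts are the all-singleton partition and the 36 partitions with a single pair, and all of them produce a disconnected $H_\pi$.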

Fluctuations in the homogeneous case
In this section, we place ourselves in the singular case of the Gromov-Prohorov sample model, where, for any $p \ge 1$ and any $\varphi \in C_b(\mathbb R^{\binom p2})$,
$$\sigma^2(\varphi, X) = \sum_{1\le k,l\le p} \kappa(\pi_{k,l}, \varphi) = 0. \qquad (3)$$
This implies that $\sqrt n\,(\Phi(X_n) - \Phi(X))$ converges in probability to $0$ for any observable $\Phi \in \Pi$. A condition which implies (3) and which is much easier to check is:
$$\kappa(\pi_{k,l}, \varphi) = 0 \quad \text{for any } p \ge 1,\ \varphi \in C_b(\mathbb R^{\binom p2})\ \text{and } k, l \in [[1, p]]. \qquad (4)$$
It is not known whether it is possible to have (3) without having (4). We strongly believe that these two conditions are actually equivalent; let us detail a bit why this should be true. In Section 6, we shall introduce monomial observables of mm-spaces which are indexed by finite multigraphs; Equations (3) and (4) correspond to relations between the values of these observables on a mm-space. This viewpoint leads to questions of graph theory, and a combinatorial study of these relations should allow one to understand whether Condition (4) is strictly stronger than, or equivalent to, Condition (3); we aim to address this problem in a forthcoming paper. Let us mention that an analogous problem occurs in the study of fluctuations of graphon models, where the Erdős-Rényi random graphs are singular models but may not be the only singular points; see again [FMN17]. In the remainder of the article, we assume that Condition (4) is satisfied, and we prove the following results:
(1) This probabilistic condition is equivalent to a geometric property for the space $X$, namely, $X$ is a compact homogeneous space $G/K$ on which the compact group $G$ acts by isometries; see Theorem 5.1.
(2) Under this condition, we obtain a finer upper bound on the cumulants of $S_n(\varphi, X)$, which yields a central limit theorem for the observables; see Theorems 5.6 and 5.7.
(3) The limiting distribution is not necessarily Gaussian; we provide in Section 6 an explicit example when $X$ is the circle.
The group $\mathrm{Isomp}(\mathcal X)$ of measure-preserving isometries of $\mathcal X$ is endowed with the topology of uniform convergence on compact subsets, which is defined by the neighborhoods
$$V(i, K, \epsilon) = \Big\{j \in \mathrm{Isomp}(\mathcal X) \;\Big|\; \sup_{x\in K} d(i(x), j(x)) < \epsilon\Big\} \qquad (5)$$
for $i \in \mathrm{Isomp}(\mathcal X)$, $K$ compact subset of $\mathcal X$ and $\epsilon > 0$. The group action of $G = \mathrm{Isomp}(\mathcal X)$ on $\mathcal X$ is the continuous map $G \times \mathcal X \to \mathcal X$, $(g, x) \mapsto g\cdot x = g(x)$. The orbit of $x \in \mathcal X$ is $O_x = \{y \in \mathcal X \mid \exists g \in G : y = g\cdot x\}$, and the stabilizer of $x$ is the subgroup of $G$ given by $\mathrm{St}_x = \{g \in G \mid g\cdot x = x\}$. For a subgroup $K$ of a group $G$, we denote by $G/K$ the space of left cosets of the group $G$ over $K$, and $\pi : G \to G/K$ the canonical projection map. The group action by left translations of $G$ on $G/K$ is $g\cdot\overline{g_1} = \overline{g g_1}$.
For any $x \in \mathcal X$, we have the bijection $G/\mathrm{St}_x \to O_x$, $\overline g \mapsto g\cdot x$. Finally, we denote $\mathcal X^{\mathbb N}_\mu$ the space of $\mu$-equidistributed sequences:
$$\mathcal X^{\mathbb N}_\mu = \Big\{(x_n)_{n\ge1} \in \mathcal X^{\mathbb N} \;\Big|\; \tfrac1n \textstyle\sum_{i=1}^n \delta_{x_i} \rightharpoonup_{n\to\infty} \mu\Big\}.$$

5.1. Equivalence between small variance and compact homogeneity. The following theorem characterizes the singular case (4), where the variance of $\Phi(X_n)$ is at most of order $1/n^2$ for any polynomial $\Phi$. Let us restate our Condition (4) in simpler words. Given $1 \le k, l \le p$, the vanishing of $\kappa(\pi_{k,l}, \varphi)$ means that for any $\varphi \in C_b(\mathbb R^{\binom p2})$, we have
$$\mathrm{cov}\Big(\varphi\big(d(X_1, \ldots, X_k, \ldots, X_p)\big),\; \varphi\big(d(X'_1, \ldots, X'_{l-1}, X_k, X'_{l+1}, \ldots, X'_p)\big)\Big) = 0. \qquad (6)$$
Thus, the vanishing of one kind of covariance $\kappa(\pi_{k,l}, \varphi)$ is equivalent to the vanishing of all these covariances for $1 \le k, l \le p$, and in the sequel we shall work with the case $k = l = 1$. We recall that $\nu$ is the map that associates to any point $x \in \mathcal X$ the pointed distance matrix distribution $\nu_x$, that is the law of the matrix of distances between $x$ and the points of the infinite random sample $(X_n)_{n\in\mathbb N}$.

Theorem 5.1. Let $X = (\mathcal X, d, \mu) \in \mathbb M$. The following assertions are equivalent:
(1) Condition (4) is satisfied.
(2) The map $\nu : x \mapsto \nu_x$ is constant on $\mathcal X$.
(3) The action of Isomp(X ) on X is transitive.
(4) There exist a compact topological group $G$ and a closed subgroup $K$ of $G$ such that $X = (G/K, d_{G/K}, \mu_{G/K})$ in $\mathbb M$, where $d_{G/K}$ is a distance invariant by the action of $G$ ($d_{G/K}(g\cdot\overline{g_1}, g\cdot\overline{g_2}) = d_{G/K}(\overline{g_1}, \overline{g_2})$), and $\mu_{G/K} = \pi_*(\mathrm{Haar}_G)$ is the push-forward of the Haar measure of $G$.
Remark 5.2. In the fourth item of Theorem 5.1, the identification of $X$ as a compact homogeneous space has to be understood in the space $\mathbb M$, that is to say modulo measure-preserving isometries. In particular, one assumes that $\mathcal X$ is equal to the support of $\mu$.

Proof.
(1) $\Longrightarrow$ (2). Let $A$ be a closed subset of $\mathbb R^{\binom p2}$. There exists a sequence $(\varphi_q)_{q\in\mathbb N}$ of non-negative continuous bounded functions converging pointwise to the indicator function $\mathbb 1_A$ of $A$: take $\varphi_q(x) = \max(0,\, 1 - q\,d(x, A))$. Taking the limit in Equation (6) as $q$ goes to infinity, we obtain
$$\mathrm{cov}\Big(\mathbb 1_A\big(d(X_1, X_2, \ldots, X_p)\big),\; \mathbb 1_A\big(d(X_1, X'_2, \ldots, X'_p)\big)\Big) = 0.$$
Setting $Ed_A(x) := \mathbb P\big[d(x, X_2, \ldots, X_p) \in A\big]$, this covariance is precisely the variance of $Ed_A(X_1)$, so we have:
$$\mathrm{var}\big(Ed_A(X_1)\big) = 0,$$
and $Ed_A$ is constant $\mu$-almost surely. Fix a countable basis of open subsets $(A_i)_{i\in\mathbb N}$ of $\mathbb R^{\binom p2}$. For any $A_{i_1}, \ldots, A_{i_n}$, there exists a set $\mathcal X_{A_{i_1},\ldots,A_{i_n}}$ of $\mu$-measure $1$ such that $Ed_{A_{i_1}\cup\cdots\cup A_{i_n}}$ is constant on that set. Hence, there exists a set $\mathcal X_0 \subseteq \mathcal X$ of $\mu$-measure $1$ such that all the maps $Ed_{A_{i_1}\cup\cdots\cup A_{i_n}}$ are simultaneously constant on $\mathcal X_0$. We can replace in the previous statement $\mathcal X_0$ by $\mathcal X$, because by dominated convergence, $Ed_A$ is continuous over $\mathcal X$, and by assumption, $\mathcal X$ is the support of $\mu$, that is to say the smallest closed subset with $\mu$-measure $1$.
Consider now an arbitrary open subset $A \subseteq \mathbb R^{\binom p2}$, and $x, y \in \mathcal X$. We can write $A$ as a union $\bigcup_{i\in I} A_i$ of elements of the basis, and for any finite subfamily $J \subseteq I$, we have by the previous paragraph $Ed_{\bigcup_{i\in J} A_i}(x) = Ed_{\bigcup_{i\in J} A_i}(y)$. By making $J$ grow to $I$, we conclude that $Ed_A(x) = Ed_A(y)$. The set of all $A \subseteq \mathbb R^{\binom p2}$ such that $Ed_A$ is constant is a Dynkin system, so we get that for any Borel subset $A$ of $\mathbb R^{\binom p2}$, the map $Ed_A$ is constant over $\mathcal X$. This means that the law of the vector of distances $d(x, X_2, \ldots, X_p)$ does not depend on $x$. As this is true for any $p \ge 1$, and as the measurable structure of $\mathbb R^{\mathrm{met}}$ is defined by its finite projections, we conclude that $\nu_x$ does not depend on $x$.
(2) $\Longrightarrow$ (3). We adapt the arguments of [Gro07, Section 3½]. Let $x, y \in \mathcal X$; we set $\nu_{\mathrm{eq}} = \nu_x = \nu_y$, the common value of the map $\nu$ by hypothesis. The law of large numbers gives us $\mu^{\otimes\mathbb N}(\mathcal X^{\mathbb N}_\mu) = 1$. Then, since the two pointed distance matrix distributions coincide, there exist two $\mu$-equidistributed sequences $(x_n)_{n\in\mathbb N}$ and $(y_n)_{n\in\mathbb N}$ in $\mathcal X^{\mathbb N}$ such that, with the convention $x_0 = x$ and $y_0 = y$,
$$d(x_i, x_j) = d(y_i, y_j) \quad \text{for all } i, j \ge 0.$$
By the Portmanteau theorem [Bil99, Theorem 2.1], a $\mu$-equidistributed sequence is dense in the support of $\mu$. Therefore, there exists a unique isometry $i : \mathcal X \to \mathcal X$ such that for all $n \ge 0$, $i(x_n) = y_n$; in particular, $i(x) = y$. We have for any continuous bounded function $f : \mathcal X \to \mathbb R$:
$$\frac1n \sum_{j=1}^n \delta_{x_j}(f \circ i) = \frac1n \sum_{j=1}^n \delta_{y_j}(f).$$
By taking the limit of this identity as $n$ goes to infinity, we obtain $\mu(f \circ i) = \mu(f)$. This is true for any $f \in C_b(\mathcal X)$, so by [Bil99, Theorem 1.2], $i_*\mu = \mu$. We have therefore constructed $i \in \mathrm{Isomp}(\mathcal X)$ such that $i(x) = y$.
(3) $\Longrightarrow$ (2). Let $x, y \in \mathcal X$; by (3), there exists an isometry $i : \mathcal X \to \mathcal X$ with $i(x) = y$ and $i_*\mu = \mu$. We can define $i^{\mathbb N} : \mathcal X^{\mathbb N} \to \mathcal X^{\mathbb N}$, $(x_n)_{n\in\mathbb N} \mapsto (i(x_n))_{n\in\mathbb N}$, which satisfies $(i^{\mathbb N})_*\mu^{\otimes\mathbb N} = \mu^{\otimes\mathbb N}$. Let $\varphi : \mathbb R^{\mathrm{met}} \to \mathbb R$ be a bounded continuous function; since $i$ is an isometry, we have, with the conventions $x_0 = x$ and $y_0 = y = i(x_0)$,
$$\nu_x(\varphi) = \int_{\mathcal X^{\mathbb N}} \varphi\big((d(x_i, x_j))_{i<j}\big)\, \mu^{\otimes\mathbb N}(d(x_n)_{n\ge1}) = \int_{\mathcal X^{\mathbb N}} \varphi\big((d(i(x_i), i(x_j)))_{i<j}\big)\, \mu^{\otimes\mathbb N}(d(x_n)_{n\ge1}) = \nu_y(\varphi),$$
so $\nu_x = \nu_y$.
(3) $\Longrightarrow$ (4). Let $(x_n)_{n\in\mathbb N}$ be a dense sequence in $\mathcal X$, $\epsilon > 0$, and let $D_{\mathcal X,\epsilon}$ be the set of subsets $I \subseteq \mathbb N$ such that the balls $(B(x_i, \epsilon))_{i\in I}$ are pairwise disjoint; this is a poset for the inclusion order, and it is stable by increasing union. We build by induction a maximal element of this set. We set $A_0 = B(x_0, \epsilon)$ and $I_0 = \{0\}$, and then for any $n \in \mathbb N$:
• if $B(x_{n+1}, \epsilon) \cap A_n = \emptyset$, then $A_{n+1} = A_n \sqcup B(x_{n+1}, \epsilon)$ and $I_{n+1} = I_n \sqcup \{n+1\}$;
• otherwise, $A_{n+1} = A_n$ and $I_{n+1} = I_n$.
Consider $I_{\max} = \bigcup_{n\in\mathbb N} I_n$.
(1) The set of indices $I_{\max}$ is a maximal element of $(D_{\mathcal X,\epsilon}, \subseteq)$: if $n \notin I_{\max}$, then $B(x_n, \epsilon) \cap A_{n-1}$ is non-empty, and a fortiori $B(x_n, \epsilon)$ meets $\bigcup_{m\in I_{\max}} B(x_m, \epsilon)$; therefore, we cannot add $n$ to $I_{\max}$ and stay in $D_{\mathcal X,\epsilon}$.
(2) We have $\mathcal X = \bigcup_{n\in I_{\max}} B(x_n, 3\epsilon)$. If $x \in \mathcal X$, since $(x_n)_{n\in\mathbb N}$ is dense in $\mathcal X$, there exists $n \in \mathbb N$ such that $x \in B(x_n, \epsilon)$. If $n \in I_{\max}$, then obviously $x \in B(x_n, 3\epsilon)$; and if $n$ is not in $I_{\max}$, then there exist $n' \in I_{\max}$ and $y \in B(x_n, \epsilon) \cap B(x_{n'}, \epsilon) \ne \emptyset$. Hence, we have $d(x, x_{n'}) \le d(x, x_n) + d(x_n, y) + d(y, x_{n'}) < 3\epsilon$, so $x \in B(x_{n'}, 3\epsilon)$.
(3) The set $I_{\max}$ is finite. Indeed, because the action of $\mathrm{Isomp}(\mathcal X)$ over $\mathcal X$ is transitive, the following map is constant: $x \mapsto \mu(B(x, \epsilon))$, with common value denoted $\mu_\epsilon > 0$ (positive since $\mathcal X$ is the support of $\mu$). Consequently, $\mathrm{Card}(I_{\max})\,\mu_\epsilon \le 1$, because $\mu$ is a probability measure.
So, $I_{\max}$ is finite, and we have proved that $\mathcal X$ is a pre-compact space. Since $\mathcal X$ is complete, $\mathcal X$ is compact. The group of isometries $\mathrm{Isom}(\mathcal X)$ endowed with the compact-open topology defined by the neighborhoods $V(i, K, \epsilon)$ from Equation (5) is also a compact Hausdorff space:
• It is a general fact that given two compact metric spaces $\mathcal X$ and $\mathcal Y$, the space of continuous functions $C(\mathcal X, \mathcal Y)$ endowed with the compact-open topology is metrised by the uniform distance $d(f, g) = \sup_{x\in\mathcal X} d_{\mathcal Y}(f(x), g(x))$; see [Dug66, Chapter XII, Section 8]. By restriction, the topology of $\mathrm{Isom}(\mathcal X)$ is metrisable.
• The compactness of $\mathrm{Isom}(\mathcal X)$ is then an immediate application of the Arzelà-Ascoli theorem.
The subgroup of measure-preserving isometries $\mathrm{Isomp}(\mathcal X)$ is a closed subgroup of $\mathrm{Isom}(\mathcal X)$, hence also compact. Since the action of $\mathrm{Isomp}(\mathcal X)$ over $\mathcal X$ is transitive, we have $O_x = \mathcal X$ for each $x \in \mathcal X$. Denote $G = \mathrm{Isomp}(\mathcal X)$ and $K = \mathrm{St}_x$, $x$ being an arbitrary reference point in the space $\mathcal X$. Therefore, we have the following homeomorphism (see [MT86, Theorem 2.3.2]):
$$\psi : G/K \to \mathcal X, \qquad \overline g \mapsto g\cdot x.$$
The homeomorphism $\psi$ allows one to transport the distance $d$ of $\mathcal X$ to a $G$-invariant distance $d_{G/K}(\cdot,\cdot) = d(\psi(\cdot), \psi(\cdot))$, and the measure $\mu$ to a $G$-invariant probability measure $\mu_{G/K} = (\psi^{-1})_*\mu$ on $G/K$. It remains to prove that $\mu_{G/K} = \pi_*(\mathrm{Haar}_G)$. Given a compact Hausdorff topological space $Z$, we recall the bijective correspondence between Borel probability measures on $Z$ and positive normalised continuous linear forms on $C(Z)$ (Riesz' representation theorem; see [Lan93, Chapter IX]). To any compact Hausdorff topological group $Z$, we associate its probability Haar measure $\mathrm{Haar}_Z$, and we define the averaging map
$$T : C(G) \to C(G/K), \qquad T(q)(\overline g) = \int_K q(gk)\,\mathrm{Haar}_K(dk).$$
We denote by $C(G)^*_+$ the space of positive continuous linear forms on the $\mathbb R$-vector space $C(G)$. The transformation $T$ induces the contravariant transformation $T^* : C(G/K)^*_+ \to C(G)^*_+$, $T^*(\Lambda) = \Lambda \circ T$, and any group action $G \times A \to A$ induces the group action $G \times C(A) \to C(A)$, $(g\cdot q)(a) = q(g^{-1}\cdot a)$. Consider the probability measure $\mu_{G/K}$ as an element of $C(G/K)^*_+$; we have by definition that for any $g \in G$ and $p \in C(G/K)$, $\mu_{G/K}(g\cdot p) = \mu_{G/K}(p)$. If $q \in C(G)$, then we have $T(g\cdot q) = g\cdot T(q)$, so $\mu_{G/K} \circ T = T^*(\mu_{G/K})$ is a $G$-invariant positive normalised continuous linear form on $C(G)$; by uniqueness of the Haar measure, it is the linear form associated to $\mathrm{Haar}_G$. Finally, for any $p \in C(G/K)$, $T(p \circ \pi) = p$, so $\pi_*(\mathrm{Haar}_G)(p) = \mathrm{Haar}_G(p \circ \pi) = (\mu_{G/K} \circ T)(p \circ \pi) = \mu_{G/K}(p)$, which proves $\mu_{G/K} = \pi_*(\mathrm{Haar}_G)$.
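The greedy construction of a maximal family of disjoint $\epsilon$-balls used in step (3) $\Longrightarrow$ (4) is easy to simulate. The following sketch (our own illustration, on the circle $\mathbb R/2\pi\mathbb Z$ with the arc-length distance) keeps a point of the sequence when it is at distance at least $2\epsilon$ from the previously kept centers, which guarantees disjoint $\epsilon$-balls, and the maximality then gives a covering of the scanned points by the enlarged balls.

```python
import math

def greedy_packing(points, dist, eps):
    # Keep a point when its eps-ball is disjoint from the balls already kept,
    # which holds as soon as it is at distance >= 2*eps from every kept center.
    centers = []
    for x in points:
        if all(dist(x, c) >= 2.0 * eps for c in centers):
            centers.append(x)
    return centers

# Arc-length distance on the circle of circumference 2*pi:
arc = lambda s, t: min(abs(s - t), 2.0 * math.pi - abs(s - t))
pts = [0.01 * k for k in range(629)]          # a dense sequence of angles
centers = greedy_packing(pts, arc, 0.3)
```

By construction the centers are pairwise at distance at least $2\epsilon$, and every scanned point lies within $2\epsilon$ of some center (the proof above only needs the weaker $3\epsilon$ bound, valid for a general dense sequence).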

5.2. Study of the cumulants in the homogeneous case. We now perform the asymptotic analysis of the fluctuations of the observables $\Phi(X_n)$ when $X = G/K$ is a compact homogeneous space. We start by proving an upper bound on the cumulants of $S_n(\varphi, X)$ which is analogous to the one of the method of cumulants, but with different parameters $N_n$ and $D_n$, and with a non-Gaussian limiting distribution; see Theorem 5.6. Our arguments will involve spanning trees of graphs. We recall that a Cayley tree of size $r$ is a labeled tree with vertex set $[[1, r]]$; there are $r^{r-2}$ Cayley trees of size $r$. We start with the homogeneous analogue of Proposition 4.3.
Proposition 5.3. Suppose that $X$ is a compact homogeneous space, and let $\pi \in Q(pr)$ be a set partition with $\ell(\pi) \ge (p-1)r + 1$. Then $\kappa(\pi, \varphi) = 0$ for any $\varphi \in C(\mathbb R^{\binom p2})$.

Proof. If $\ell(\pi) \ge (p-1)r + 2$, this is Proposition 4.3. If $\ell(\pi) = (p-1)r + 1$, then $|E(H_\pi)| = r - 1$, so $H_\pi$ is either disconnected or a tree; in the latter case, it has a vertex $k$ of valence $1$, and the homogeneity of the space allows one to replace, in the computation of the joint Laplace transform, the family $\bar\imath_k$ by a family of distinct indices which are not shared by the other families. Looking at the coefficient of $[t_1 \cdots t_r]$ in the logarithm of the joint Laplace transform, we conclude that the joint cumulant vanishes.
Remark 5.4. The proof of this proposition yields a slightly stronger result: if $\pi \in Q(pr)$ is a set partition such that $H_\pi$ is disconnected, or is a tree, or even is a connected graph with one vertex of valence $1$, then the corresponding cumulant vanishes. For instance, with $r = 2$ and $p = 6$, a set partition which identifies one index of the first block of indices $\bar{i}_1$ with two distinct indices of the second block $\bar{i}_2$ satisfies $\ell(\pi) = 10 = (p-1)r$, but the corresponding graph $H_\pi$ is the unique Cayley tree on $2$ vertices, so $\kappa(\pi,\varphi) = 0$ for any function $\varphi \in \mathcal{C}(\mathbb{R}^{\binom{p}{2}})$. The most general condition which leads to the vanishing of the joint cumulant $\kappa(\pi,\varphi)$ is the following: if there exists an integer $k \in [\![1,r]\!]$ such that, among the integers $(k-1)p+1, \ldots, kp$, the set partition $\pi \in Q(pr)$ contains $p-1$ singletons (the remaining integer of this block may be connected to arbitrarily many integers in the other blocks), then $\kappa(\pi,\varphi) = 0$. Indeed, we can then use the same trick as above and replace, in the computation of the joint Laplace transform, the family $\bar{i}_k$ by a family of indices $\bar{j}_k$ which are all distinct and which are not shared by the other families $\bar{i}_{a \neq k}$. We call such a set partition $\pi$ homogeneously vanishing.
We know that for each $r \geq 2$, $\kappa^{(r)}(S_n(\varphi,\mathcal{X}))$ is a polynomial function of $n$ of degree at most $(p-1)r$, according to Propositions 4.2 and 5.3. We can write
$$\kappa^{(r)}(S_n(\varphi,\mathcal{X})) = V(n) = \sum_{i=0}^{(p-1)r} v_i\, n^i \qquad\text{and}\qquad \mathrm{var}(S_n(\varphi,\mathcal{X})) = \kappa^{(2)}(S_n(\varphi,\mathcal{X})) = W(n) = \sum_{i=0}^{2(p-1)} w_i\, n^i;$$
the assumption $\sigma_{\mathrm{hom}}^2 > 0$ amounts to $w_{2(p-1)} > 0$. So we have
$$a_r := \lim_{n \to \infty} \frac{\kappa^{(r)}(S_n(\varphi,\mathcal{X}))}{n^{(p-1)r}} = v_{(p-1)r}.$$
The following theorem ensures that the $a_r$'s are not too large, so that we can sum them and obtain the Laplace transform of a limiting distribution of $Y_n(\varphi,\mathcal{X})$.
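Recall that the cumulants $\kappa^{(r)}$ are the coefficients of $z^r/r!$ in the logarithm of the Laplace transform; equivalently, they can be extracted recursively from the moments via $m_r = \sum_{j=1}^{r} \binom{r-1}{j-1} \kappa_j\, m_{r-j}$. As a small self-contained illustration (the helper name is ours), one can use the fact that a Poisson variable of parameter $1$ has all cumulants equal to $1$ and moments given by the Bell numbers:

```python
from math import comb

def cumulants_from_moments(moments):
    """Given m_1, ..., m_N, return kappa_1, ..., kappa_N via the
    recursion m_r = sum_{j=1}^r C(r-1, j-1) * kappa_j * m_{r-j},
    with the convention m_0 = 1."""
    m = [1] + list(moments)                 # m[0] = 1
    kappa = []
    for r in range(1, len(m)):
        s = sum(comb(r - 1, j - 1) * kappa[j - 1] * m[r - j]
                for j in range(1, r))
        kappa.append(m[r] - s)
    return kappa

# Moments of Poisson(1) are the Bell numbers 1, 2, 5, 15, 52, ...
bell = [1, 2, 5, 15, 52]
print(cumulants_from_moments(bell))   # → [1, 1, 1, 1, 1]
```

The same recursion, applied to the moments computed in Section 6, is how one passes from moment formulas to the cumulants $\kappa^{(r)}(S_n(\varphi,\mathcal{X}))$.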
Theorem 5.6. In the case where $\mathcal{X}$ is a compact homogeneous space, we have for any $\varphi \in \mathcal{C}(\mathbb{R}^{\binom{p}{2}})$ and any $r \geq 2$ the upper bound
$$|\kappa^{(r)}(S_n(\varphi,\mathcal{X}))| \leq (Ap^2)^r\, (2r)^{r-1}\, n^{(p-1)r}.$$
Proof. We are going to adapt the proof of the upper bound (1) which can be found in [FMN16, Chapter 9]. We expand the cumulant by multilinearity,
$$\kappa^{(r)}(S_n(\varphi,\mathcal{X})) = \sum_{\bar{i}_1,\ldots,\bar{i}_r} \kappa\big(\varphi(d(X_{\bar{i}_1})), \ldots, \varphi(d(X_{\bar{i}_r}))\big),$$
and we start by controlling each term of this sum:
$$\big|\kappa\big(\varphi(d(X_{\bar{i}_1})), \ldots, \varphi(d(X_{\bar{i}_r}))\big)\big| \leq A^r\, 2^{r-1}\, \mathrm{ST}(H_\pi),$$
where $\pi = \mathrm{Sp}_n(\bar{i}_1,\ldots,\bar{i}_r)$ and $\mathrm{ST}(H_\pi)$ is the number of spanning trees of the multigraph $H_\pi$. Now, we have identified in the previous remark the cumulants $\kappa(\pi,\varphi)$ which vanish in the homogeneous case, so we can add this condition to the upper bound:
$$\big|\kappa\big(\varphi(d(X_{\bar{i}_1})), \ldots, \varphi(d(X_{\bar{i}_r}))\big)\big| \leq A^r\, 2^{r-1}\, \mathrm{ST}(H_\pi)\, 1_{\mathrm{NHV}(\pi)},$$
where $\mathrm{NHV}(\pi)$ is the condition "$\pi$ is not homogeneously vanishing". Summing over the families of indices, we get by using the triangle inequality
$$|\kappa^{(r)}(S_n(\varphi,\mathcal{X}))| \leq A^r\, 2^{r-1} \sum_{T \text{ Cayley tree of size } r} \#\big\{(\bar{i}_1,\ldots,\bar{i}_r) \;\big|\; T \text{ is a spanning tree of } H_\pi \text{ and } \mathrm{NHV}(\pi)\big\}.$$
The families $(\bar{i}_1,\ldots,\bar{i}_r)$ compatible with a fixed Cayley tree $T$ are constructed as follows. We fix a vertex $k \neq 1$ of degree one (a leaf) in $T$, and we shall choose $\bar{i}_k$ at the end. Before that:
• We start by choosing the $\bar{i}_j$'s with $j$ neighbour of $1$ in $T$ and $j \neq k$. Each such family $\bar{i}_j$ shares at least one index with $\bar{i}_1$, so the number of possibilities for $\bar{i}_j$ is smaller than $D_n = p^2 n^{p-1}$.
• We pursue the construction with the neighbours of the neighbours of $1$, and so on, always leaving aside the vertex $k$. Each time, there are at most $p^2 n^{p-1}$ possibilities for $\bar{i}_j$. Moreover, as $k$ is a leaf of $T$, our inductive construction enumerates all the vertices of $[\![1,r]\!]$ but $k$.
• Finally, since $\pi$ is not homogeneously vanishing, the family $\bar{i}_k$ must share at least two indices with the families already constructed, which leaves only $O(n^{p-2})$ possibilities for $\bar{i}_k$, up to a combinatorial factor involving powers of $p$ and $r$.
As there are $r^{r-2}$ Cayley trees of size $r$, and $n^p$ possibilities for $\bar{i}_1$, we finally get the upper bound
$$|\kappa^{(r)}(S_n(\varphi,\mathcal{X}))| \leq A^r\, 2^{r-1}\, r^{r-1}\, p^{2r}\, n^{(p-1)r}.$$
5.3. Central limit theorem for the homogeneous case. We can finally prove the analogue of Theorem 4.4 when $\mathcal{X}$ is a compact homogeneous space.
Since $\sigma_{n,\mathrm{hom}} \to \sigma_{\mathrm{hom}} > 0$, we see that for $n$ large enough, if $z$ stays in a disc $D(0,R)$ of sufficiently small radius $R > 0$, then the series defining $\log \mathbb{E}[e^{zY_n}]$ is convergent and uniformly bounded on this disc. Taking the exponentials, the same is true for the Laplace transforms $\mathbb{E}[e^{zY_n}]$, and by Proposition 5.5, these holomorphic functions converge uniformly on $D(0,R)$ towards
$$\exp\left(\sum_{r=2}^{\infty} \frac{a_r\, z^r}{r!}\right).$$
By standard arguments (see [Bil95, p. 390]), this implies the convergence in law towards a random variable $Y$ whose moment-generating function $\mathbb{E}[e^{zY}]$ is the limit above. Since this Laplace transform is convergent on a disc with positive radius, $Y$ is determined by its moments.
Let us compare Theorems 4.4 and 5.7. In the generic case, the variance of $\Phi(X_n)$ is expected to be of order $\frac{\sigma^2(\varphi,\mathcal{X})}{n}$, so the fluctuations of $\Phi(X_n)$ are usually of order $O(n^{-1/2})$, and asymptotically (mod-)Gaussian. By usually we mean that one specific observable $\varphi$ might satisfy "by chance" $\sigma(\varphi,\mathcal{X}) = 0$, but this is in general not the case; and by Theorem 5.1 the vanishing of all these limiting variances is almost equivalent to $\mathcal{X}$ being compact homogeneous (the almost is related to the replacement of Condition (3) by the simpler Condition (4); they might be equivalent). In the homogeneous case, the variance of $\Phi(X_n)$ is expected to be of order $\frac{\sigma_{\mathrm{hom}}^2}{n^2}$, so the fluctuations of $\Phi(X_n)$ are now of order $O(n^{-1})$. What remains to be seen is that our estimates on cumulants in the homogeneous case are in a sense optimal: we have the best possible upper bound for these cumulants, and in particular we can have $a_{r \geq 3} \neq 0$, whence a non-Gaussian limiting distribution. The last section of the paper is devoted to the analysis of one such example.
For any $x \geq 0$ and any $n$, ... In particular, if $\mathcal{X} = (\mathcal{X},d,\mu)$ is a metric measure space with diameter smaller than $1$, then ...
$$\int_{\mathcal{X}^3} d(x,y)\, d(y,z)\, \mu^{\otimes 3}(dx\, dy\, dz).$$
Let us consider the metric measure space $\mathcal{X} = \mathbb{R}/\mathbb{Z}$. For $x \in \mathbb{R}$, we denote by $\overline{x}$ the class of $x$ modulo $1$. The space $\mathcal{X}$ is endowed with the geodesic distance $d(\overline{x},\overline{y}) = \inf_{k \in \mathbb{Z}} |x - y - k|$ and with the projection $\mu$ of the Lebesgue measure, which is a probability measure. It is obviously a compact homogeneous space in the sense of Section 5, and even a compact Lie group. Therefore, by Theorem 5.7, if $X_n$ is the sample model of order $n$ associated to this space, then the rescaled fluctuations $Y_n(\varphi,\mathcal{X})$ converge towards a limiting distribution, assuming that $n^2\, \mathrm{var}(\Phi(X_n)) = \frac{\mathrm{var}(S_n(\varphi,\mathcal{X}))}{n^4}$ admits a strictly positive limit $\sigma_{\mathrm{hom}}^2$. The objective of this section is to prove that this limiting distribution indeed exists and is not the Gaussian distribution. To this purpose, we shall compute the first three cumulants of $S_n(\varphi,\mathcal{X})$, and prove in particular that $\kappa^{(3)}(Y_n(\varphi,\mathcal{X}))$ admits a non-zero limit.
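To make the sample model on the circle concrete, here is a small numerical sketch (the identifiers are ours) of an observable $\Phi(X_n)$, taking the simplest choice $\varphi = d(x_1,x_2)$: its deterministic limit is $\Phi(\mathbb{T}) = \int_{\mathbb{T}^2} d(x,y)\, dx\, dy = \frac14$, and by homogeneity the empirical values concentrate around it with fluctuations of order $1/n$.

```python
import random

def circle_dist(x, y):
    """Geodesic distance on the circle R/Z."""
    t = abs(x - y) % 1.0
    return min(t, 1.0 - t)

def phi_mean_distance(points):
    """Phi(X_n) for phi = d(x1, x2): the double integral of the distance
    against the empirical measure, i.e. the average over all ordered
    pairs (i, j), including i = j."""
    n = len(points)
    return sum(circle_dist(x, y) for x in points for y in points) / n ** 2

# Deterministic check with two antipodal points: Phi = 1/4 exactly
assert phi_mean_distance([0.0, 0.5]) == 0.25

# Monte Carlo: Phi(X_n) concentrates around Phi(T) = 1/4
rng = random.Random(0)
sample = [rng.random() for _ in range(500)]
print(abs(phi_mean_distance(sample) - 0.25) < 0.01)   # → True
```

Note that the double sum runs over all pairs including the diagonal, matching the definition of the polynomial observables with the empirical measure $\mu_n^{\otimes 2}$.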
6.1. Graph expansion of the moments of monomial observables. More generally, we explain how to compute the moments of monomial observables $M_G$ attached to multigraphs. Let $G$ be an (unoriented) graph on $p$ vertices $1, 2, \ldots, p$, possibly with loops and with multiple edges. We associate to $G = (V,E)$ and to a metric measure space $\mathcal{X} = (\mathcal{X},d,\mu)$ the function
$$F_G(x_1,\ldots,x_p) = \prod_{\{i,j\} \in E} d(x_i,x_j),$$
the product being taken with multiplicity over the edges of $G$. For instance, the function $\varphi$ introduced above is $F_G(x_1,x_2,x_3) = d(x_1,x_2)\, d(x_2,x_3)$, where $G$ is the segment graph $1 - 2 - 3$. We denote
$$M_G(\mathcal{X}) = \int_{\mathcal{X}^p} F_G(x_1,\ldots,x_p)\, \mu^{\otimes p}(dx_1 \cdots dx_p).$$
This quantity is a polynomial observable of $\mathcal{X}$, and it only depends on the unlabeled graph underlying $G$. The following proposition relates these observables to the moments of the random variables $M_G(X_n)$.
Proposition 6.1. Fix a multigraph $G$ on $p$ vertices, and a metric measure space $\mathcal{X}$, with sample model $X_n$ for every order $n$. For any $r \geq 1$, we have:
$$\mathbb{E}\big[(M_G(X_n))^r\big] = \frac{1}{n^{pr}} \sum_{\pi \in Q(pr)} n^{\downarrow \ell(\pi)}\; M_{G^r \downarrow \pi}(\mathcal{X}),$$
where $n^{\downarrow \ell} = n(n-1)\cdots(n-\ell+1)$, $G^r$ denotes the disjoint union of $r$ copies of $G$, and $G^r \downarrow \pi$ is the contraction of this graph according to a set partition $\pi$.
By contraction of a multigraph H according to a set partition π of its vertex set V , we mean the multigraph H ↓ π whose vertices are the parts of π, and where every edge {a, b} of the original graph H becomes an edge between the parts π(a) and π(b) containing respectively a and b (and a loop if π(a) = π(b)).
Indeed, if one chooses for every part $\pi_c$ of $\pi$ an index $i^{a_c}_{b_c}$ falling in this part, then one has the identity
$$\prod_{a=1}^{r} F_G\big(X_{i^a_1}, \ldots, X_{i^a_p}\big) = F_{G^r \downarrow \pi}\big(X_{i^{a_1}_{b_1}}, \ldots, X_{i^{a_{\ell(\pi)}}_{b_{\ell(\pi)}}}\big),$$
and the variables $X_{i^{a_c}_{b_c}}$ are all distinct by definition of $\pi$; the identity in distribution follows by a relabeling of these variables.
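The contraction operation $H \downarrow \pi$ is straightforward to implement; the following sketch (identifiers are ours) contracts an edge multiset along a set partition of the vertices, keeping multiplicities and turning edges inside a block into loops, as in the definition above.

```python
def contract(edges, partition):
    """Contract a multigraph, given as a list of edges (u, v), along a
    set partition of its vertex set, given as a list of blocks.  Each
    vertex is sent to the index of its block; edges keep their
    multiplicity, and an edge inside a block becomes a loop."""
    block_of = {v: c for c, block in enumerate(partition) for v in block}
    return sorted((min(block_of[u], block_of[v]),
                   max(block_of[u], block_of[v]))
                  for (u, v) in edges)

# Two disjoint copies of the segment 1-2-3 (here on vertices 0..5),
# glued by identifying the two middle vertices: the result is a
# 4-branch star centered at the merged vertex.
star = contract([(0, 1), (1, 2), (3, 4), (4, 5)],
                [[0], [1, 4], [2], [3], [5]])
assert star == [(0, 1), (1, 2), (1, 3), (1, 4)]
```

This is exactly the operation $G^r \downarrow \pi$ used in Proposition 6.1, restricted to partitions of the $pr$ vertices of the disjoint union.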
We therefore have the formula stated in the proposition, the falling factorial counting the choices of distinct indices compatible with $\pi$.
6.2. The first three limiting cumulants. Proposition 6.1 shows that if one can compute $M_G(\mathcal{X})$ for any graph $G$, then one can also compute the moments and cumulants of $M_G(X_n)$ for any $n$ and any graph $G$. However, even in the easy case where $\mathcal{X}$ is the circle, it can be difficult to find the value of the integral $M_G(\mathbb{T})$ for a general graph $G$. In the following, we compute the first three moments of $\Phi(X_n) = M_G(X_n)$, where $G$ is the segment graph $1 - 2 - 3$, and we explain in the specific case where $\mathcal{X} = \mathbb{R}/\mathbb{Z} = \mathbb{T}$ how to make some reductions of the graphs $G$ that appear in our computation.
We have of course $M_\bullet(\mathbb{T}) = 1$ for the one-vertex graph without edges. Let us explain how to compute $M_G(\mathbb{T})$ when one can reduce $G$ to the trivial graph $\bullet$ by recursively deleting in $G$ the vertices with one or two neighbors.
• Reduction of the vertices with one neighbor. If in the graph $G$ there is a vertex $x$ only connected to another vertex $y$, then we can factor in the integral $M_G(\mathbb{T})$ the term
$$\int_{\mathbb{T}} (d(x,y))^a\, dx,$$
where $a \geq 1$ is the number of edges between $x$ and $y$. The integral above is equal to $2 \int_0^{1/2} t^a\, dt = \frac{1}{2^a (a+1)}$.
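The value $\frac{1}{2^a(a+1)}$ of the factored integral is easy to check numerically; here is a quick midpoint-rule sketch (names are ours, not from the paper):

```python
def circle_dist(x, y):
    """Geodesic distance on the circle R/Z."""
    t = abs(x - y) % 1.0
    return min(t, 1.0 - t)

def integral_d_power(a, y=0.3, steps=200_000):
    """Midpoint approximation of the integral of d(x, y)^a over the
    circle; by homogeneity, the value does not depend on y."""
    h = 1.0 / steps
    return sum(circle_dist((k + 0.5) * h, y) ** a for k in range(steps)) * h

# Exact values 1 / (2^a (a + 1)) for a = 1, 2, 3: 1/4, 1/12, 1/32
for a in (1, 2, 3):
    assert abs(integral_d_power(a) - 1.0 / (2 ** a * (a + 1))) < 1e-6
```

The default value of `y` is arbitrary: changing it leaves the numerical value unchanged, which is the homogeneity used throughout this section.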
More generally, because the circle $\mathbb{T}$ is a homogeneous space, if the graph $G$ is not biconnected and can be written either as the disjoint union of two graphs $G_1$ and $G_2$, or as the union of two graphs $G_1$ and $G_2$ that only share one vertex, then we have $M_G(\mathbb{T}) = M_{G_1}(\mathbb{T})\, M_{G_2}(\mathbb{T})$.
• Reduction of the vertices with two neighbors. Suppose now that there is a vertex $x$ only connected to two other vertices $y$ and $z$, with $a \geq 1$ edges between $x$ and $y$ and $b \geq 1$ edges between $x$ and $z$. Note that this does not mean that one can split $G$ as the union of two biconnected components meeting at $x$ (consider for instance the case where $y$ and $z$ are themselves connected by an edge). We have
$$\int_{\mathbb{T}} (d(x,y))^a\, (d(x,z))^b\, dx =
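The key point behind this reduction is again homogeneity: the integral $\int_{\mathbb{T}} (d(x,y))^a (d(x,z))^b\, dx$ only depends on the distance $d(y,z)$, not on the positions of $y$ and $z$ themselves. A numerical sketch of this invariance (names ours, midpoint rule):

```python
def circle_dist(x, y):
    """Geodesic distance on the circle R/Z."""
    t = abs(x - y) % 1.0
    return min(t, 1.0 - t)

def integral_two_powers(a, b, y, z, steps=100_000):
    """Midpoint approximation of the integral of d(x,y)^a d(x,z)^b dx
    over the circle."""
    h = 1.0 / steps
    return sum(circle_dist((k + 0.5) * h, y) ** a
               * circle_dist((k + 0.5) * h, z) ** b
               for k in range(steps)) * h

# Translating the pair (y, z) leaves the integral unchanged: it only
# depends on the gap d(y, z)
v1 = integral_two_powers(2, 1, 0.0, 0.3)
v2 = integral_two_powers(2, 1, 0.45, 0.75)   # same gap d(y, z) = 0.3
assert abs(v1 - v2) < 1e-6
```

This invariance is what allows the degree-two vertex $x$ to be integrated out, replacing the two bundles of edges by a single function of $d(y,z)$.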