Equivalence of Gromov-Prohorov- and Gromov's Box-Metric on the Space of Metric Measure Spaces

The space of metric measure spaces (complete separable metric spaces with a probability measure) is becoming more and more important as state space for stochastic processes. Of particular interest is the subspace of (continuum) metric measure trees. Greven, Pfaffelhuber and Winter introduced the Gromov-Prohorov metric d_{GPW} on the space of metric measure spaces and showed that it induces the Gromov-weak topology. They also conjectured that this topology coincides with the topology induced by Gromov's Box_1 metric. Here, we show that this is indeed true, and the metrics are even bi-Lipschitz equivalent. More precisely, d_{GPW}= 1/2 Box_{1/2}, and hence d_{GPW}<= Box_1<= 2d_{GPW}. The fact that different approaches lead to equivalent metrics underlines their importance and also that of the induced Gromov-weak topology. As an application, we give an easy proof of the known fact that the map associating to a lower semi-continuous excursion the coded R-tree is Lipschitz continuous when the excursions are endowed with the (non-separable) uniform metric. We also introduce a new, weaker, metric topology on excursions, which has the advantage of being separable and making the space of bounded excursions a Lusin space. We obtain continuity also for this new topology.


Introduction
Tree-valued stochastic processes frequently appear in probability theory and its application areas, such as theoretical biology. For instance, in an evolutionary model, the development of the genealogical tree is of interest. In the continuum limit of infinite population size, the finite tree becomes a continuum tree (R-tree) and the normalised counting measure of individuals becomes a probability measure on it. This measure is needed to describe the population density on the tree and to sample individuals from it. See Aldous' seminal paper [Ald93] for the convergence of finite variance Galton-Watson trees to a (Brownian) continuum measure tree, and results of Duquesne and Le Gall ([DLG02, Duq03]) for the convergence of infinite variance Galton-Watson trees to Lévy trees.
More generally than R-trees, we can consider random metric (probability) measure spaces, an approach introduced by Greven, Pfaffelhuber and Winter in [GPW09] and applied by the authors and Depperschmidt to obtain tree-valued Fleming-Viot dynamics in [GPW11, DGP11b]. Here, $\mathcal{X} = (X, d, \mu)$ is a metric measure space (mm-space) if $(X, d)$ is a complete, separable metric space and $\mu$ is a probability measure on the Borel $\sigma$-algebra of $X$. To work with mm-space valued processes, it is crucial to have an appropriate topology on the set of mm-spaces, or rather the set $\mathbb{X}$ of isometry classes of mm-spaces. A fruitful topology is given by the Gromov-weak topology introduced in [GPW09]. In the same paper, the authors conjectured that it coincides with the topology induced by Gromov's metric $\Box_1$, which is defined in [Gro99, Chapter $3\frac{1}{2}$]. They also introduced a complete metric, the Gromov-Prohorov metric $d_{GP}$, that metrises the Gromov-weak topology.
Here, we show that $\Box_1$ and $d_{GP}$ are bi-Lipschitz equivalent, which in particular implies that the conjecture is true and $\Box_1$ indeed metrises the Gromov-weak topology. Furthermore, we use this result to prove that the measure R-tree coded by an excursion depends continuously on the excursion. To this end, we consider two topologies on the space of lower semi-continuous excursions. For the uniform topology, Lipschitz continuity is already shown by Abraham, Delmas and Hoscheit in [ADH12a, Prop. 2.9] (with their metric on trees, which implies the result for ours), but we obtain a much shorter proof using the equivalence of $d_{GP}$ and $\Box_1$. The uniform topology has the disadvantage of being non-separable; we therefore introduce a new, weaker, separable, metrisable topology, which is Lusin on the subset of bounded excursions. We also show continuous dependence of the tree on the excursion in this weaker topology.
In the next section, we recall the definition of the metrics $d_{GP}$ and $\Box_1$, as well as of the Gromov-weak topology, and emphasize that the algebra of polynomials used to define the Gromov-weak topology is convergence determining albeit not dense in the bounded continuous functions. We also give a short comparison to related, but slightly different topologies used on spaces of mm-spaces. The third section contains the proof of the equivalence of $d_{GP}$ and $\Box_1$. In the last section, we apply the equivalence to measure trees coded by excursions and define the new topology on the space of excursions.

Metrics and topologies on the space of mm-spaces
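We do not distinguish between isomorphic mm-spaces. Here, two mm-spaces $\mathcal{X} = (X, d, \mu)$ and $\mathcal{X}' = (X', d', \mu')$ are called isomorphic if there is a measure preserving map $f : X \to X'$ such that the restriction to the support of $\mu$ is an isometry, i.e.
$$\mu' = \mu \circ f^{-1} \quad \text{and} \quad d(x, y) = d'\bigl(f(x), f(y)\bigr) \quad \text{for all } x, y \in \operatorname{supp}(\mu).$$
We denote the space of (isometry classes of) mm-spaces by $\mathbb{X}$.
Remark 1. Because $(X, d)$ is complete, an isomorphism $f$ from $\mathcal{X}$ to $\mathcal{X}'$ is an isometric bijection between $\operatorname{supp}(\mu)$ and $\operatorname{supp}(\mu')$. In particular, there is also an inverse isomorphism $g$ from $\mathcal{X}'$ to $\mathcal{X}$ with $g \circ f = \mathrm{id}$ on $\operatorname{supp}(\mu)$.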

Gromov-Prohorov metric
The Gromov-Prohorov metric is obtained by embedding the metric spaces underlying the mm-spaces optimally into a common metric space and taking the Prohorov distance between the pushforward measures.
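Definition 2 (Prohorov metric). Let $\mu$, $\nu$ be probability measures on a metric space $(X, d)$. Then the Prohorov distance is
$$d_{Pr}(\mu, \nu) := \inf\bigl\{\varepsilon > 0 : \mu(F) \le \nu(F^\varepsilon) + \varepsilon \text{ for all closed } F \subseteq X\bigr\},$$
where
$$F^\varepsilon := \{x \in X : d(x, F) < \varepsilon\}.$$
Remark 3. Below, we use the following equivalent expression for the Prohorov metric. A coupling between $\mu$ and $\nu$ is a measure $\xi$ on $X^2 = X \times X$ with marginals $\mu$ and $\nu$ on $X$. Then
$$d_{Pr}(\mu, \nu) = \inf\bigl\{\varepsilon > 0 : \text{there is a coupling } \xi \text{ of } \mu \text{ and } \nu \text{ with } \xi\bigl(\{(x, y) : d(x, y) \ge \varepsilon\}\bigr) \le \varepsilon\bigr\}.$$
Definition 4 (Gromov-Prohorov metric). Let $\mathcal{X}_i = (X_i, d_i, \mu_i) \in \mathbb{X}$, $i = 1, 2$, be mm-spaces. The Gromov-Prohorov metric is defined by
$$d_{GP}(\mathcal{X}_1, \mathcal{X}_2) := \inf_{f, g}\, d_{Pr}\bigl(\mu_1 \circ f^{-1}, \mu_2 \circ g^{-1}\bigr),$$
where the infimum is taken over all isometries $f : X_1 \to X$ and $g : X_2 \to X$ into a common separable metric space $(X, d)$.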
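For finitely supported measures, the infimum in the Prohorov distance can be approximated by brute force. The following is a minimal sketch and not part of the original argument. It assumes the standard form $\mu(F) \le \nu(F^\varepsilon) + \varepsilon$ with the open $\varepsilon$-neighbourhood $F^\varepsilon$, it checks every non-empty subset (so it is only feasible for a handful of points), and the name `prohorov_finite` as well as the list-based representation are hypothetical choices made for illustration.

```python
from itertools import combinations


def prohorov_finite(dist, mu, nu, tol=1e-9):
    """Approximate the Prohorov distance of mu and nu on a finite metric space.

    dist: symmetric matrix (list of lists) of pairwise distances
    mu, nu: lists of point masses, each summing to 1
    """
    n = len(mu)
    points = range(n)
    # In a finite metric space every subset is closed.
    subsets = [c for k in range(1, n + 1) for c in combinations(points, k)]

    def admissible(eps):
        # Check mu(F) <= nu(F^eps) + eps for all F, where F^eps is the
        # open eps-neighbourhood of F.
        for F in subsets:
            mass_F = sum(mu[i] for i in F)
            mass_nbhd = sum(nu[j] for j in points
                            if any(dist[i][j] < eps for i in F))
            if mass_F > mass_nbhd + eps + 1e-12:
                return False
        return True

    # Admissibility is monotone in eps, and eps = 1 is always admissible
    # for probability measures, so bisection on [0, 1] finds the infimum.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if admissible(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For two Dirac measures at distance $a$, the condition fails for $\varepsilon \le a$ unless $\varepsilon \ge 1$, so the sketch returns approximately $\min(a, 1)$.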

Gromov-weak topology
The idea of the Gromov-weak topology is to use convergence in distribution of finite metric subspaces, which are sampled from $X$ with the measure $\mu$. A very nice property of the Gromov-Prohorov metric is that it induces precisely the Gromov-weak topology, as shown in [GPW09]. This alternative characterisation of convergence provides us with a sub-algebra of $C_b(\mathbb{X})$, called the algebra of polynomials. The usefulness of this algebra stems from the fact that it is rich enough to determine convergence of measures on $\mathbb{X}$. To emphasize that polynomials are an essential tool for working with convergence in distribution of $\mathbb{X}$-valued random variables, we remark that one cannot use the space $C_c(\mathbb{X})$ of continuous functions with compact support, because no point in $\mathbb{X}$ has a compact neighbourhood, and hence $C_c(\mathbb{X}) = \{0\}$ is trivial.
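Definition 5 (Polynomials). A polynomial on $\mathbb{X}$ is a function $\Phi : \mathbb{X} \to \mathbb{R}$ of the form
$$\Phi(\mathcal{X}) = \int_{X^n} \varphi\bigl((d(x_i, x_j))_{i, j = 1, \dots, n}\bigr)\, \mu^{\otimes n}\bigl(d(x_1, \dots, x_n)\bigr),$$
where $n \in \mathbb{N}$ and $\varphi \in C_b(\mathbb{R}^{n \times n})$. Let $\Pi$ be the set of such functions. The Gromov-weak topology is the topology induced by $\Pi$ on $\mathbb{X}$.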
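On a finite mm-space, the integral defining a polynomial reduces to a finite sum over $n$-tuples of points, weighted by the product measure. The following is a minimal sketch under that assumption; the function name `polynomial` and the representation of $(X, d, \mu)$ by a distance matrix and a list of masses are illustrative choices, not taken from the source.

```python
from itertools import product


def polynomial(dist, mu, n, phi):
    """Evaluate a polynomial on a finite mm-space.

    dist: matrix of pairwise distances, mu: list of point masses,
    n: number of sampled points, phi: bounded function of an n x n matrix.
    Returns the integral of phi over distance matrices of n points
    sampled independently from mu.
    """
    total = 0.0
    for idx in product(range(len(mu)), repeat=n):
        # Product-measure weight of the tuple (x_1, ..., x_n).
        weight = 1.0
        for i in idx:
            weight *= mu[i]
        # Distance matrix (d(x_i, x_j)) of the sampled tuple.
        matrix = [[dist[i][j] for j in idx] for i in idx]
        total += weight * phi(matrix)
    return total
```

For example, with two points at distance 1, uniform weights, $n = 2$ and $\varphi(m) = \min(m_{12}, 1)$, the polynomial is the expected (truncated) distance between two independent samples.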
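Remark 6 (Polynomials are not dense). $\Pi$ is obviously an algebra, but it is not dense in $C_b(\mathbb{X})$.
To see this, assume it is dense and consider the subspace $\mathbb{X}_r$ of mm-spaces with essential diameter bounded by a fixed $r > 0$. Because $\mathbb{X}_r$ is closed, every function in $C_b(\mathbb{X}_r)$ extends to a function in $C_b(\mathbb{X})$, so the restrictions of polynomials to $\mathbb{X}_r$ are dense in $C_b(\mathbb{X}_r)$. The set of these restrictions is separable, because on $\mathbb{X}_r$ a polynomial depends only on the values of $\varphi$ on the compact set $[0, r]^{n \times n}$. Hence $C_b(\mathbb{X}_r)$ is separable, which forces $\mathbb{X}_r$ to be compact. This is a contradiction (e.g. the set of finite spaces with discrete metric and uniform distribution has no limit point).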
We say that a set $\mathcal{F} \subseteq C_b(\mathbb{X})$ is convergence determining if for probability measures $\xi_n$, $\xi$ on $\mathbb{X}$, the weak convergence $\xi_n \xrightarrow{w} \xi$ is equivalent to
$$\int f \, d\xi_n \to \int f \, d\xi \quad \text{for all } f \in \mathcal{F}.$$
Since $C_b(\mathbb{X})$ is difficult to describe, it is important to have such a set with a more tractable description. That $\Pi$ is indeed convergence determining is shown with some effort by Depperschmidt, Greven and Pfaffelhuber in [DGP11a]. We can also deduce it from an apparently not so well-known general theorem due to Le Cam.
Theorem 7 (Le Cam, [LC57]; see also [HJ77, Lem. 4.1]). Let $X$ be a completely regular Hausdorff space, and $\mathcal{F} \subseteq C_b(X)$ multiplicatively closed. Then $\mathcal{F}$ is convergence determining for Radon probability measures if and only if $\mathcal{F}$ generates the topology of $X$.
Corollary 8. The set $\Pi$ of polynomials is convergence determining.
Proof. $\mathbb{X}$ is a Polish space, hence completely regular, and all probability measures on it are Radon. $\Pi$ is an algebra, thus multiplicatively closed, and it generates the Gromov-weak topology by definition, so we can apply the Le Cam theorem.

Gromov's metric $\Box_\lambda$
To obtain the Gromov-Prohorov metric, we embed the metric spaces and measure the distance of the resulting pushforward measures with the Prohorov metric. For Gromov's $\Box_\lambda$ metric, it works the opposite way. Namely, the measure spaces are parametrised by a measure preserving map from $[0, 1]$ (with Lebesgue measure), and then the distance of the resulting pullbacks of the metrics is evaluated with the following metric.
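In the form in which it is used in the proof of Theorem 12 below, the metric can be written as follows. This is a sketch assuming Gromov's standard definition from [Gro99]. The symbols $r_i$, $\phi_i$, $S$ and $F(\mathcal{X}_i)$ follow that proof and Remark 11, while $\mathrm{Leb}$ is notation introduced here for Lebesgue measure on $[0, 1]$, to avoid a clash with the parameter $\lambda$.

```latex
% Distance between two pulled-back (pseudo-)metrics on [0,1]:
\Box_\lambda(r_1, r_2) := \inf\bigl\{ \varepsilon > 0 :
    \exists\, S \subseteq [0,1] \text{ measurable with }
    \mathrm{Leb}(S) \ge 1 - \lambda\varepsilon
    \text{ and } |r_1 - r_2| \le \varepsilon \text{ on } S^2 \bigr\},
\qquad
r_i(s,t) := d_i\bigl(\phi_i(s), \phi_i(t)\bigr).

% Distance between mm-spaces: optimise over all parametrisations.
\Box_\lambda(\mathcal{X}_1, \mathcal{X}_2) :=
    \inf\bigl\{ \Box_\lambda(r_1, r_2) : \phi_i \in F(\mathcal{X}_i),\ i = 1, 2 \bigr\}.
```

With this form, $\Box_{1/2}(r_1, r_2) < 2\varepsilon$ yields a set $S$ with $\mathrm{Leb}(S) \ge 1 - \varepsilon$ and $|r_1 - r_2| \le 2\varepsilon$ on $S^2$, which is exactly the statement used in the proof of Theorem 12.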
Remark 11. Because $(X, d)$ is a Polish space, it is a Lebesgue space and the set $F(\mathcal{X})$ of (measure preserving) parametrisations is non-empty.

Related topologies
1. In [Fuk87], Fukaya introduced the measured Hausdorff topology (often cited as measured Gromov-Hausdorff topology) for compact mm-spaces. The same topology is called weighted Gromov-Hausdorff topology, and a complete metric inducing it is constructed by Evans and Winter in [EW06]. The idea is that spaces are close if there is an $\varepsilon$-isometry mapping one measure Prohorov-close to the other. Convergence in measured Hausdorff topology implies Gromov-weak convergence, but not vice versa, because the former implies Gromov-Hausdorff convergence of the underlying metric spaces, which is not the case for the Gromov-weak topology. Note that the underlying equivalence classes are also different: for two mm-spaces to be equivalent in the measured Hausdorff topology, the whole spaces have to be isometric, while in the Gromov-weak sense, this is required only for the supports of the measures.

2. Recently, Abraham, Delmas and Hoscheit ([ADH12b]) extended the measured Hausdorff topology to complete, locally compact, rooted length spaces with locally finite measures. Note that these measures are finite on all balls, because closed balls are compact in such spaces. The authors introduced the Gromov-Hausdorff-Prohorov metric, first on compact spaces using an embedding and measuring the sum of Hausdorff and Prohorov distance. This metrises the measured Hausdorff topology. In the general setting, they integrate the weighted distances of the measures restricted to balls. Note that this extended topology is vague in the sense that the total mass is not preserved. Thus, on spaces with finite (not necessarily probability) measures, it is not stronger than the natural extension of $d_{GP}$.
3. In [Stu06], Sturm defines the $L^2$-transportation distance analogously to $d_{GP}$, but with the (2-)Wasserstein metric instead of the Prohorov metric. It induces a topology on $\mathbb{X}$ that is strictly stronger than the Gromov-weak topology, but coincides with it on subspaces of $\mathbb{X}$ consisting of spaces with uniformly bounded (essential) diameter. Its restriction to the space of compact mm-spaces is strictly weaker than the measured Hausdorff topology.
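
Equivalence of $d_{GP}$ and $\Box_1$
Theorem 12. $d_{GP} = \frac{1}{2}\Box_{1/2}$.
Proof. "$\Box_{1/2} \le 2 d_{GP}$": Let $\varepsilon > d_{GP}(\mathcal{X}_1, \mathcal{X}_2)$. We may assume that $X_1$ and $X_2$ are subspaces of a common separable metric space $(X, d)$ with $d_{Pr}(\mu_1, \mu_2) < \varepsilon$. By Remark 3, there is a coupling $\xi$ of $\mu_1$ and $\mu_2$ with $\xi(\{(x, y) : d(x, y) \ge \varepsilon\}) \le \varepsilon$. Let $\phi = (\phi_1, \phi_2) : [0, 1] \to X_1 \times X_2$ be a parametrisation of $\xi$, so that $\phi_i$ is a parametrisation of $\mathcal{X}_i$, denote by $r_i$ the pullback of $d_i$ with $\phi_i$, and set $S := \{s \in [0, 1] : d(\phi_1(s), \phi_2(s)) < \varepsilon\}$. Then $\lambda(S) \ge 1 - \varepsilon$. We show $|r_1 - r_2| \le 2\varepsilon$ on $S^2$. For $s, t \in S$, the $\triangle$-inequality gives
$$r_2(s, t) - r_1(s, t) = d\bigl(\phi_2(s), \phi_2(t)\bigr) - d\bigl(\phi_1(s), \phi_1(t)\bigr) \le d\bigl(\phi_1(s), \phi_2(s)\bigr) + d\bigl(\phi_1(t), \phi_2(t)\bigr) \le 2\varepsilon,$$
and by symmetry, $r_1(s, t) - r_2(s, t) \le 2\varepsilon$. In total, $\Box_{1/2}(\mathcal{X}_1, \mathcal{X}_2) \le 2\varepsilon$.
"$d_{GP} \le \frac{1}{2}\Box_{1/2}$": Let $2\varepsilon > \Box_{1/2}(\mathcal{X}_1, \mathcal{X}_2)$ and choose parametrisations $\phi_i \in F(\mathcal{X}_i)$ with $\Box_{1/2}(r_1, r_2) < 2\varepsilon$, where $r_i$ is the pullback of $d_i$ with $\phi_i$. There is a set $S \subseteq [0, 1]$ with $\lambda(S) \ge 1 - \varepsilon$ and $|r_1 - r_2| \le 2\varepsilon$ on $S^2$. On the disjoint union $X := X_1 \uplus X_2$, we define a metric $d$ by $d|_{X_i \times X_i} := d_i$ and, for $x \in X_1$, $y \in X_2$,
$$d(x, y) = d(y, x) := \inf_{s \in S} \bigl( d_1(x, \phi_1(s)) + d_2(\phi_2(s), y) \bigr) + \varepsilon. \tag{1}$$
We check that $d$ satisfies the $\triangle$-inequality in Lemma 14 below. Extend the $\mu_i$ to measures on $X$ with support in $X_i$. To estimate their Prohorov distance in $(X, d)$, let $F \subseteq X$ be measurable. Note that by definition, $d(\phi_1(s), \phi_2(s)) = \varepsilon$ for every $s \in S$. Consequently, for every $\varepsilon_0 > \varepsilon$,
$$\mu_1(F) \le \lambda\bigl(\phi_1^{-1}(F) \cap S\bigr) + \varepsilon \le \lambda\bigl(\phi_2^{-1}(F^{\varepsilon_0})\bigr) + \varepsilon = \mu_2(F^{\varepsilon_0}) + \varepsilon.$$
Therefore, $d_{GP}(\mathcal{X}_1, \mathcal{X}_2) \le d_{Pr}(\mu_1, \mu_2) \le \varepsilon$.
Corollary 13. $d_{GP} \le \Box_1 \le 2 d_{GP}$. In particular, $\Box_1$ induces the Gromov-weak topology.
Proof. The inequality $\Box_{1/2} \le 2\Box_1 \le 2\Box_{1/2}$ is obvious from the definition of $\Box_\lambda$.
We still have to check that (1) in the proof of Theorem 12 defines a metric.
Lemma 14. The $d$ defined in (1) satisfies the $\triangle$-inequality. Thus it is a metric.
Proof. For $x, m \in X_1$, $y \in X_2$, we have
$$d(x, y) + d(y, m) \ge \inf_{s, t \in S} \bigl( d_1(x, \phi_1(s)) + r_2(s, t) + d_1(\phi_1(t), m) \bigr) + 2\varepsilon \ge \inf_{s, t \in S} \bigl( d_1(x, \phi_1(s)) + r_1(s, t) + d_1(\phi_1(t), m) \bigr) \ge d_1(x, m),$$
where the first step uses the $\triangle$-inequality in $X_2$ and the second uses $|r_1 - r_2| \le 2\varepsilon$ on $S^2$. All other cases follow by symmetry or by the $\triangle$-inequalities in $X_1$ and $X_2$.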

Continuity of the coding of R-trees by excursions
An R-tree (see [DMT96]) is a connected 0-hyperbolic metric space $(T, d)$. One of the possible definitions of 0-hyperbolicity is that it satisfies the four point condition, i.e.
$$d(x, y) + d(z, w) \le \max\bigl\{ d(x, z) + d(y, w),\ d(x, w) + d(y, z) \bigr\} \quad \text{for all } x, y, z, w \in T.$$
Note that every 0-hyperbolic space can be embedded isometrically into an R-tree, which may be chosen separable whenever the original space was separable. Because $d_{GP}$ (unlike the measured Hausdorff topology) identifies a metric measure space with every subspace containing the support of the measure, the equivalence class of every 0-hyperbolic space contains an R-tree.
One possibility to construct 0-hyperbolic spaces is to code them by excursions, see [Ald93, LG93, DLG02]. To this end, let $h : [0, 1] \to \mathbb{R}_+$ be a non-negative function with $h(0) = 0$, and consider the semi-metric
$$d_h(s, t) := h(s) + h(t) - 2 I_h(s, t), \qquad I_h(s, t) := \inf_{u \in [s \wedge t,\, s \vee t]} h(u), \qquad s, t \in [0, 1].$$
Then the quotient space $T_h := [0, 1]/d_h$ is a 0-hyperbolic metric space. We additionally assume that $h$ is lower semi-continuous. Then $T_h$ is separable and the natural projection $\pi_h : [0, 1] \to T_h$ is measurable. To see this, note that the canonical projection from the graph $\operatorname{gr}(h) = \{(t, h(t)) : t \in [0, 1]\} \subseteq \mathbb{R}^2$ of $h$ onto the tree $T_h$ is continuous due to lower semi-continuity of $h$. $T_h$ needs to be neither complete nor connected, but we identify it with its completion and, once we have put a measure on it, the equivalence class contains a connected representative.
Remark 15. 1. If the graph of $h$ is connected, then $T_h$ is complete and connected to begin with. We do not, however, make this restriction.
2. If $h$ is continuous, $\pi_h$ is continuous and $T_h$ is compact. Conversely, every compact R-tree can be coded by a (non-unique) continuous excursion ([EW06, Rem. 3.2]). To code compact measured trees, continuous excursions are not sufficient. See [Duq06] for a detailed account of coding compact, rooted, ordered, measured R-trees in a unique way by upper semi-continuous càglàd excursions.
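The coding of a tree by an excursion can be tried out numerically on a function sampled at finitely many grid points. The following is a minimal sketch, assuming the usual tree semi-metric $d_h(s, t) = h(s) + h(t) - 2 \inf_{u \in [s, t]} h(u)$ with the infimum taken over grid points only. The function names `tree_distance` and `satisfies_four_point` are hypothetical.

```python
from itertools import product


def tree_distance(h, i, j):
    """Semi-metric d_h between grid indices i and j of a sampled excursion h."""
    lo, hi = min(i, j), max(i, j)
    # h[i] + h[j] minus twice the minimum of h between the two indices.
    return h[i] + h[j] - 2 * min(h[lo:hi + 1])


def satisfies_four_point(h, tol=1e-12):
    """Check the four point condition for d_h on all quadruples of grid points."""
    n = len(h)
    d = [[tree_distance(h, i, j) for j in range(n)] for i in range(n)]
    for x, y, z, w in product(range(n), repeat=4):
        lhs = d[x][y] + d[z][w]
        rhs = max(d[x][z] + d[y][w], d[x][w] + d[y][z])
        if lhs > rhs + tol:
            return False
    return True
```

Grid points with $d_h$-distance zero, such as two points of equal height with no lower value between them, are identified in the quotient $T_h$.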
Definition 16. We define the set of (generalised) excursions on $[0, 1]$ as
$$E := \bigl\{ h : [0, 1] \to \mathbb{R}_+ \bigm| h \text{ is lower semi-continuous},\ h(0) = 0 \bigr\}.$$
Let $E_b$ be the subset of bounded functions in $E$. For $h \in E$, let the mass measure $\mu_h$ on $T_h$ be the image of Lebesgue measure $\lambda$ under $\pi_h$ and define the coding function
$$C : E \to \mathbb{X}, \qquad h \mapsto (T_h, d_h, \mu_h).$$
It is shown in [ADH12a, Prop. 2.9] that the coding function $C$ is continuous when the space of excursions is equipped with the uniform metric. The proof, however, becomes completely trivial if we use Theorem 12, because the trees are already given in a parameterised form.
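To illustrate why the parameterised form makes the argument short, the estimate can be sketched as follows. The sketch assumes the semi-metric $d_h(s, t) = h(s) + h(t) - 2 I_h(s, t)$ with $I_h(s, t)$ the infimum of $h$ between $s$ and $t$, and it assumes that $\Box_\lambda(r_1, r_2) \le \varepsilon$ holds as soon as $|r_1 - r_2| \le \varepsilon$ on all of $[0, 1]^2$. The Lipschitz constant 2 is what this sketch yields and is not claimed to be sharp.

```latex
% Step 1: uniform closeness of excursions carries over to the infima.
\|h - g\|_\infty \le \varepsilon
\;\Longrightarrow\;
|I_h(s,t) - I_g(s,t)| \le \varepsilon
\quad \text{for all } s, t \in [0,1].

% Step 2: hence the pulled-back tree metrics are uniformly close.
|d_h(s,t) - d_g(s,t)|
  \le |h(s) - g(s)| + |h(t) - g(t)| + 2\,|I_h(s,t) - I_g(s,t)|
  \le 4\varepsilon .

% Step 3: \pi_h and \pi_g are parametrisations of C(h) and C(g)
% with pullbacks d_h and d_g; take S = [0,1] and apply Theorem 12.
\Box_{1/2}\bigl(C(h), C(g)\bigr) \le 4\varepsilon
\;\Longrightarrow\;
d_{GP}\bigl(C(h), C(g)\bigr)
  = \tfrac{1}{2}\,\Box_{1/2}\bigl(C(h), C(g)\bigr)
  \le 2\,\|h - g\|_\infty .
```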
The uniform metric on $E$ is a rather strong one; in particular, $E$ and $E_b$ are not separable in this metric. The coding function turns out to be still continuous if we equip $E$ with a weaker, separable, metrisable topology, namely the weakest topology which is stronger than convergence in measure and epigraph convergence. For $h, h' \in E$, let $d_\lambda(h, h')$ denote a metric which metrises convergence in Lebesgue measure, $d_H$ the Hausdorff metric in $\mathbb{R}^2$, and $d_\Gamma(h, h')$ the $d_H$-distance between the epigraphs of $h$ and $h'$. Note that the epigraph of a function is closed if and only if the function is lower semi-continuous.
Definition 18. We endow $E$ with the excursion metric $d_E$, which induces the weakest topology that is stronger than both convergence in measure (metrised by $d_\lambda$) and epigraph convergence (metrised by $d_\Gamma$).
Proposition 19. $E$ is a separable metric space, and the set of continuous excursions is dense. Furthermore, $E_b$ is a Lusin space.
Proof. $d_E$ is obviously a metric, and the continuous excursions are both $d_\Gamma$-dense (increasing pointwise convergence implies $d_\Gamma$-convergence) and $d_\lambda$-dense in $E$. Hence $E$ is separable, and it remains to show that $E_b$ is a Borel subset of a Polish space. First note that this is the case for $(E_b, d_\Gamma)$, because the set of excursions bounded by a fixed $M \in \mathbb{N}$ is closed in the compact metric space of extended real-valued excursions with epigraph topology. Now we can identify $(E_b, d_E)$ with the graph of the function $\pi : (E_b, d_\Gamma) \to L^0 := (L^0(\lambda), d_\lambda)$, which maps an excursion to its $\lambda$-a.e. equivalence class. It is enough to show that $\pi$ is measurable, because then $(E_b, d_E) \cong \operatorname{gr}(\pi)$ is an injective measurable image of a Lusin space, hence Lusin itself. To show measurability, choose a fixed dense sequence $(f_n)_{n \in \mathbb{N}}$ of continuous excursions, and define maps $\pi_n$ from it for which $\pi$ is the pointwise limit of the $\pi_n$; thus $\pi$ is also measurable.
Remark 20. We do not know if $E$ is Lusin or even Polish. $E_b$ is not Polish, because it is an $F_\sigma$-set in $E$ with dense complement. Hence, by the Baire category theorem, it cannot be a $G_\delta$-set.
Example 21 ($d_E$ is not complete and $C$ is not uniformly continuous). Let $h_n(t) = 1 - \mathbb{1}_{\mathbb{N}_0}(nt)$. Then $h_n$ codes the discrete space of $n$ points with uniform distribution or, equivalently, the star-shaped tree with $n$ leaves and uniform distribution on the leaves. $h_n$ converges in the epigraph topology to the zero function, while $d_\lambda(h_n, 1) = 0$ for each $n$. Thus $(h_n)_{n \in \mathbb{N}}$ is Cauchy w.r.t. $d_E$, but does not converge. $(C(h_n))_{n \in \mathbb{N}}$ is not a Cauchy sequence in $\mathbb{X}$, hence $C$ is not uniformly continuous.
Theorem 22. The coding function $C : E \to \mathbb{X}$ is continuous (w.r.t. $d_E$ and $d_{GP}$).
Proof. Fix $h \in E$ and $\varepsilon > 0$.
1. Let $A_\eta := \{t \in [0, 1] : I_h(t - \eta, t + \eta) < h(t) - \varepsilon\}$. Because $h$ is lower semi-continuous, $A_\eta \searrow \emptyset$ as $\eta \searrow 0$, and thus there is a $0 < \delta < \varepsilon$ with $\lambda(A_\delta) < \varepsilon$. Fix $g \in E$ with $d_E(h, g) \le \delta$ and let $X_\varepsilon := \{t \in [0, 1] \setminus A_\delta : |h(t) - g(t)| \le \varepsilon\}$. Then $\lambda(X_\varepsilon) \ge 1 - 2\varepsilon$ and we have to show $|d_h(s, t) - d_g(s, t)| \le 6\varepsilon$ for $s, t \in X_\varepsilon$. Because $h$ and $g$ are $\varepsilon$-close at $s$ and $t$, this is satisfied once we have shown $|I_h(s, t) - I_g(s, t)| \le 2\varepsilon$.
2. "$I_g \le I_h + 2\varepsilon$": Choose $u \in [s, t]$ with $h(u) = I_h(s, t)$. From $d_\Gamma(h, g) \le \delta$, we obtain $u' \in [u - \delta, u + \delta]$ with $g(u') \le h(u) + \delta$. Thus we are done if $u' \in [s, t]$. Assume this is not the case, w.l.o.g. $u \in [s, s + \delta]$. Because $s$ is not in $A_\delta$, we then have $I_h(s, t) = h(u) \ge h(s) - \varepsilon \ge g(s) - 2\varepsilon \ge I_g(s, t) - 2\varepsilon$.
3. "$I_h \le I_g + 2\varepsilon$": Choose $u \in [s, t]$ with $g(u) = I_g(s, t)$ and $u' \in [u - \delta, u + \delta]$ with $h(u') \le g(u) + \delta$. As above, we can assume $u \in [s, s + \delta]$, $u' \in [s - \delta, s]$ and obtain $I_h(s, t) \le h(s) \le h(u') + \varepsilon \le g(u) + 2\varepsilon = I_g(s, t) + 2\varepsilon$.