Marked metric measure spaces

A marked metric measure space (mmm-space) is a triple $(X, r, \mu)$, where $(X, r)$ is a complete and separable metric space and $\mu$ is a probability measure on $X \times I$ for some Polish space $I$ of possible marks. We study the space of all (equivalence classes of) marked metric measure spaces for some fixed $I$. It arises as the state space in the construction of Markov processes which take values in random graphs, e.g. tree-valued dynamics describing randomly evolving genealogical structures in population models. We derive here the topological properties of the space of mmm-spaces needed to study convergence in distribution of random mmm-spaces. Extending the notion of the Gromov-weak topology introduced in (Greven, Pfaffelhuber and Winter, 2009), we define the marked Gromov-weak topology, which turns the set of mmm-spaces into a Polish space. We give a characterization of tightness for families of distributions of random mmm-spaces and identify a convergence determining algebra of functions, called polynomials.


Introduction
Metric spaces form a basic structure in mathematics. In probability theory, they provide a natural set-up for the possible outcomes of random experiments. In particular, the Borel σ-algebra generated by the topology induced by a metric is fundamental. Here, spaces such as $\mathbb{R}^d$ (equipped with the Euclidean metric), the space of càdlàg paths (equipped with the Skorohod metric) and the space of probability measures (equipped with the Prohorov metric) are frequently considered. Recently, random metric spaces which differ from these examples have attracted attention in probability theory. The most prominent examples are the description of random genealogical structures via Aldous' Continuum Random Tree (see [2] and [17] for many related results) or the Kingman coalescent [10], the Brownian map [18] and the connected components of the Erdős-Rényi random graph [1], which are all random compact metric spaces. The former two examples give rise to trees, which are special metric spaces, so-called $\mathbb{R}$-trees [7]. The latter two examples are based on random graphs, and the underlying metric coincides with the graph metric.
In order to discuss convergence in distribution of random metric spaces, the space of metric spaces must be equipped with a topology such that it becomes a Polish space, i.e. a separable topological space, metrizable by a complete metric. Moreover, it is necessary to identify criteria for relative compactness in this topology, which allow one to formulate tightness criteria for families of distributions on this space. Such topological properties of the space of compact metric spaces have been developed using the Gromov-Hausdorff topology (see [16,3,11]).
Many applications deal with a random evolution of metric spaces. In such processes, it is frequently necessary to pick a random point from the metric space according to some appropriate distribution, called the sampling measure. Therefore, a (probability) measure on the metric space must be specified, and the resulting structure including this sampling measure gives rise to metric measure spaces (mm-spaces). The first stochastic processes taking values in mm-spaces, the subtree-prune and re-graft dynamics [12] and the tree-valued Fleming-Viot dynamics [14], have been constructed. In [13] it was shown that the Gromov-weak topology turns the space of mm-spaces into a Polish space; see also [16, Chapter 3½]. Recently, random configurations and random dynamics on metric spaces in the form of random graphs have been studied as well (see [8]). Two examples are percolation [20] and epidemic models on random graphs [6].
The present paper was inspired by the study of a process of random configurations on evolving trees [5]. Such objects arise in mathematical population genetics in the context of Moran models or multi-type branching processes, where the random genealogy of a population evolves together with the (genetic) types of individuals. At any time the state of such a process is a marked metric measure space (mmm-space), where the measure is defined on the product of the metric space and some fixed mark/type space; see Section 2.1. Slightly more complicated structures arise in the study of spatial versions of such population models, where the mark specifies both the genetic type and the location of an individual [15].
Here we establish topological properties of the space of mmm-spaces needed to study convergence in distribution of random mmm-spaces. This requires an extension of the Gromov-weak topology to the marked case (Theorem 1), which is shown to be Polish (Theorem 2), a characterization of tightness of distributions in that space (Theorem 4) and a convergence determining set of functions in the space of probability measures on mmm-spaces (Theorem 5).

Main results
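First, we have to introduce some notation. For product spaces $X \times Y \times \cdots$, we denote the projection operators by $\pi_X, \pi_Y, \dots$. For a Polish space $E$, we denote by $\mathcal{M}_1(E)$ the space of probability measures on the Borel σ-algebra on $E$, equipped with the topology of weak convergence, which is denoted by $\Rightarrow$. Moreover, for $\phi : E \to E'$ (for some other Polish space $E'$), the image measure of $\mu$ under $\phi$ is denoted $\phi_* \mu$.
Let $C_b(E)$ denote the set of bounded continuous functions on $E$ and recall that a set of functions $\Pi \subseteq C_b(E)$ is separating in $\mathcal{M}_1(E)$ iff for all $E$-valued random variables $X, Y$, the equality $\mathbb{E}[\Phi(X)] = \mathbb{E}[\Phi(Y)]$ for all $\Phi \in \Pi$ implies that $X$ and $Y$ have the same distribution. Moreover, $\Pi$ is convergence determining in $\mathcal{M}_1(E)$ if, for $E$-valued random variables $X, X_1, X_2, \dots$, the convergence $\mathbb{E}[\Phi(X_n)] \to \mathbb{E}[\Phi(X)]$ for all $\Phi \in \Pi$ implies $X_n \Rightarrow X$.
Here and in the whole paper the key ingredients are complete separable metric spaces $(X, r_X), (Y, r_Y), \dots$ and probability measures $\mu_X, \mu_Y, \dots$ on $X \times I, Y \times I, \dots$ for a fixed complete and separable metric space $(I, r_I)$, which we refer to as the mark space.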

Marked metric measure spaces
Motivation: The present paper is motivated by genealogical structures in population models. Consider a population $X$ of individuals, all living at the same time. Assume that any pair of individuals $x, y \in X$ has a common ancestor, and define a metric on $X$ by setting $r_X(x, y)$ as the time to the most recent common ancestor of $x$ and $y$, also referred to as their genealogical distance. In addition, individual $x \in X$ carries some mark $\kappa_X(x) \in I$ for some measurable function $\kappa_X$. In order to be able to sample individuals from the population, introduce a sampling measure $\nu_X \in \mathcal{M}_1(X)$ and define
$$\mu_X(dx, du) := \nu_X(dx)\, \delta_{\kappa_X(x)}(du). \tag{2}$$
Recall that most population models, such as branching processes, are exchangeable. On the level of genealogical trees, this leads to the following notion of equivalence of marked metric measure spaces: We call two triples $(X, r_X, \mu_X)$ and $(Y, r_Y, \mu_Y)$ equivalent if there is an isometry $\phi : \operatorname{supp}(\nu_X) \to \operatorname{supp}(\nu_Y)$ such that $\nu_Y = \phi_* \nu_X$ and $\kappa_Y(\phi(x)) = \kappa_X(x)$ for all $x \in \operatorname{supp}(\nu_X)$, i.e. marks are preserved under $\phi$. It turns out that strong restrictions on $\kappa$ are required to turn the set of triples $(X, r_X, \mu_X)$ with $\mu_X$ as in (2) into a Polish space (see [19]). Since these restrictions are frequently not met in applications, we pass to the larger space of triples $(X, r_X, \mu_X)$ with general $\mu_X \in \mathcal{M}_1(X \times I)$ right away. This leads to the following key concept.
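Definition 2.1.
1. An $I$-marked metric measure space, or mmm-space for short, is a triple $(X, r, \mu)$ such that $(X, r)$ is a complete and separable metric space and $\mu \in \mathcal{M}_1(X \times I)$, where $X \times I$ is equipped with the product topology. To avoid set theoretic pathologies we assume that $X \in \mathcal{B}(\mathbb{R})$. This is the case in all applications we have in mind.
2. Two mmm-spaces $(X, r_X, \mu_X)$, $(Y, r_Y, \mu_Y)$ are equivalent if they are measure- and mark-preserving isometric, meaning that there is a measurable map $\phi : \operatorname{supp}((\pi_X)_* \mu_X) \to \operatorname{supp}((\pi_Y)_* \mu_Y)$ which is an isometry and satisfies $\tilde\phi_* \mu_X = \mu_Y$, where $\tilde\phi(x, u) := (\phi(x), u)$. We denote the equivalence class of $(X, r, \mu)$ by $\overline{(X, r, \mu)}$.
3. We denote the set of equivalence classes of mmm-spaces by $\mathbb{M}^I$ and generic elements of $\mathbb{M}^I$ by $\mathfrak{x}, \mathfrak{y}, \dots$.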
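For finite spaces, the equivalence of mmm-spaces in Definition 2.1 can be checked mechanically. The following Python sketch is an illustration only and not part of the theory: the function names `support` and `equivalent`, the encoding of a finite mmm-space as a distance matrix together with a dictionary of weights on (point, mark) pairs, and the brute-force search over bijections are all assumptions made here.

```python
from itertools import permutations


def support(mu):
    """Points of X that carry positive mass under the X-marginal of mu."""
    return sorted({x for (x, _), w in mu.items() if w > 0})


def equivalent(r_x, mu_x, r_y, mu_y, tol=1e-12):
    """Brute-force search for a measure- and mark-preserving isometry.

    r_x, r_y   : nested lists, r[i][j] is the distance between points i and j
    mu_x, mu_y : dicts mapping (point, mark) to a probability weight
    """
    sx, sy = support(mu_x), support(mu_y)
    if len(sx) != len(sy):
        return False
    marks = {u for (_, u) in mu_x} | {u for (_, u) in mu_y}
    for image in permutations(sy):
        phi = dict(zip(sx, image))  # candidate map between the supports
        isometric = all(abs(r_x[a][b] - r_y[phi[a]][phi[b]]) <= tol
                        for a in sx for b in sx)
        if not isometric:
            continue
        # The image of mu_x under (x, u) -> (phi(x), u) must equal mu_y.
        preserved = all(
            abs(mu_x.get((a, u), 0.0) - mu_y.get((phi[a], u), 0.0)) <= tol
            for a in sx for u in marks)
        if preserved:
            return True
    return False


# Three points, but point 2 carries no mass and is therefore irrelevant.
r_x = [[0, 1, 5], [1, 0, 5], [5, 5, 0]]
mu_x = {(0, "a"): 0.5, (1, "b"): 0.5}
# The same structure with relabelled points.
r_y = [[0, 1], [1, 0]]
mu_y = {(0, "b"): 0.5, (1, "a"): 0.5}
# Same metric and sampling weights, but different marks.
mu_z = {(0, "a"): 0.5, (1, "a"): 0.5}

same_class = equivalent(r_x, mu_x, r_y, mu_y)
other_class = equivalent(r_x, mu_x, r_y, mu_z)
```

In this example `same_class` is `True` and `other_class` is `False`: the map only needs to be defined on the support of the sampling measure, and it has to respect the marks.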
Outline: In Section 2.2, we state that the marked distance matrix distribution, arising by subsequently sampling points from an mmm-space, uniquely characterizes the mmm-space (Theorem 1). Hence, we can define the marked Gromov-weak topology based on weak convergence of marked distance matrix distributions, which turns $\mathbb{M}^I$ into a Polish space (Theorem 2). Moreover, we characterize relatively compact sets in the marked Gromov-weak topology (Theorem 3). In Subsection 2.3 we treat our main subject, random mmm-spaces. We characterize tightness (Theorem 4) and show that polynomials, specifying an algebra of real-valued functions on $\mathbb{M}^I$, are convergence determining (Theorem 5).

The Gromov-weak topology
Our task is to define a topology that turns $\mathbb{M}^I$ into a Polish space. For this purpose, we introduce the notion of the marked distance matrix distribution.

Definition 2.3 (Marked distance matrix distribution).
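Let $(X, r, \mu)$ be an mmm-space, $\mathfrak{x} := \overline{(X, r, \mu)} \in \mathbb{M}^I$ and
$$R^{(X,r)} : (X \times I)^{\mathbb{N}} \to \mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}, \qquad ((x_k, u_k))_{k \ge 1} \mapsto \bigl( (r(x_k, x_l))_{1 \le k < l},\ (u_k)_{k \ge 1} \bigr).$$
The marked distance matrix distribution of $\mathfrak{x} = \overline{(X, r, \mu)}$ is defined by
$$\nu^{\mathfrak{x}} := (R^{(X,r)})_* \mu^{\otimes \mathbb{N}} \in \mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr).$$
For generic elements in $\mathbb{R}^{\binom{\mathbb{N}}{2}}$ and $I^{\mathbb{N}}$, we write $r$ and $u$, respectively. In the above definition $(R^{(X,r)})_* \mu^{\otimes \mathbb{N}}$ does not depend on the particular element $(X, r, \mu)$ of the equivalence class $\mathfrak{x} = \overline{(X, r, \mu)}$, i.e. $\nu^{\mathfrak{x}}$ is well-defined. The key property of $\mathbb{M}^I$ is that the marked distance matrix distribution uniquely determines mmm-spaces, as the next result shows.

Theorem 1. Let $\mathfrak{x}, \mathfrak{y} \in \mathbb{M}^I$. Then $\mathfrak{x} = \mathfrak{y}$ iff $\nu^{\mathfrak{x}} = \nu^{\mathfrak{y}}$.

This characterization of elements in $\mathbb{M}^I$ allows us to introduce a topology as follows.

Definition 2.4 (Marked Gromov-weak topology).
Let $\mathfrak{x}, \mathfrak{x}_1, \mathfrak{x}_2, \dots \in \mathbb{M}^I$. We say that $\mathfrak{x}_n \to \mathfrak{x}$ in the marked Gromov-weak topology (MGW topology) iff $\nu^{\mathfrak{x}_n} \Rightarrow \nu^{\mathfrak{x}}$ as $n \to \infty$ in the weak topology on $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$, where $\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}$ is equipped with the product topology of $\mathbb{R}$ and $I$, respectively.

The next result implies that $\mathbb{M}^I$ is a suitable space to apply standard techniques of probability theory (most importantly, weak convergence and martingale problems).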
Theorem 2. The space $\mathbb{M}^I$, equipped with the MGW topology, is Polish.
In order to study weak convergence in $\mathbb{M}^I$, knowledge about relatively compact sets is crucial.

Theorem 3 (Relative compactness in the MGW topology).
For $\Gamma \subseteq \mathbb{M}^I$ the following assertions are equivalent:
(i) The set $\Gamma$ is relatively compact with respect to the marked Gromov-weak topology.
(ii) Both $\pi_1(\Gamma)$ is relatively compact with respect to the Gromov-weak topology on $\mathbb{M}$ and $\pi_2(\Gamma)$ is relatively compact with respect to the weak topology on $\mathcal{M}_1(I)$.

Remark 2.5 (Relative compactness in $\mathbb{M}$). For the application of Theorem 3, it is necessary to characterize relatively compact sets in $\mathbb{M}$, equipped with the Gromov-weak topology. Proposition 7.1 of [13] gives such a characterization, which we recall: Let $r_{12} : (r, u) \mapsto r_{12}$. Then the set $\pi_1(\Gamma)$ is relatively compact in $\mathbb{M}$ iff (10) holds together with a second condition, which involves a supremum over $\mathfrak{x} = \overline{(X, r, \mu)} \in \Gamma$; see Proposition 7.1 of [13] for the precise statement.
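The marked distance matrix distribution of Definition 2.3 can be made concrete for a finite mmm-space by sampling. The Python sketch below is a hedged illustration: the function name `sample_marked_distance_matrix` and the encoding of the space (a distance matrix `r` and a dictionary `mu` of weights on (point, mark) pairs) are assumptions of this example, and only the first `n` sampled points are drawn, which corresponds to the projection of the distribution onto the first $\binom{n}{2}$ distances and the first $n$ marks.

```python
import random


def sample_marked_distance_matrix(r, mu, n, rng):
    """Draw n i.i.d. pairs (x_k, u_k) from mu and return
    (distances r(x_k, x_l) for k < l, marks u_1, ..., u_n)."""
    atoms = list(mu.keys())             # atoms are (point, mark) pairs
    weights = [mu[a] for a in atoms]
    draws = rng.choices(atoms, weights=weights, k=n)
    dists = {(k, l): r[draws[k][0]][draws[l][0]]
             for k in range(n) for l in range(k + 1, n)}
    marks = [u for (_, u) in draws]
    return dists, marks


# Two individuals at genealogical distance 2, each carrying its own mark.
r = [[0.0, 2.0], [2.0, 0.0]]
mu = {(0, "a"): 0.5, (1, "b"): 0.5}
dists, marks = sample_marked_distance_matrix(r, mu, 4, random.Random(0))
```

Only distances and marks are returned, not the sampled points themselves. This is why the resulting distribution cannot distinguish between equivalent representatives of the same element of the space of mmm-spaces.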

Random mmm-spaces
When showing convergence in distribution of a sequence of random mmm-spaces, it must be established that the sequence of distributions is tight and that all potential limit points coincide. Hence we need (i) tightness criteria (see Theorem 4) and (ii) a separating (or even convergence determining) algebra of functions in $\mathcal{M}_1(\mathbb{M}^I)$ (see Theorem 5).

Theorem 4 (Tightness of distributions on $\mathbb{M}^I$).
For an arbitrary index set $J$ let $\{\mathcal{X}_j : j \in J\}$ be a family of $\mathbb{M}^I$-valued random variables. The set of distributions of $\{\mathcal{X}_j : j \in J\}$ is tight iff
(i) the set of distributions of $\{\pi_1(\mathcal{X}_j) : j \in J\}$ is tight as a subset of $\mathcal{M}_1(\mathbb{M})$,
(ii) the set of distributions of $\{\pi_2(\mathcal{X}_j) : j \in J\}$ is tight as a subset of $\mathcal{M}_1(\mathcal{M}_1(I))$.
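In order to define a separating algebra of functions in $\mathcal{M}_1(\mathbb{M}^I)$, we denote by $\mathcal{C}_n^{(k)}$, for $n \in \mathbb{N}_0$ and $k = 0, 1, \dots$, the set of bounded, real-valued functions $\varphi$ on $\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}$, which are $k$ times continuously differentiable with respect to the coordinates in $\mathbb{R}^{\binom{\mathbb{N}}{2}}$ and such that $(r, u) \mapsto \varphi(r, u)$ depends on the first $\binom{n}{2}$ variables in $r$ and the first $n$ in $u$. (The space $\mathcal{C}_0^{(k)}$ consists of constant functions.) For $k = 0$, we set $\mathcal{C}_n := \mathcal{C}_n^{(0)}$.

Definition 2.6 (Polynomials).
1. A function $\Phi : \mathbb{M}^I \to \mathbb{R}$ is a polynomial if, for some $n \in \mathbb{N}_0$, there exists $\varphi \in \mathcal{C}_n$ such that
$$\Phi(\mathfrak{x}) = \langle \nu^{\mathfrak{x}}, \varphi \rangle := \int \varphi(r, u)\, \nu^{\mathfrak{x}}(dr, du) \tag{13}$$
for all $\mathfrak{x} \in \mathbb{M}^I$. We then write $\Phi = \Phi^{n,\varphi}$.
2. For a polynomial $\Phi$ the smallest number $n$ such that there exists $\varphi \in \mathcal{C}_n$ satisfying (13) is called the degree of $\Phi$.
3. We set, for $k = 0, 1, \dots$,
$$\Pi^k := \bigl\{ \Phi^{n,\varphi} : n \in \mathbb{N}_0,\ \varphi \in \mathcal{C}_n^{(k)} \bigr\}.$$

The following result shows that polynomials are not only separating, but even convergence determining in $\mathcal{M}_1(\mathbb{M}^I)$.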
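Theorem 5.
1. For each $k$, the algebra $\Pi^k$ is separating in $\mathcal{M}_1(\mathbb{M}^I)$.
2. There exists a countable algebra $\Pi_* \subseteq \Pi$ that is convergence determining in $\mathcal{M}_1(\mathbb{M}^I)$.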
1. In order to show convergence in distribution of random mmm-spaces $\mathcal{X}_1, \mathcal{X}_2, \dots$, there are two strategies. (i) If a limit point $\mathcal{X}$ is already specified, the property $\mathbb{E}[\Phi(\mathcal{X}_n)] \to \mathbb{E}[\Phi(\mathcal{X})]$ as $n \to \infty$ for all $\Phi \in \Pi^k$ suffices for the convergence $\mathcal{X}_n \Rightarrow \mathcal{X}$ by Theorem 5. (ii) If no limit point is identified yet, tightness of the sequence implies existence of limit points. Then, convergence of $\mathbb{E}[\Phi(\mathcal{X}_n)]$ as a sequence in $\mathbb{R}$ for all $\Phi \in \Pi^k$ shows uniqueness of the limiting object. Both situations arise in practice; see the proof of Theorem 1(c) in [5] for an application of the former and the proof of Theorem 4 in [5] for the latter.
2. Theorem 5 extends Corollary 3.1 of [13] in the case of unmarked metric measure spaces. As the theorem shows, convergence of polynomials is enough for convergence in the Gromov-weak topology if the limit object is known. We will show in the proof that convergence of polynomials is enough to ensure tightness of the sequence.
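For a finite mmm-space, a polynomial in the sense of Definition 2.6 is a finite sum and can be evaluated exactly. The sketch below is an illustration under stated assumptions: the function names `polynomial` and `two_point_space`, the data encoding, and the particular test function are choices made here and do not come from the paper. It evaluates a polynomial of degree 2 along two-point spaces whose points approach each other, and at the one-point space that is their natural candidate limit.

```python
from itertools import product
from math import exp, prod


def polynomial(r, mu, n, phi):
    """Exact value of <nu^x, phi> for a finite mmm-space, where phi
    depends only on the first n sampled points.

    phi(dists, marks) receives dists[(k, l)] for k < l and marks[k]."""
    atoms = list(mu.items())            # items are ((point, mark), weight)
    total = 0.0
    for combo in product(atoms, repeat=n):
        weight = prod(w for (_, w) in combo)
        points = [a[0] for (a, _) in combo]
        marks = [a[1] for (a, _) in combo]
        dists = {(k, l): r[points[k]][points[l]]
                 for k in range(n) for l in range(k + 1, n)}
        total += weight * phi(dists, marks)
    return total


def two_point_space(d):
    """Two points at distance d with marks 'a' and 'b', weight 1/2 each."""
    return [[0.0, d], [d, 0.0]], {(0, "a"): 0.5, (1, "b"): 0.5}


def phi(dists, marks):
    # Bounded test function of the first distance and the first two marks.
    return exp(-dists[(0, 1)]) if marks[0] != marks[1] else 0.0


values = [polynomial(*two_point_space(d), 2, phi) for d in (1.0, 0.1, 0.01)]

# Candidate limit: a single point that carries both marks with weight 1/2.
limit_value = polynomial([[0.0]], {(0, "a"): 0.5, (0, "b"): 0.5}, 2, phi)
```

Each entry of `values` equals $\tfrac12 e^{-d}$, and `limit_value` equals $\tfrac12$, which is consistent with strategy (i) above. Note that in the limit space the mark is not a function of the point, so the measure is not of the form (2); this is one way to see why general measures on the product space are allowed.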

Properties of the marked Gromov-weak topology
After proving Theorem 1 in Section 3.1, we introduce the marked Gromov-Prohorov metric on $\mathbb{M}^I$, a concept of interest in its own right, in Section 3.2. We will show in the proofs of Theorems 2 and 3 in Section 3.3 that this metric is complete and metrizes the MGW topology.

Proof of Theorem 1
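We adapt the proof of Gromov's reconstruction theorem for metric measure spaces, given by A. Vershik (see Chapters 3½.5 and 3½.7 in [16]), to the marked case. Let $\mathfrak{x}, \mathfrak{y} \in \mathbb{M}^I$. If $\mathfrak{x} = \mathfrak{y}$, then $\nu^{\mathfrak{x}} = \nu^{\mathfrak{y}}$, since the marked distance matrix distribution does not depend on the representative. Thus, it remains to show that the converse is also true, i.e. we need to show that $\nu^{\mathfrak{x}} = \nu^{\mathfrak{y}}$ implies that $\mathfrak{x}$ and $\mathfrak{y}$ are measure- and mark-preserving isometric (see Definition 2.1).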

The Gromov-Prohorov metric
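In this section, we define the marked Gromov-Prohorov metric on $\mathbb{M}^I$, which generates a topology which is at least as strong as the marked Gromov-weak topology; see Lemma 3.5. However, since we establish in Proposition 3.6 that both topologies have the same compact sets, we see in Proposition 3.7 that the topologies are the same, and hence the marked Gromov-Prohorov metric metrizes the marked Gromov-weak topology. We use the same notation for $\phi$ and $\tilde\phi$ as in Definition 2.1. Recall that the topology of weak convergence of probability measures on a separable space is metrized by the Prohorov metric.

Definition 3.1.
For $\mathfrak{x}_i = \overline{(X_i, r_i, \mu_i)} \in \mathbb{M}^I$, $i = 1, 2$, set
$$d_{\mathrm{MGP}}(\mathfrak{x}_1, \mathfrak{x}_2) := \inf_{(\phi_1, \phi_2, Z)} d_{\mathrm{Pr}}\bigl( (\tilde\phi_1)_* \mu_1, (\tilde\phi_2)_* \mu_2 \bigr),$$
where the infimum is taken over all complete and separable metric spaces $(Z, r_Z)$ and isometric embeddings $\phi_1 : X_1 \to Z$, $\phi_2 : X_2 \to Z$, and where $d_{\mathrm{Pr}}$ denotes the Prohorov metric on $\mathcal{M}_1(Z \times I)$ with respect to a metric on $Z \times I$ that is built from $r_Z$ and $r_I$ and metrizes the product topology.

Remark 3.2.
For $\mathfrak{x}_i = \overline{(X_i, r_i, \mu_i)} \in \mathbb{M}^I$, $i = 1, 2$, denote by $X_1 \sqcup X_2$ the disjoint union of $X_1$ and $X_2$. Then,
$$d_{\mathrm{MGP}}(\mathfrak{x}_1, \mathfrak{x}_2) = \inf_{r_{X_1 \sqcup X_2}} d_{\mathrm{Pr}}\bigl( (\tilde\phi_1)_* \mu_1, (\tilde\phi_2)_* \mu_2 \bigr),$$
where the infimum is over all metrics $r_{X_1 \sqcup X_2}$ on $X_1 \sqcup X_2$ extending the metrics on $X_1$ and $X_2$, and $\phi_i : X_i \to X_1 \sqcup X_2$, $i = 1, 2$, denote the canonical embeddings.

Remark 3.3.
For the triangle inequality, given $\mathfrak{x}_i = \overline{(X_i, r_i, \mu_i)} \in \mathbb{M}^I$, $i = 1, 2, 3$, and any $\varepsilon > 0$, by the same construction as in Remark 3.2, we can choose a metric $r_{X_1 \sqcup X_2 \sqcup X_3}$ on $X_1 \sqcup X_2 \sqcup X_3$, extending the metrics $r_{X_1}, r_{X_2}, r_{X_3}$, such that
$$d_{\mathrm{Pr}}\bigl( (\tilde\phi_1)_* \mu_1, (\tilde\phi_2)_* \mu_2 \bigr) \le d_{\mathrm{MGP}}(\mathfrak{x}_1, \mathfrak{x}_2) + \varepsilon, \qquad d_{\mathrm{Pr}}\bigl( (\tilde\phi_2)_* \mu_2, (\tilde\phi_3)_* \mu_3 \bigr) \le d_{\mathrm{MGP}}(\mathfrak{x}_2, \mathfrak{x}_3) + \varepsilon.$$
Then, we can use the triangle inequality for the Prohorov metric on $\mathcal{M}_1((X_1 \sqcup X_2 \sqcup X_3) \times I)$ and let $\varepsilon \to 0$ to obtain the triangle inequality for $d_{\mathrm{MGP}}$.

Lemma 3.4 (Equivalent description of the MGP topology).
Let $\mathfrak{x} = \overline{(X, r_X, \mu_X)}, \mathfrak{x}_1 = \overline{(X_1, r_1, \mu_1)}, \mathfrak{x}_2 = \overline{(X_2, r_2, \mu_2)}, \dots \in \mathbb{M}^I$. Then, $d_{\mathrm{MGP}}(\mathfrak{x}_n, \mathfrak{x}) \to 0$ as $n \to \infty$ if and only if there is a complete and separable metric space $(Z, r_Z)$ and isometric embeddings $\phi_X : X \to Z$, $\phi_1 : X_1 \to Z$, $\phi_2 : X_2 \to Z, \dots$ such that
$$d_{\mathrm{Pr}}\bigl( (\tilde\phi_n)_* \mu_n, (\tilde\phi_X)_* \mu_X \bigr) \to 0 \quad \text{as } n \to \infty. \tag{20}$$

Proof. The assertion is an extension of Lemma 5.8 in [13] to the marked case. The proof of the present lemma follows the same lines, which we sketch briefly.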
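First, the "if" direction is clear. For the "only if" direction, fix a sequence $\varepsilon_1, \varepsilon_2, \dots > 0$ with $\varepsilon_n \to 0$ as $n \to \infty$. By the same construction as in Remark 3.3, we can construct a metric $r_Z$ on $Z := X \sqcup X_1 \sqcup X_2 \sqcup \cdots$ with the property that
$$d_{\mathrm{Pr}}\bigl( (\tilde\phi_X)_* \mu_X, (\tilde\phi_n)_* \mu_n \bigr) \le d_{\mathrm{MGP}}(\mathfrak{x}, \mathfrak{x}_n) + \varepsilon_n, \qquad n \in \mathbb{N},$$
where $\phi_X : X \to Z$ and $\phi_n : X_n \to Z$, $n \in \mathbb{N}$, are the canonical embeddings. The assertion follows.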

Lemma 3.5 (MGP convergence implies MGW convergence).
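Let $\mathfrak{x}, \mathfrak{x}_1, \mathfrak{x}_2, \dots \in \mathbb{M}^I$ be such that $d_{\mathrm{MGP}}(\mathfrak{x}_n, \mathfrak{x}) \to 0$ as $n \to \infty$. Then, $\mathfrak{x}_n \to \mathfrak{x}$ in the MGW topology.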
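Proof. It is a consequence of Proposition 3.4.5 in [9] that $\bigcup_n \mathcal{C}_n$ is convergence determining in $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$; see also the proof of Proposition 4.1. Let $\Phi \in \Pi^0$ be such that $\Phi(\cdot) = \langle \nu^{\cdot}, \varphi \rangle$ for some $\varphi \in \bigcup_{n \ge 0} \mathcal{C}_n$. Since $(\tilde\phi_n)_* \mu_n \Rightarrow (\tilde\phi_X)_* \mu_X$ as $n \to \infty$ by (20), the corresponding product measures converge as well,
$$\bigl( (\tilde\phi_n)_* \mu_n \bigr)^{\otimes \mathbb{N}} \Rightarrow \bigl( (\tilde\phi_X)_* \mu_X \bigr)^{\otimes \mathbb{N}} \quad \text{as } n \to \infty.$$
Hence we can conclude that
$$\bigl\langle (R^{(Z, r_Z)})_* \bigl( (\tilde\phi_n)_* \mu_n \bigr)^{\otimes \mathbb{N}}, \varphi \bigr\rangle \to \bigl\langle (R^{(Z, r_Z)})_* \bigl( (\tilde\phi_X)_* \mu_X \bigr)^{\otimes \mathbb{N}}, \varphi \bigr\rangle \quad \text{as } n \to \infty. \tag{22}$$
Since $\mathfrak{x} = \overline{(Z, r_Z, (\tilde\phi_X)_* \mu_X)}$ and $\mathfrak{x}_n = \overline{(Z, r_Z, (\tilde\phi_n)_* \mu_n)}$, $n = 1, 2, \dots$, this proves that $\langle \nu^{\mathfrak{x}_n}, \varphi \rangle \to \langle \nu^{\mathfrak{x}}, \varphi \rangle$ as $n \to \infty$. Because $\Phi \in \Pi^0$ was arbitrary, we have that $\nu^{\mathfrak{x}_n} \Rightarrow \nu^{\mathfrak{x}}$. Then, by definition, $\mathfrak{x}_n \to \mathfrak{x}$ in the MGW topology.

Consider polynomials $\langle \nu^{\cdot}, \varphi \rangle$ such that $\varphi$ does not depend on the variables $u \in I^{\mathbb{N}}$, as well as functions $\varphi$ which only depend on $u_1 \in I$. Denote the former set of functions by $\Pi_{\mathrm{dist}}$ and the latter by $\Pi_{\mathrm{mark}}$.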
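By Proposition 3.6, $\{\mathfrak{x}_n : n \in \mathbb{N}\}$ is relatively compact in the marked Gromov-Prohorov topology. Therefore, for a subsequence, there exists $\mathfrak{y} \in \mathbb{M}^I$ and a further subsequence $\mathfrak{x}_{n_1}, \mathfrak{x}_{n_2}, \dots$ with $\mathfrak{x}_{n_k} \to \mathfrak{y}$ as $k \to \infty$ in the marked Gromov-Prohorov topology. By the "if" direction it follows that $\mathfrak{x}_{n_k} \to \mathfrak{y}$ in the marked Gromov-weak topology, which shows that $\mathfrak{y} = \mathfrak{x}$ and therefore (25) holds.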

Proofs of Theorems 2 and 3
Clearly, Theorem 3 was already shown in Proposition 3.6.
For Theorem 2, some of our arguments are similar to proofs in [13], where the case without marks is treated, which are also based on a similar metric. We have shown in Proposition 3.7 that the marked Gromov-Prohorov metric metrizes the marked Gromov-weak topology. Hence, we need to show that the marked Gromov-weak topology is separable, and d MGP is complete.
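We start with separability. Note that, under the map $\mathfrak{x} \mapsto \nu^{\mathfrak{x}}$, the marked Gromov-Prohorov topology coincides with the topology of weak convergence on $\{\nu^{\mathfrak{x}} : \mathfrak{x} \in \mathbb{M}^I\} \subseteq \mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$.
Hence, separability follows from separability of the topology of weak convergence on $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$.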
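For completeness, consider a Cauchy sequence $\mathfrak{x}_1, \mathfrak{x}_2, \dots \in \mathbb{M}^I$. It suffices to show that there is a convergent subsequence. Note that $\pi_1(\mathfrak{x}_n)$ is Cauchy in $\mathbb{M}$ and $\pi_2(\mathfrak{x}_n)$ is Cauchy in $\mathcal{M}_1(I)$. Since the Gromov-Prohorov metric on $\mathbb{M}$ (see [13]) and the Prohorov metric on $\mathcal{M}_1(I)$ are complete, both sequences converge. In particular, $\{\pi_i(\mathfrak{x}_n) : n \in \mathbb{N}\}$, $i = 1, 2$, are relatively compact. By Proposition 3.6, this implies that $\{\mathfrak{x}_n : n \in \mathbb{N}\}$ is relatively compact in $\mathbb{M}^I$ and thus, there exists a convergent subsequence.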

Properties of random mmm-spaces
In this section we prove the probabilistic statements which we asserted in Subsection 2.3. In particular, we prove Theorem 4 in Section 4.1 and Theorem 5 in Section 4.3. In Section 4.2 we give properties of polynomials, a class of functions that is crucial not only for the topology of $\mathbb{M}^I$ but also for formulating martingale problems (see [5,14]).

Proof of Theorem 4
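The proof is an easy consequence of Theorem 3: By Prohorov's Theorem, the family of distributions of $\{\mathcal{X}_j : j \in J\}$ is tight iff for all $\varepsilon > 0$ there is a relatively compact $\Gamma_\varepsilon \subseteq \mathbb{M}^I$ with $\inf_{j \in J} \mathbb{P}(\mathcal{X}_j \in \Gamma_\varepsilon) \ge 1 - \varepsilon$. By Theorem 3 the latter is the case iff for all $\varepsilon > 0$ there are relatively compact $\Gamma^1_\varepsilon \subseteq \mathbb{M}$ and $\Gamma^2_\varepsilon \subseteq \mathcal{M}_1(I)$ with $\inf_{j \in J} \mathbb{P}(\pi_1(\mathcal{X}_j) \in \Gamma^1_\varepsilon) \ge 1 - \varepsilon$ and $\inf_{j \in J} \mathbb{P}(\pi_2(\mathcal{X}_j) \in \Gamma^2_\varepsilon) \ge 1 - \varepsilon$. This is the same as (i) and (ii).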
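The second equivalence in this proof rests on a union bound, which can be spelled out as follows. The choice of the level $\varepsilon/2$ and the symbols $\Gamma^1_{\varepsilon/2}$, $\Gamma^2_{\varepsilon/2}$ for the two relatively compact sets are assumptions of this sketch.

```latex
\begin{align*}
\Gamma_\varepsilon
  &:= \pi_1^{-1}\bigl(\Gamma^1_{\varepsilon/2}\bigr)
      \cap \pi_2^{-1}\bigl(\Gamma^2_{\varepsilon/2}\bigr), \\
\mathbb{P}\bigl(\mathcal{X}_j \notin \Gamma_\varepsilon\bigr)
  &\le \mathbb{P}\bigl(\pi_1(\mathcal{X}_j) \notin \Gamma^1_{\varepsilon/2}\bigr)
     + \mathbb{P}\bigl(\pi_2(\mathcal{X}_j) \notin \Gamma^2_{\varepsilon/2}\bigr)
   \le \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon
   \qquad \text{for all } j \in J .
\end{align*}
```

Since $\pi_1(\Gamma_\varepsilon) \subseteq \Gamma^1_{\varepsilon/2}$ and $\pi_2(\Gamma_\varepsilon) \subseteq \Gamma^2_{\varepsilon/2}$ are relatively compact, Theorem 3 shows that $\Gamma_\varepsilon$ is relatively compact. For the converse direction one can take the images $\pi_1(\Gamma_\varepsilon)$ and $\pi_2(\Gamma_\varepsilon)$ of a relatively compact set $\Gamma_\varepsilon$.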

Polynomials
We prepare the proof of Theorem 5 with some results on polynomials. We show that polynomials separate points (Proposition 4.1) and are convergence determining in $\mathbb{M}^I$ (Proposition 4.2).
Proposition 4.1 (Polynomials form an algebra that separates points).
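Then, for $\mathfrak{x} \in \mathbb{M}^I$, we find the identity (30). Next, we show that $\Pi^k$ is an algebra. Clearly, $\Pi^k$ is a linear space and $1 \in \Pi^k$. Next consider multiplication of polynomials. By (30), the product of two polynomials in $\Pi^k$ is again a polynomial in $\Pi^k$.

We turn to showing that $\Pi^k$ separates points. Recall that for $\mathfrak{x} \in \mathbb{M}^I$, the marked distance matrix distribution $\nu^{\mathfrak{x}}$ is an element of $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$. On such product spaces, the set of functions $\varphi(r, u)$ of product form is separating in $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$ by Proposition 3.4.5 of [9]. If $\mathfrak{x} \ne \mathfrak{y}$, we have $\nu^{\mathfrak{x}} \ne \nu^{\mathfrak{y}}$ by Theorem 1. Hence, there exists a polynomial in $\Pi^k$, with test function $\varphi$, such that $\langle \nu^{\mathfrak{x}}, \varphi \rangle \ne \langle \nu^{\mathfrak{y}}, \varphi \rangle$, and so $\Pi^k$ separates points.

Proposition 4.2. There exists a countable algebra $\Pi_* \subseteq \Pi$ that is convergence determining in $\mathbb{M}^I$.

Proof. The set of polynomials $\langle \nu^{\cdot}, \varphi \rangle$ with $\varphi \in V$, for the countable set $V$ constructed from $V_I$ and $V_{\mathbb{R}}$, is a countable algebra that is convergence determining. Indeed, for $\mathfrak{x}, \mathfrak{x}_1, \mathfrak{x}_2, \dots \in \mathbb{M}^I$, we have $\mathfrak{x}_n \to \mathfrak{x}$ in the marked Gromov-weak topology iff $\nu^{\mathfrak{x}_n} \Rightarrow \nu^{\mathfrak{x}}$ in the weak topology on $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$ iff $\langle \nu^{\mathfrak{x}_n}, \varphi \rangle \to \langle \nu^{\mathfrak{x}}, \varphi \rangle$ as $n \to \infty$ for all $\varphi \in V$.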

Proof of Theorem 5
By Theorem 3.4.5 of [9] and Proposition 4.1, $\Pi^k$ is separating in $\mathcal{M}_1(\mathbb{M}^I)$.
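We will show that $\Pi_*$ from Proposition 4.2 is a countable, convergence determining algebra in $\mathcal{M}_1(\mathbb{M}^I)$. Recall $V$ and its ingredients, $V_I$ and $V_{\mathbb{R}}$, from the proof of Proposition 4.2. By Lemma 3.4.3 in [9], we have that $\mathcal{X}_n \Rightarrow \mathcal{X}$ as $n \to \infty$ iff (i) $\mathbb{E}[\Phi(\mathcal{X}_n)] \to \mathbb{E}[\Phi(\mathcal{X})]$ as $n \to \infty$ for all $\Phi \in \Pi_*$ and (ii) the family of distributions of $\{\mathcal{X}_n : n \in \mathbb{N}\}$ is tight. We will show that (i) implies (ii).
For $\mathfrak{y} \in \mathbb{M}^I$, let $(R, U)^{\mathfrak{y}}$ denote a random variable with distribution $\nu^{\mathfrak{y}}$. We have
$$\mathbb{E}\bigl[\varphi((R, U)^{\mathcal{X}_n})\bigr] = \mathbb{E}\bigl[\mathbb{E}[\varphi((R, U)^{\mathcal{X}_n}) \mid \mathcal{X}_n]\bigr] = \mathbb{E}\bigl[\langle \nu^{\mathcal{X}_n}, \varphi \rangle\bigr] \to \mathbb{E}\bigl[\langle \nu^{\mathcal{X}}, \varphi \rangle\bigr] = \mathbb{E}\bigl[\varphi((R, U)^{\mathcal{X}})\bigr] \quad \text{as } n \to \infty,$$
for all $\varphi \in V$ by Assumption (i). Since $V$ is convergence determining in $\mathcal{M}_1\bigl(\mathbb{R}^{\binom{\mathbb{N}}{2}} \times I^{\mathbb{N}}\bigr)$, we note that $(R, U)^{\mathcal{X}_n} \Rightarrow (R, U)^{\mathcal{X}}$ as $n \to \infty$.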
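In order to obtain (35) for $i = 2$, note that $v_* \nu^{\mathcal{X}_n} \in \mathcal{M}_1(I)$ is the first moment measure of the distribution of the $\mathcal{M}_1(I)$-valued random variable $\pi_2(\mathcal{X}_n)$ and recall that tightness in $\mathcal{M}_1(\mathcal{M}_1(I))$ is implied by tightness of the first moment measure. By (37), we find that for $g \in V_I$
$$\mathbb{E}\bigl[g(v((R, U)^{\mathcal{X}_n}))\bigr] \to \mathbb{E}\bigl[g(v((R, U)^{\mathcal{X}}))\bigr] \quad \text{as } n \to \infty,$$
so $v((R, U)^{\mathcal{X}_n}) \Rightarrow v((R, U)^{\mathcal{X}})$ and, in particular, (35) holds for $i = 2$.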