A New Probability Measure-Valued Stochastic Process with Ferguson-Dirichlet Process as Reversible Measure

A new diffusion process taking values in the space of all probability measures over $[0,1]$ is constructed in this paper through Dirichlet form theory. This process is reversible with respect to the Ferguson-Dirichlet process (also called the Poisson Dirichlet process), which is the reversible measure of the Fleming-Viot process with parent independent mutation. The intrinsic distance of this process belongs to the class of Wasserstein distances, so it is also a kind of Wasserstein diffusion. Moreover, this process satisfies the Log-Sobolev inequality.


Introduction
the rate of convergence in exponential ergodicity based on the explicit formula of the transition semigroup.
In this paper, we consider the space of all probability measures on the interval $[0,1]$. On it, we construct a new probability measure-valued process whose reversible measure is the Ferguson-Dirichlet process. We show that the new process satisfies the Log-Sobolev inequality, so, by the theory of functional inequalities, its associated semigroup converges in entropy to its equilibrium at an exponential rate. Our process is constructed through classical Dirichlet form theory. Moreover, we show that the intrinsic metric of this Dirichlet form belongs to the class of Wasserstein distances. This means that our new probability measure-valued process is also a kind of Wasserstein diffusion.
The structure of this paper is as follows. In the next section we recall some facts about the Fleming-Viot process and, by comparison with it, sketch the idea of our construction of the new process. The third section is concerned with constructing the Dirichlet form. In order to prove that the pre-defined bilinear form is closable, we consider the quasi-invariance property of the Ferguson-Dirichlet measure (see Theorem 3.4) and establish the integration by parts formula (see Theorem 3.6). We then construct, in Theorem 3.8, a regular Dirichlet form with the Ferguson-Dirichlet process as its reversible measure. Furthermore, we discuss the intrinsic metric of this Dirichlet form (see Theorem 3.11). The last section is devoted to establishing the Log-Sobolev inequality for our new probability measure-valued process. This is proved by the method of finite dimensional approximation, based on our construction of the Dirichlet form and the work of Döring and Stannat [3].

Comparison with Fleming-Viot process
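Let us first introduce the Fleming-Viot process. Let $E$ be a Polish space, i.e. a complete separable metric space. $\mathcal{P}(E)$ denotes the set of all probability measures on $E$, and $\mathcal{P}(\mathcal{P}(E))$ denotes the set of all probability measures on $\mathcal{P}(E)$. The Fleming-Viot process was initially established as the scaling limit of the measure-valued Moran model. The Fleming-Viot operator is defined through the derivatives $\frac{\delta\varphi(\mu)}{\delta\mu(x)}$ and $\frac{\delta^2\varphi(\mu)}{\delta\mu(x)\delta\mu(y)}$ of a functional $\varphi$ on $\mathcal{P}(E)$, where $\delta_x$ denotes the Dirac measure at $x$, $A$ is the generator describing mutation, and $\langle f,\mu\rangle = \int_E f\,d\mu$ for a function $f$ on $E$. When $A$ is the parent independent mutation operator (2.3), the Fleming-Viot process is reversible with respect to the Ferguson-Dirichlet process $\Pi_{\theta,\nu_0}$.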
In the sequel of this paper, we only consider the Fleming-Viot process with parent independent mutation $A$ as in (2.3). Then the Fleming-Viot operator acting on the cylindrical function $u(\mu) = F(\langle f_1,\mu\rangle,\ldots,\langle f_n,\mu\rangle)$ has the representation for t = 0 any longer. When we regard $\mathcal{P}(E)$ as a space in its own right and wish to define the gradient for functionals depending only on their values on $\mathcal{P}(E)$, it is more appropriate to define It is easy to check through direct calculation that the Fleming-Viot operator can also be represented by for $u\in\mathrm{Cyl}$. Moreover, the associated Dirichlet form has the form

Recently, many works have regarded $\mathcal{P}(E)$ as an infinite dimensional Riemannian manifold. For instance, Jordan et al. [10] defined a Riemannian manifold structure on the space of all probability measures on $\mathbb{R}^d$ and constructed the solution of the Fokker-Planck equation in $\mathbb{R}^d$ as the gradient flow of the relative entropy functional. Otto and Villani [13] used it as a guideline to find the interrelation between the transportation cost inequality and the Log-Sobolev inequality. This viewpoint also led them to the so-called "HWI" inequality, which combines three important quantities in one inequality: "H" for entropy, "W" for Wasserstein distance, and "I" for Fisher information. Sturm [20,21] and Lott and Villani [11] further established the concept of a lower bound of Ricci curvature on a metric measure space $E$. In particular, when $E$ is a Riemannian manifold, their lower bound of Ricci curvature coincides with the geometric lower bound of Ricci curvature. Taking this viewpoint, as done in Schied [15], one can regard $L^2(\mu)$ as the tangent space at each point $\mu\in\mathcal{P}(E)$, i.e. $T_\mu\mathcal{P}(E) = L^2(\mu)$, and for any $f,g\in T_\mu\mathcal{P}(E)$ define their inner product by $\langle f,g\rangle_\mu$.
Then the function $x\mapsto\nabla_x u(\mu)$, for a sufficiently regular functional $u$ on $\mathcal{P}(E)$, is a tangent vector at $\mu$, and the carré du champ operator $\Gamma(u,v) = \langle\nabla_\cdot u(\mu),\nabla_\cdot v(\mu)\rangle_\mu$ associated to the Dirichlet form $(\mathcal{E},\mathcal{D}(\mathcal{E}))$ is just the inner product of tangent vectors. Define a mapping This mapping is called the exponential map of the "Riemannian manifold" $\mathcal{P}(E)$ since the map $t\mapsto S_{tf}\mu$ from $\mathbb{R}$ to $\mathcal{P}(E)$ generates a continuous curve in $\mathcal{P}(E)$ and $\frac{d}{dt}\big|_{t=0}$

Regarding the exponential map $S_f$, we should mention the work of Handa [9]. There he characterized when a probability measure $\Pi\in\mathcal{P}(\mathcal{P}(E))$ is reversible w.r.t. a general Fleming-Viot operator through the quasi-invariance property of $\Pi$ under the exponential map $S_f$ for $f\in\mathcal{D}(A)$. Recall that $A$ represents mutation in the Fleming-Viot operator. For more details on this topic we refer to [9].
In this paper, we mainly consider a special case of $E$, namely $E=[0,1]$. In contrast with the Fleming-Viot process on $\mathcal{P}([0,1])$, we define the tangent space at $\mu$ as a subspace of $L^2((g_\mu)_*\mathrm{Leb})$, where $g_\mu$ denotes the cumulative distribution function of $\mu$, $\mathrm{Leb}$ denotes the Lebesgue measure, and $(g_\mu)_*\mathrm{Leb} := \mathrm{Leb}\circ g_\mu^{-1}$ denotes the push forward measure of $\mathrm{Leb}$ under the map $g_\mu:[0,1]\to[0,1]$. We will introduce a new exponential map $S_f$ : . Precisely, let $\mathcal{G}_0$ denote the space of all right continuous nondecreasing maps $g:[0,1)\to[0,1]$. When $g$ is a continuous function, $h_g(x) = h'(g(x))$. When $g$ is of bounded variation, $dh(g(x)) = h_g(x)\,dg(x)$ (see [1, Section 3.10] for the chain rule for functions of bounded variation).
Put $e_{t\varphi}(x) = X(t,x)$; then $e_{t\varphi}(x) = e_\varphi(t,x) = e_{t\varphi}(1,x)$ and $e_{(t+s)\varphi} = e_{t\varphi}\circ e_{s\varphi}$ for $t,s\in\mathbb{R}$ and $x\in[0,1]$. The assumption $\varphi(0)=\varphi(1)=0$ yields that $e_{t\varphi}(0)=0$ and $e_{t\varphi}(1)=1$ for all $t\in\mathbb{R}$. Hence, $e_{t\varphi}$ is a $C^2$-isomorphism in $\mathcal{G}_0$. Set provided the limit exists. The tangent space of $\mathcal{P}([0,1])$ at a point $\mu$ is now defined to be the closure of $\mathcal{C}_0$, the space of directions $\varphi$ with $\varphi(0)=\varphi(1)=0$, in the norm of $L^2((g_\mu)_*\mathrm{Leb})$, and is denoted by $T_\mu$. We say that a function $u$ has a gradient at $\mu$ if there exists a function Define a symmetric bilinear form by for $u,v\in\mathrm{Cyl}$, where $\mathrm{Cyl}$ is defined in (3.4) below. We shall prove that $(\mathcal{E},\mathrm{Cyl})$ is closable, and by classical Dirichlet form theory we obtain a probability measure-valued process. This process is reversible w.r.t. the Ferguson-Dirichlet process $\Pi_{\theta,\nu_0}$. Moreover, we will show that this process satisfies the Log-Sobolev inequality.
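The group property and the fixed endpoints of $e_{t\varphi}$ can be checked numerically. The following is a minimal sketch, assuming that $X(t,x)$ solves $\partial_t X = \varphi(X)$ with $X(0,x)=x$, and using the illustrative direction $\varphi(x)=x(1-x)$ (a hypothetical choice of ours, not taken from the text), for which the flow has the closed form $xe^t/(1-x+xe^t)$. All function names are ours.

```python
import math


def phi(x):
    """Illustrative direction with phi(0) = phi(1) = 0 (our choice, an assumption)."""
    return x * (1.0 - x)


def flow(t, x):
    """Closed-form solution X(t, x) of dX/dt = phi(X), X(0, x) = x."""
    growth = math.exp(t)
    return x * growth / (1.0 - x + x * growth)


def euler_flow(t, x, steps=20000):
    """Explicit Euler integration of the same ODE, used as an independent check."""
    dt = t / steps
    for _ in range(steps):
        x += dt * phi(x)
    return x


if __name__ == "__main__":
    t, s, x = 0.3, 0.5, 0.2
    print("endpoints:", flow(t, 0.0), flow(t, 1.0))
    print("group property gap:", abs(flow(t + s, x) - flow(t, flow(s, x))))
    print("closed form vs Euler:", abs(flow(1.0, x) - euler_flow(1.0, x)))
```

Because the flow is increasing in $x$ and fixes $0$ and $1$, composing it with a distribution function again gives a distribution function, which is what makes it usable as a perturbation of measures on $[0,1]$.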

Quasi-invariance property
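In order to prove that the symmetric bilinear form defined in (2.14) is closable, we first need to consider the quasi-invariance of $\Pi_{\theta,\nu_0}$ under the map $S_f$ for $f\in\mathcal{C}_0$. From now on, for simplicity of notation, we set $\mathcal{P} = \mathcal{P}([0,1])$ and $\Pi_\theta = \Pi_{\theta,\mathrm{Leb}}$ (that is, the Ferguson-Dirichlet measure $\Pi_{\theta,\nu_0}$ with $\nu_0$ the Lebesgue measure). It is known that $\mathcal{P}$ is compact and complete under the weak topology, and that its weak topology coincides with the topology determined by the $L^p$-Wasserstein distance
$$d_{w,p}(\mu,\nu) = \inf_{\pi\in\mathcal{C}(\mu,\nu)}\Big(\int_{[0,1]\times[0,1]}|x-y|^p\,\pi(dx,dy)\Big)^{1/p},\quad p\ge1.$$
Here and in the sequel, $\mathcal{C}(\mu,\nu)$ stands for the collection of all probability measures on $[0,1]\times[0,1]$ with marginals $\mu$ and $\nu$ respectively. Set $d_w = d_{w,2}$. We refer to [23] for these fundamental results on the space of probability measures.
Recall that $\mathcal{G}_0$ denotes the space of all right continuous nondecreasing maps $g:[0,1)\to[0,1]$. Each $g\in\mathcal{G}_0$ can be extended to the full interval by setting $g(1)=1$. $\mathcal{G}_0$ is equipped with the $L^2$-distance $\|g_1-g_2\|_{L^2} = \big(\int_0^1|g_1(t)-g_2(t)|^2\,dt\big)^{1/2}$.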
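On the real line the infimum over couplings is attained by the monotone coupling, so for two empirical measures with the same number of equally weighted atoms the Wasserstein distance reduces to matching sorted samples. The sketch below illustrates this special case only; the function name and the equal-weight restriction are our assumptions.

```python
def wasserstein_p(xs, ys, p=2.0):
    """L^p-Wasserstein distance between two empirical measures on the line.

    Both measures put mass 1/n on each of their n atoms. For p >= 1 the
    monotone (sorted) coupling is optimal in one dimension.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("both samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    pairs = zip(sorted(xs), sorted(ys))  # i-th smallest atom is sent to i-th smallest atom
    cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return cost ** (1.0 / p)


if __name__ == "__main__":
    mu_atoms = [0.2, 0.9, 0.0]
    nu_atoms = [0.5, 0.1, 0.6]
    print("d_{w,1}:", wasserstein_p(mu_atoms, nu_atoms, p=1.0))
    print("d_{w,2}:", wasserstein_p(mu_atoms, nu_atoms, p=2.0))
```

Since the measures live on $[0,1]$, all these distances are bounded by $1$ and are ordered in $p$, which is consistent with their inducing the same topology.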

Definition 3.1
For $\theta>0$, there exists a unique probability measure $Q^\theta_0$ on $\mathcal{G}_0$, called the Dirichlet process, with the property that for each $n\in\mathbb{N}$ and each family $0=t_0<t_1<\ldots<t_n<t_{n+1}=1$,
$$Q^\theta_0\big(g_{t_1}\in dx_1,\ldots,g_{t_n}\in dx_n\big) = \frac{\Gamma(\theta)}{\prod_{i=0}^{n}\Gamma(\theta(t_{i+1}-t_i))}\prod_{i=0}^{n}(x_{i+1}-x_i)^{\theta(t_{i+1}-t_i)-1}\,dx_1\cdots dx_n,$$
with $x_0=0$ and $x_{n+1}=1$. The measure $Q^\theta_0$ is sometimes called the entropy measure, but as in [24], in this paper we reserve the name entropy measure for the push forward measure of $Q^\theta_0$ under the map $g\mapsto g_*\mathrm{Leb}$. Define the map $\zeta:\mathcal{G}_0\to\mathcal{P}$, $g\mapsto dg$. It is easy to see that $(\zeta)_*Q^\theta_0 := Q^\theta_0\circ\zeta^{-1} = \Pi_\theta$. Its inverse $\zeta^{-1}$ assigns to each probability measure its distribution function. In the following we study the quasi-invariance property of $\Pi_{\theta,\nu_0}$ through $Q^\theta_0$ and $\zeta$.

Von Renesse and Sturm [24] studied the quasi-invariance property of $Q^\theta_0$ on $\mathcal{G}_0$. Under the map $\chi:\mathcal{G}_0\to\mathcal{P}$, $g\mapsto g_*\mathrm{Leb}$, $Q^\theta_0$ is pushed forward to a probability measure on $\mathcal{P}$, which is called the entropy measure there. Then, through Dirichlet form theory, a stochastic process is constructed on $\mathcal{P}$. Since the intrinsic metric of this Dirichlet form is just the $L^2$-Wasserstein distance on $\mathcal{P}$, this process is usually called the Wasserstein diffusion. Our present work also depends on the knowledge of $Q^\theta_0$. Let us recall the quasi-invariance property of $Q^\theta_0$.
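To make the finite-dimensional description concrete, the sketch below samples $(g(t_1),\ldots,g(t_n))$ under the assumption that the increments $g(t_{i+1})-g(t_i)$ are jointly Dirichlet distributed with parameters $\theta(t_{i+1}-t_i)$, which is the standard form of the Dirichlet process with Lebesgue base measure. It uses the classical fact that normalised independent Gamma variables are Dirichlet distributed. The function name and parameter values are ours.

```python
import random


def sample_dirichlet_path(theta, grid, rng):
    """Sample (g(t_1), ..., g(t_n)) for a grid 0 < t_1 < ... < t_n < 1.

    Assumes the increments of g over the grid are Dirichlet distributed with
    parameters theta * (t_{i+1} - t_i), where t_0 = 0 and t_{n+1} = 1.
    """
    knots = [0.0] + list(grid) + [1.0]
    if any(b <= a for a, b in zip(knots, knots[1:])):
        raise ValueError("grid must be strictly increasing inside (0, 1)")
    # Independent Gamma(theta * dt, 1) variables, once normalised, are Dirichlet.
    gammas = [rng.gammavariate(theta * (b - a), 1.0) for a, b in zip(knots, knots[1:])]
    total = sum(gammas)
    values, running = [], 0.0
    for weight in gammas[:-1]:
        running += weight / total
        values.append(min(running, 1.0))  # guard against rounding above 1
    return values


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_dirichlet_path(2.0, [0.1, 0.4, 0.7], rng))
```

Each sample is a nondecreasing vector in $[0,1]$, i.e. the values of some element of $\mathcal{G}_0$ on the grid; under the stated assumption the mean of $g(t)$ is $t$, which a Monte Carlo average reproduces.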

Theorem 3.2 ([24] Theorem 4.3 ) Each C
and $Y^\theta_{h,0}$ is bounded from above and below.

Due to the compactness of the interval $[0,1]$ and of $\mathcal{P}$, several well-known topologies on $\mathcal{G}_0$ and $\mathcal{P}$ coincide. More precisely, for each sequence $(g_n)\subset\mathcal{G}_0$ and each $g\in\mathcal{G}_0$, the following types of convergence are equivalent:
• $g_n(t)\to g(t)$ for each $t\in[0,1]$ at which $g$ is continuous;
• $g_n\to g$ in $L^p([0,1])$ for each $p\ge1$;
• $\mu_{g_n}\to\mu_g$ weakly;
• $\mu_{g_n}\to\mu_g$ in the $L^p$-Wasserstein distance for each $p\ge1$.
Refer to [24] for a sketch of the idea of the argument.

Lemma 3.3 For each C
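Proof. Let $\mu_n\in\mathcal{P}$, $n\ge1$, converge weakly to $\mu$ as $n\to\infty$, and let $g_{\mu_n}$ and $g_\mu$ denote the corresponding cumulative distribution functions of $\mu_n$ and $\mu$. Then $g_{\mu_n}$ converges to $g_\mu$ in $L^p([0,1])$ for each $p\ge1$. Set $\nu_n = \tau_h(\mu_n)$ and $\nu = \tau_h(\mu)$. Then the cumulative distribution functions of $\nu_n$ and $\nu$ are $h\circ g_{\mu_n}$ and $h\circ g_\mu$ respectively. We denote by $d_{w,1}$ the $L^1$-Wasserstein distance, that is, $d_{w,1}(\mu,\nu) = \inf_{\pi\in\mathcal{C}(\mu,\nu)}\int|x-y|\,\pi(dx,dy)$. Then, according to [22, Theorem 2.18] on optimal transport on $\mathbb{R}$, we have
$$d_{w,1}(\nu_n,\nu) = \int_0^1\big|h(g_{\mu_n}(x))-h(g_\mu(x))\big|\,dx \le \|h'\|_\infty\int_0^1\big|g_{\mu_n}(x)-g_\mu(x)\big|\,dx,$$
which yields that as $\mu_n$ converges weakly to $\mu$, $\nu_n$ converges weakly to $\nu$ as well. This is the desired result.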
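The continuity argument can be illustrated on discrete measures. The sketch below uses two facts: the measure $\tau_h(\mu)$ has distribution function $h\circ g_\mu$ (as stated in the proof), and on the line $d_{w,1}$ equals the integral of the absolute difference of the distribution functions. The isomorphism $h$ is a hypothetical choice of ours, the time-$1$ flow of $\dot X = X(1-X)$, whose derivative is bounded by $e$. All names are ours.

```python
import math


def h(x, t=1.0):
    """Illustrative C^2-isomorphism of [0, 1]: time-t flow of dX/dt = X(1 - X)."""
    growth = math.exp(t)
    return x * growth / (1.0 - x + x * growth)


def cdf_at(atoms, weights, x):
    """Distribution function of a discrete measure, evaluated at x."""
    return sum(w for a, w in zip(atoms, weights) if a <= x)


def push_forward(atoms, weights, iso=h):
    """Discrete measure whose distribution function is iso composed with the original one."""
    pairs = sorted(zip(atoms, weights))
    new_weights, cumulative, previous = [], 0.0, iso(0.0)
    for _, weight in pairs:
        cumulative = min(cumulative + weight, 1.0)
        current = iso(cumulative)
        new_weights.append(current - previous)  # jump of iso(g) at this atom
        previous = current
    return [a for a, _ in pairs], new_weights


def w1_distance(atoms_a, weights_a, atoms_b, weights_b):
    """L^1-Wasserstein distance on the line as the integral of |F - G|."""
    points = sorted(set(atoms_a) | set(atoms_b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        gap = abs(cdf_at(atoms_a, weights_a, left) - cdf_at(atoms_b, weights_b, left))
        total += gap * (right - left)  # both step functions are constant on [left, right)
    return total


if __name__ == "__main__":
    mu = ([0.2, 0.7], [0.5, 0.5])
    nu = ([0.3, 0.7], [0.25, 0.75])
    before = w1_distance(*mu, *nu)
    after = w1_distance(*push_forward(*mu), *push_forward(*nu))
    print("before:", before, "after:", after, "Lipschitz bound:", math.e * before)
```

The distance after applying $\tau_h$ never exceeds the Lipschitz constant of $h$ times the distance before, which is the mechanism behind the continuity of $\tau_h$.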
In particular, $\tau_h$ is measurable from $\mathcal{P}$ to $\mathcal{P}$ for each $C^2$-isomorphism $h\in\mathcal{G}_0$.
where $Y^\theta_{h,0}(g_\mu)$ is defined as in (3.1), and $g_\mu$ denotes the cumulative distribution function of $\mu$.
Proof. Any bounded measurable function $u$ on $\mathcal{P}$ induces a bounded measurable function $\bar u$ on $\mathcal{G}_0$ by $\bar u(g) := u(\zeta(g))$.

Integration by parts formula
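In Section 2, we defined the map $e_{t\varphi}:[0,1]\to[0,1]$ and the derivative of a function $u:\mathcal{P}\to\mathbb{R}$ along $\varphi\in\mathcal{C}_0$, provided the limit exists. Let $\mathrm{Cyl}$ be the set of all functions on $\mathcal{P}$ of the form
$$u(\mu) = F\big(\langle f_1,\mu\rangle,\ldots,\langle f_n,\mu\rangle\big) \qquad (3.4)$$
where $F\in C^1(\mathbb{R}^n)$, $f_i\in C^1([0,1])$ for $i=1,\ldots,n$ and $n\in\mathbb{N}$. Let $\mathrm{Cyl}(\mathcal{G}_0)$ be the set of all functions $w:\mathcal{G}_0\to\mathbb{R}$ of the form
$$w(g) = F\Big(\int f_1\,dg,\ldots,\int f_n\,dg\Big) \qquad (3.5)$$
where $F\in C^1(\mathbb{R}^n)$, $f_i\in C^1([0,1])$ and $n\in\mathbb{N}$. For a function $w:\mathcal{G}_0\to\mathbb{R}$, define its directional derivative along $\varphi\in\mathcal{C}_0$ by
$$D_\varphi w(g) = \lim_{t\to0}\frac{w(e_{t\varphi}\circ g)-w(g)}{t},$$
provided the limit exists.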
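As a sanity check of the directional derivative on cylindrical functions, the sketch below compares a finite difference with the chain-rule value for a step function $g$. It assumes the derivative is taken along $t\mapsto e_{t\varphi}\circ g$, and it uses the illustrative direction $\varphi(x)=x(1-x)$ with its closed-form flow; $F$, the $f_i$ and all names are hypothetical choices of ours. For a step function with jumps at the atoms $a_j$ and cumulative heights $G_j$, one has $\int f\,d(h\circ g) = \sum_j f(a_j)\,(h(G_j)-h(G_{j-1}))$.

```python
import math


def phi(x):
    """Illustrative direction with phi(0) = phi(1) = 0 (our choice, an assumption)."""
    return x * (1.0 - x)


def flow(t, x):
    """Time-t flow of dX/dt = phi(X); it fixes 0 and 1."""
    growth = math.exp(t)
    return x * growth / (1.0 - x + x * growth)


def integrals(fs, atoms, weights, transform):
    """Sum of f(a_j) times the increments of transform(g) for the step function g."""
    out = [0.0] * len(fs)
    cumulative, previous = 0.0, transform(0.0)
    for atom, weight in sorted(zip(atoms, weights)):
        cumulative = min(cumulative + weight, 1.0)
        current = transform(cumulative)
        for k, f in enumerate(fs):
            out[k] += f(atom) * (current - previous)
        previous = current
    return out


def finite_difference(F, fs, atoms, weights, step=1e-4):
    """Central difference of t -> F(int f_1 d(e_t o g), ...) at t = 0."""
    plus = F(*integrals(fs, atoms, weights, lambda x: flow(step, x)))
    minus = F(*integrals(fs, atoms, weights, lambda x: flow(-step, x)))
    return (plus - minus) / (2.0 * step)


def chain_rule_value(grad_F, fs, atoms, weights):
    """Sum over k of dF/dy_k times sum_j f_k(a_j) * (phi(G_j) - phi(G_{j-1}))."""
    y = integrals(fs, atoms, weights, lambda x: x)
    dy = integrals(fs, atoms, weights, phi)
    return sum(p * d for p, d in zip(grad_F(*y), dy))


if __name__ == "__main__":
    F = lambda a, b: a * b
    grad_F = lambda a, b: (b, a)
    fs = [lambda x: x, lambda x: x * x]
    atoms, weights = [0.1, 0.5, 0.9], [0.2, 0.3, 0.5]
    print(finite_difference(F, fs, atoms, weights))
    print(chain_rule_value(grad_F, fs, atoms, weights))
```

The two printed numbers agree up to the discretisation error of the central difference, which shows that the derivative depends on $\varphi$ only through its values at the heights of $g$.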

Proof. (i) By virtue of integration by parts formula on
Using the chain rule for functions of bounded variation in Vol'pert average form (cf. [1, Section 3.10]),
(ii) Define $\bar u(g) = u\circ\zeta(g)$; then $\bar u$ is of the form $\bar u(g) = F\big(\int f_1\,dg,\ldots,\int f_n\,dg\big)$.
which concludes the proof.
Before stating the integration by parts formula, we recall a result on the derivative of $g\mapsto Y^\theta_{h,0}(g)$ appearing in the quasi-invariance formula. According to [24,Lemma 5.7

Theorem 3.6 (Integration by parts formula) (i) For each
Proof. We prove only (i); (ii) can be proved by the same method as in the previous lemma. By the quasi-invariance of $Q^\theta_0$, one has which concludes (i).

Tangent space and Dirichlet form
The goal of this subsection is to obtain our stochastic process by establishing a Dirichlet form on $\mathcal{P}$. The key point is how to specify a suitable pre-Hilbert norm $\|\cdot\|_\mu$ on $T_\mu$ such that the directional derivative $D_\varphi u(\mu)$ of a nice function $u$ on $\mathcal{P}$ determines a bounded linear functional $\varphi\mapsto D_\varphi u(\mu)$ on $\mathcal{C}_0$. Then the Riesz representation theorem yields a unique element of $T_\mu$ representing this functional. Here $T_\mu$ is the completion of $\mathcal{C}_0$ w.r.t. the pre-Hilbert norm $\|\cdot\|_\mu$.
Naturally, the nice functions are the cylindrical functions. Let $u\in\mathrm{Cyl}$ be of the form (3.4). For $\varphi\in\mathcal{C}_0$, where $\langle\mathbf{f},\mu\rangle = (\langle f_1,\mu\rangle,\ldots,\langle f_n,\mu\rangle)$. According to this expression, we choose the pre-Hilbert norm to be the norm of $L^2((g_\mu)_*\mathrm{Leb})$ at $\mu\in\mathcal{P}$. Then $\varphi\mapsto D_\varphi u(\mu)$ is a bounded linear functional on $\mathcal{C}_0$, so it can be extended to a bounded linear functional on the completion of $\mathcal{C}_0$ w.r.t. the norm of $L^2((g_\mu)_*\mathrm{Leb})$. Therefore, the tangent space $T_\mu$ is defined to be the completion of $\mathcal{C}_0$ w.r.t. the norm of $L^2((g_\mu)_*\mathrm{Leb})$.

Definition 3.7
The gradient of a function $u:\mathcal{P}\to\mathbb{R}$ is said to exist at $\mu\in\mathcal{P}$ if there exists a function $\psi:[0,1]\to\mathbb{R}$ such that for any $\varphi\in\mathcal{C}_0$, Then the gradient $\psi(\cdot)$ at $\mu$ is denoted by $\nabla_\cdot u(\mu)$.
Similarly, for a function $u:\mathcal{G}_0\to\mathbb{R}$, if there exists a function $\psi$ : then its gradient is said to exist at $g\in\mathcal{G}_0$ and is denoted by $\nabla_\cdot u(g)$.
Due to the calculation (3.14), we know that for a cylindrical function $u$ of the form (3.4), its gradient exists at every $\mu\in\mathcal{P}$ and Next, we define a symmetric bilinear form by . Let $\mathrm{Cyl}_0$ be the set of all cylindrical functions in the form . . , n, and $n\in\mathbb{N}$. (iii) The generator $(\mathcal{L},\mathcal{D}(\mathcal{L}))$ of $(\mathcal{E},\mathcal{D}(\mathcal{E}))$ is the Friedrichs extension of the operator $(\mathcal{L},\mathrm{Cyl})$ given by
According to the theory of Dirichlet forms (see [12]), any regular Dirichlet form possessing the local property, i.e. $\mathcal{E}(u,v)=0$ for all $u,v\in\mathcal{D}(\mathcal{E})$ with $\mathrm{supp}[u]\cap\mathrm{supp}[v]=\emptyset$, where $\mathrm{supp}[u]$ denotes the support of $u$, admits a diffusion process, that is, a strong Markov process whose sample paths are continuous with probability 1. It is clear that the Dirichlet form $(\mathcal{E},\mathcal{D}(\mathcal{E}))$ possesses the local property, so it admits a diffusion process. Note that here the sample paths are continuous in the weak topology of $\mathcal{P}$. Is this process still continuous in the topology determined by the total variation norm? We believe the answer is negative, but we cannot prove it at present.

Remark 3.9
In the argument of Theorem 3.8, we know that the operator $\mathcal{L}$ is symmetric w.r.t. $\Pi_\theta$, i.e.
Our next goal is to describe the intrinsic metric associated to our Dirichlet form. In [24], the authors constructed a Dirichlet form on $\mathcal{P}$ whose intrinsic metric is the $L^2$-Wasserstein distance on $\mathcal{P}$. This result is based on the Rademacher theorem ([24, Theorem 7.9, Theorem 7.11]) on the space $\mathcal{G}_0$, where they considered different kinds of cylindrical functions. Using their idea, we can establish the Rademacher theorem in our setting on $\mathcal{G}_0$.

Proposition 3.10 It holds that

Sketch of the proof. Note that $\mathrm{Cyl}(\mathcal{G}_0)$ can be proved to be a core of a Dirichlet form on $\mathcal{G}_0$ by the same method used to construct $(\mathcal{E},\mathcal{D}(\mathcal{E}))$ on $\mathcal{P}$, so (3.18) equals the intrinsic metric of this Dirichlet form on $\mathcal{G}_0$. Following almost the same argument as in [24, Theorems 7.9 and 7.11], we can establish the Rademacher theorem; this proposition then corresponds to [24, Corollary 7.14].

Log-Sobolev inequalities for the process
In this section, we establish the Log-Sobolev inequality for $(\mathcal{E},\mathcal{D}(\mathcal{E}))$. Döring and Stannat [3] established the Log-Sobolev inequality for the Wasserstein diffusion constructed in [24] on $\mathcal{P}$. There they carried out extensive calculations on finite dimensional approximations; in particular, they established the Log-Sobolev inequality for the Dirichlet distribution on the finite dimensional simplex. We take advantage of their work to establish the Log-Sobolev inequality for $(\mathcal{E},\mathcal{D}(\mathcal{E}))$ on $\mathcal{P}$, again by finite dimensional approximation. To deal with our Dirichlet form, however, we must take a different approximation sequence.
We now include a result proved in [3, Proposition 2.3] as follows.

Proposition 4.1 Let $q\in\mathbb{R}^n_+$ and $q_* = \min_{1\le i\le n}q_i$. Then satisfies the Log-Sobolev inequality with constant less than $4c_1/q_*$, that is, Here $c_1$ can be taken to be 160.

Proof. We use the idea of [3] to establish the Log-Sobolev inequality through finite dimensional approximation but, compared with [3], with a different type of approximation sequence. In order to use the calculations of [3], we first carry out our calculations on $\mathcal{G}_0$ and then obtain the desired results with the help of the map $\zeta$.

Remark 4.3
It is well known that inequality (4.2) implies that the semigroup $(e^{t\mathcal{L}})_{t\ge0}$ is hypercontractive, i.e. $\|e^{t\mathcal{L}}\|_{2\to4}\le1$ for some $t>0$, where $\|\cdot\|_{2\to4}$ denotes the operator norm from $L^2(\Pi_\theta)$ to $L^4(\Pi_\theta)$. The contraction properties of Markov semigroups, including hypercontractivity, supercontractivity and ultracontractivity, are closely related to various functional inequalities. To be more precise, given a complete, connected, noncompact Riemannian manifold $M$, let $(P_t)_{t\ge0}$ be the semigroup of a diffusion process generated by $L=\Delta+Z$ for some $C^1$-vector field $Z$ satisfying $\mathrm{Ric}(X,X)-\langle\nabla_XZ,X\rangle\ge-K_Z|X|^2$, $X\in TM$, for some $K_Z\in\mathbb{R}$, where $\mathrm{Ric}$ denotes the Ricci curvature of $M$. Assume that $P_t$ admits an invariant measure $\mu$ which is positive and Radon. Then, according to [25, Theorem 5.7.1], if there exist $C,t>0$ and $q>p>1$ such that $\|P_t\|_{p\to q}\le C$, then for some positive constant $\beta_{K_Z,p,q}$. Conversely, if there exist $C_1,C_2>0$ such that for $t>0$ and $q>p>1$ satisfying $\exp(4t/C_1)\ge(q-1)/(p-1)$. We refer to Wang's book [25] for more properties associated with Log-Sobolev inequalities and for a general discussion of functional inequalities, including $F$-Sobolev inequalities, Harnack inequalities, and super and weak Poincaré inequalities.

Remark 4.4
In [19], the same Dirichlet form and Log-Sobolev inequality as in this work were obtained, but the generator and the intrinsic distance of the Dirichlet form were not given there. Moreover, the explicit form of the intrinsic metric enables us to establish transportation cost inequalities for Wasserstein diffusions, which will be done in our forthcoming work [16].