The Posterior metric and the Goodness of Gibbsianness for transforms of Gibbs measures

We present a general method to derive continuity estimates for conditional probabilities of general (possibly continuous) spin models subjected to local transformations. Such systems arise in the study of a stochastic time-evolution of Gibbs measures or as noisy observations. We exhibit the minimal necessary structure for such double-layer systems. Assuming no a priori metric on the local state spaces, we define the posterior metric on the local image space. We show that it allows us in a natural way to separate the local part of the continuity estimates from the spatial part (which is treated by Dobrushin uniqueness here). We show in the concrete example of the time evolution of rotators on the (q−1)-dimensional sphere how this method can be used to obtain estimates in terms of the familiar Euclidean metric.


Introduction
The absence or presence of phase transitions lies at the heart of mathematical statistical mechanics of equilibrium systems. A phase transition in an order parameter that can be directly observed is of obvious interest for the system under investigation. Moreover, the presence or absence of phase transitions is sometimes linked in a more subtle way to the properties of the system under investigation. In fact, it is understood that "hidden phase transitions" in an internal system that is not directly observable are responsible for the failure of the Gibbs property for a variety of important measures that appear as transforms of various Gibbs measures. For the mechanisms by which a measure becomes non-Gibbs, and for background on renormalization-group-type pathologies and beyond, see the reviews [11; 8; 5].
The first part of the analysis of an interacting system begins with an understanding of the "weak coupling regime" and proving results based on absence of phase transitions when the system variables behave as a perturbation of independent ones. There is a variety of competing ways at our disposal to do so, giving related but usually not equivalent results, notably Dobrushin's uniqueness theory [14; 1], expansion methods, and percolation and coupling methods.
When it works, Dobrushin uniqueness has a lot of advantages, being not very technical, but very general, requiring little explicit knowledge of the system and providing explicit estimates on decay of correlations. Moreover, it implies useful properties generalizing those of independent variables. As an example of such a useful property we mention Gaussian concentration estimates of functions of the system variables which are obtained as a corollary when there is an estimate on the Dobrushin interaction matrix available [6; 7]. Especially when we are talking about continuous spin systems a Dobrushin uniqueness approach seems favorable, since cluster expansions are often applicable only with some technical effort [19; 12], and percolation and coupling are not directly available.
Recently there has been particular interest in the loss and recovery of the Gibbs property of an initial Gibbs measure under a stochastic time-evolution. The study started in [4], where the authors focused on the evolution of a Gibbs measure of an Ising model under high-temperature spin-flip Glauber dynamics. The main phenomenon observed therein was the loss of the Gibbs property after a certain transition time when the system was started at an initial low-temperature state. The measure stays non-Gibbs forever when the initial external field is zero. More complicated transition phenomena between Gibbs and non-Gibbs are possible at intermediate times when there is no spin-flip symmetry: the Gibbs property is recovered at large but finite values of time in the presence of a non-vanishing external magnetic field in the initial measure. A complete analysis of the corresponding Ising mean-field system in zero magnetic field was given in [3], where the authors analyzed the time-temperature dynamic phase diagram describing the Gibbs non-Gibbs transitions. In that analysis the phenomenon of symmetry breaking in the set of bad configurations was also detected, meaning that a bad configuration appears whose spatial average does not preserve the spin-flip symmetry of the model.
What remains of these phenomena for continuous spins? The case of site-wise independent diffusions of continuous spins on the lattice starting from the Gibbs measure of a special double-well potential was considered in [10]. It was shown therein that a similar loss of Gibbsianness will occur if the initial double-well potential is deep enough. In contrast to the Ising model, this loss is however a loss without recovery, so the measure stays non-Gibbs for all sufficiently large times. This is due to the unbounded nature of the spins. Short-time Gibbsianness is proved to hold also in this model. While these results hold for a continuous spin model, the method of proof is nevertheless based on the investigation of a "hidden discrete model", exploiting the particular form of the Gibbs potential. In [17] the authors studied models for compact spins, namely the planar rotator models on the circle subjected to diffusive time-evolution. It is shown therein that, starting with an initial low-temperature Gibbs measure, the time-evolved measure obtained for infinite- or high-temperature dynamics stays Gibbs for short times, and that for an initial infinite- or high-temperature Gibbs measure under infinite- or high-temperature dynamics the time-evolved measure stays Gibbs forever. Their analysis uses the machinery of cluster expansions, as earlier developed in [22]. Even earlier it was shown that the whole process of space-time histories can be viewed as a Gibbs measure [21]. This is interesting in itself, but does not imply that fixed-time projections are Gibbs.
Short-time Gibbsianness in all these models follows from uniqueness of a hidden or internal system. While this is expected to hold very generally, results that are not restricted to particular models appear only for discrete spin systems [9]. The present paper now narrows the gap. It provides a proof of the preservation of the Gibbs property of the time-evolved Gibbs measures of a general continuous spin system under site-wise independent dynamics, for short times, even when the initial measure is in the strong coupling regime. Intuitively speaking, strong couplings offer the possibility for a phase transition not only in the initial system but also in the internal system, which will however be suppressed for small times, but usually only for small times. Small couplings offer no possibility for a phase transition, and it is then much easier to show the preservation of Gibbsianness for all times.
More generally than for time-evolution, we prove our results directly for general two layer systems, consisting of (1) a Gibbs measure in the first layer, that is (2) subjected to local transition kernels mapping the first-layer variables to second-layer variables. This generalizes the notion of a hidden Markov model where the second layer plays the role of a noisy observation. Such models have motivations in a variety of fields. Let us mention for example that they appear in biology as models of gene regulatory networks where the vertices of the network are genes and the variables model gene expression activities.
A measure is Gibbsian when the single-site conditional probabilities depend on the conditioning in an essentially local way. Our main statement (Theorem 2.5) is an explicit upper bound on the continuity of the single-site conditional probabilities of the second-layer system as a function of the conditioning. This is valid when the transition kernels don't fluctuate too much, even when the first-layer system is in a strong coupling regime. Our result holds for discrete or continuous compact state spaces and general interactions and is based on Dobrushin uniqueness. To formulate the resulting continuity estimate for the conditional probabilities we don't need any a priori metric structure on the local spin spaces: the natural metric on the second-layer single-spin space is created by the variational distance between the a-priori measures in the first layer that are obtained by conditioning on second-layer configurations (see Theorem 2.5).
On the way to this result, we exhibit a simple criterion for Dobrushin uniqueness for Gibbs measures (of one layer). It is easy to check and can be of use beyond the study of (non-)Gibbsianness.
Intuitively, it demands that the sum over the interaction terms in the Hamiltonian coupling the sites i and j should not fluctuate too much when it is viewed as a random variable at the site i under the corresponding local a priori measure (see Definition 2.2). So even when one has a large interaction, better concentration properties of the a priori measures can still imply an overall small Dobrushin constant. This is a generalization of the simple large-field criterion ensuring Dobrushin-uniqueness in the Ising model (see p.147 example 8.13 of [1] and [20]) to general spaces (Theorem 2.3).
The criterion we need for the study of the second-layer model is based on the description of the interplay between the possible largeness of the initial interaction and the strength of the coupling to the second layer found in Theorem 2.3 (when the initial a priori measures are replaced with conditional a priori measures). To ensure Gibbsianness of the second-layer model, we thus need small fluctuations of the initial Hamiltonian w.r.t. the a-priori measures in the first layer that are obtained by conditioning on second-layer configurations. The estimates on the spatial memory of the second-layer single-site conditional probabilities follow naturally by invoking Dobrushin-uniqueness estimates on the comparison of Gibbs measures with perturbed specifications and chain-rule type arguments.
To illustrate the simplicity of our approach to get explicit estimates on the spatial decay we prove short-time Gibbsianness of a model of (q − 1)-dimensional rotators for general q ≥ 2 under diffusive time-evolution on the (q − 1)-spheres, and provide an explicit estimate on the time-interval for which the time-evolved measure stays Gibbs. This will be supplemented by arguments that are more specific to the rotators which give us precise continuity estimates in terms of the Euclidean distances on the spheres. As another application of our approach we show Gibbsianness of an initial system (with Lipschitz continuous Hamiltonian) subjected to local coarse-grainings. Here the transformed system will be Gibbs if the local coarse-graining is fine enough.
The rest of the paper is organized as follows: In Section 2 we formulate our main results. From Sections 3 to 5 we present the proofs of the main results and state some further results that are also of interest in themselves.

Preliminaries and definitions
Before we formulate our results we recall some generalities on Gibbs measures for spin systems ([1; 11; 14]). Let $G$ be a countable vertex set (for example $G = \mathbb{Z}^d$, $d \ge 1$) and denote by $\mathcal{S}$ the set of finite subsets of $G$. We write $\Lambda^c := G \setminus \Lambda$ for any subset $\Lambda$ of $G$, and whenever $\Lambda = \{i\}$ we shall in the sequel write $i^c$ instead of $\Lambda^c$. We denote by $\sigma = (\sigma_i)_{i\in G}$ the spin variables, where the $\sigma_i$'s take values in a standard Borel space $S$ (i.e. a measurable space with a metric which turns it into a complete separable metric space equipped with the Borel $\sigma$-algebra generated by the metric) equipped with an a-priori Borel probability measure $\alpha$ (single-spin space). In our general set-up we don't need to make a metric structure on $S$ explicit. We further denote by $\Omega := S^G$ the configuration space of our system equipped with the product Borel $\sigma$-algebra $\mathcal{F}$. We write $\mathcal{F}_\Lambda$ for the sub-$\sigma$-algebra generated by the $\sigma_i$'s for $i \in \Lambda \subset G$. A real-valued function defined on the configuration space is said to be local if it is measurable w.r.t. $\mathcal{F}_\Lambda$ for some finite subset $\Lambda \subset G$, and is said to be quasilocal if it is a uniform limit of local functions.
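A specification is said to be quasilocal if its application to any quasilocal function $f$ on $\Omega$, denoted by $\gamma_\Lambda(f)$, is again a quasilocal function. A specification is said to be uniformly nonnull if, for each $\Lambda \in \mathcal{S}$, there exist constants $0 < a_\Lambda \le b_\Lambda < \infty$ such that
$$a_\Lambda\, \alpha_\Lambda(A\,|\,\omega) \le \gamma_\Lambda(A\,|\,\omega) \le b_\Lambda\, \alpha_\Lambda(A\,|\,\omega)$$
for all $\omega \in \Omega$ and $A \in \mathcal{F}_\Lambda$. Here $\alpha_\Lambda$ is the free specification obtained by putting a fixed boundary condition outside $\Lambda$ and integrating over the product measure inside $\Lambda$, that is,
$$\alpha_\Lambda(A\,|\,\omega) = \int \prod_{i\in\Lambda} \alpha(d\sigma_i)\, 1_A(\sigma_\Lambda \omega_{\Lambda^c}).$$
A specification is called Gibbsian if it is quasilocal and uniformly nonnull.
Given a Gibbsian specification $\gamma$ we say that a probability measure $\mu$ on $\Omega$ is a Gibbs measure for $\gamma$ whenever, for each $\Lambda \in \mathcal{S}$,
$$\int \gamma_\Lambda(d\sigma\,|\,\omega)\, \mu(d\omega) = \mu(d\sigma).$$
The above equation is called the DLR-equation. We denote by $\mathcal{G}(\gamma)$ the set of Gibbs measures for $\gamma$.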
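Often a specification will not just be abstractly defined, but given in terms of an interaction, or a Hamiltonian. The interaction among the components of the systems we shall consider is a family $\Phi = (\Phi_A)_{A\in\mathcal{S}}$ of local functions $\Phi_A$ (called interaction potential) satisfying the following summability condition:
$$|\!|\!|\Phi|\!|\!| := \sup_{i\in G} \sum_{A\in\mathcal{S}:\, A\ni i} |A|\, \|\Phi_A\|_\infty < \infty. \qquad (2)$$
Condition (2) implies the most frequently found condition of absolute summability, i.e.
$$\sup_{i\in G} \sum_{A\in\mathcal{S}:\, A\ni i} \|\Phi_A\|_\infty < \infty.$$
Given an interaction $\Phi$ and a fixed configuration $\omega \in \Omega$, we introduce for each $\Lambda \in \mathcal{S}$
$$H_\Lambda(\sigma_\Lambda \omega_{\Lambda^c}) = \sum_{A\in\mathcal{S}:\, A\cap\Lambda \neq \emptyset} \Phi_A(\sigma_\Lambda \omega_{\Lambda^c})$$
as the finite-volume Hamiltonian with $\omega$ as boundary condition. For a given absolutely summable interaction $\Phi$ we denote by $\gamma \equiv \gamma^{\beta\Phi}$ the Gibbsian specification (see [1]) given by
$$\gamma^{\beta\Phi}_\Lambda(d\sigma_\Lambda\,|\,\omega) = \frac{\exp\big(-\beta H_\Lambda(\sigma_\Lambda \omega_{\Lambda^c})\big)}{Z_\Lambda(\omega)} \prod_{i\in\Lambda} \alpha(d\sigma_i), \qquad (5)$$
where $Z_\Lambda(\omega)$ is the normalization constant and $\beta$ is the inverse temperature. In the present paper $\beta$ will always be absorbed into the interaction.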
While it is important to know that a specification is quasilocal, we are aiming in this paper at statements refining the quasilocality, which describe the dependence of the single-site conditional probabilities γ i on variations of the conditioning in more detail. To quantify the continuity of a specification we employ the following notion.
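Definition 2.1. Assume that $d$ is a metric on the single-spin space $S$, and assume that $Q = (Q_{i,j})_{i,j\in G}$ is a non-negative matrix with $\sup_{i\in G} \sum_{j\in G} Q_{i,j} = \|Q\|_\infty < \infty$.
A specification $\gamma$ is called a Gibbsian specification of goodness $(Q, d)$ if it is quasilocal, uniformly non-null, and the single-site kernels satisfy the continuity estimates
$$\big\| \gamma_i(d\eta_i\,|\,\eta_{i^c}) - \gamma_i(d\eta_i\,|\,\bar\eta_{i^c}) \big\| \le \sum_{j\in i^c} Q_{i,j}\, d(\eta_j, \bar\eta_j).$$
Here
$$\|\nu_1 - \nu_2\| := \sup_{f:\, 0 \le f \le 1} |\nu_1(f) - \nu_2(f)| = \frac{1}{2}\, \lambda(|h_1 - h_2|)$$
whenever $\nu_1$ and $\nu_2$ are probability measures that are absolutely continuous with respect to the measure $\lambda$ with $\lambda$-densities $h_1$ and $h_2$ respectively (i.e. $\|\nu_1 - \nu_2\|$ is one half of the variational distance between $\nu_1$ and $\nu_2$), and the supremum is over measurable observables $f$ on $\Omega$ with $0 \le f \le 1$.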
The faster the decay of Q is, the faster the decay of conditional probabilities on variations of the conditioning is, and the "better" or the "more Gibbsian" the system of conditional probabilities is.
We are restricting our attention to single-site γ i 's since all γ Λ for finite Λ can be expressed by an explicit formula in terms of the γ i 's with i ∈ Λ. For the solution of this "reconstruction-problem" see [1; 23].
In particular we denote by $C = (C_{ij})_{i,j\in G}$ the Dobrushin interdependence matrix [1; 14], with entries given by
$$C_{ij} = \sup_{\zeta, \bar\zeta \in \Omega:\ \zeta_{j^c} = \bar\zeta_{j^c}} \big\| \gamma_i(\cdot\,|\,\zeta) - \gamma_i(\cdot\,|\,\bar\zeta) \big\|.$$
This means that $C$ is the matrix with smallest matrix elements for which the specification $\gamma$ is of the goodness $(C, d)$, where $d$ is the discrete metric on $S$ given by $d(\eta_j, \eta'_j) = 1_{\eta_j \neq \eta'_j}$. The corresponding Dobrushin constant is given as
$$c = \sup_{i\in G} \sum_{j\in G} C_{ij}.$$
We recall that whenever $c < 1$ (Dobrushin uniqueness condition) and $\gamma$ is quasilocal, then $\gamma$ admits at most one Gibbs measure [14; 1]. However, if the single-spin space $S$ is standard Borel (as in our case), then the above condition implies that $\gamma$ admits a unique Gibbs measure (see Theorem 8.7 of [1]). It is known that for a potential $\Phi$ satisfying (2) there is a sufficiently small $\beta$ such that $\beta\Phi$ satisfies Dobrushin uniqueness and the measure is in a weak coupling regime, where the measure is a small perturbation of a product measure.

A bound on the Dobrushin constant for concentrated a priori measures
For our purposes we employ the following definition.
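Definition 2.2. For a function $F : \Omega \to \mathbb{R}$ we define the $\alpha; i, j$-linear deviation $\mathrm{dev}_{\alpha;i,j}$ of $F$ to be
$$\mathrm{dev}_{\alpha;i,j}(F) := \sup_{\zeta, \bar\zeta \in \Omega:\ \zeta_{j^c} = \bar\zeta_{j^c}}\ \inf_{B\in\mathbb{R}} \int \alpha(d\sigma_i)\, \big| F(\sigma_i \zeta_{i^c}) - F(\sigma_i \bar\zeta_{i^c}) - B \big|.$$
This quantity is the worst-case linear deviation of the variation of $F$ at the site $j$ viewed as a random variable w.r.t. $\sigma_i$ under $\alpha(d\sigma_i)$. Note that the linear deviation is bounded by $\delta_j(F)$, the $j$th oscillation of $F$, which is bounded by the oscillation of $F$, i.e.
$$\mathrm{dev}_{\alpha;i,j}(F) \le \delta_j(F) \le \delta(F).$$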
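To make the linear deviation concrete, the following minimal Python sketch evaluates it for a two-point spin space. It assumes that the linear deviation of a random variable X is the infimum over constants B of the α-mean of |X − B| (attained at a weighted median), and it uses an Ising-type Hamiltonian with the field absorbed into the a priori measure purely as an illustration; the function names `linear_deviation` and `ising_dev` are hypothetical and not taken from the text.

```python
import math


def linear_deviation(values, weights):
    """Return inf over constants B of sum_k weights[k] * |values[k] - B|.

    The infimum of a mean absolute deviation is attained at a weighted
    median, so we sort the atoms and stop where the cumulative weight
    first reaches one half of the total weight.
    """
    atoms = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    median = atoms[-1][0]
    for x, w in atoms:
        cumulative += w
        if cumulative >= total / 2:
            median = x
            break
    return sum(w * abs(x - median) for x, w in atoms) / total


def ising_dev(J, h):
    """Linear deviation of the j-variation of H_i for Ising spins.

    Assumed model (illustration only): H_i = -J * sigma_i * sum_j omega_j,
    with the field h absorbed into the a priori measure
    alpha(s) proportional to exp(h * s) on {-1, +1}.
    Flipping omega_j changes H_i by X = 2 * J * sigma_i (up to sign).
    """
    z = math.exp(h) + math.exp(-h)
    alpha = {+1: math.exp(h) / z, -1: math.exp(-h) / z}
    spins = [+1, -1]
    variation = [2 * J * s for s in spins]
    return linear_deviation(variation, [alpha[s] for s in spins])
```

For h = 0 the sketch returns 2J, which equals the jth oscillation of H_i in this toy model, while for large h the value 4J/(1 + e^{2h}) becomes small even when J is large. This is the effect of a concentrated a priori measure described in the text.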
Then our first result is as follows.
The use of this bound lies in the fact that, even when the interaction potential is large, $\mathrm{dev}_{\alpha;i,j}(H_i)$ can be small when $\alpha$ is close to a Dirac measure. A simple example for this to happen is an Ising model at large external field, as it was discussed in [1]. Our bound reproduces the bound given there, but extends to general potentials. As a less trivial application of our bound to a spin model where $S$ is not discrete we discuss the Gauss-Weierstrass kernel in the rotator example of Section 2.4, where we use it to prove short-time Gibbsianness (see (23)).
Of course, when the potential is small to begin with, the r.h.s. of (11) will be small, independently of α, so the theorem can be used for both strong couplings and concentrated a priori measures, and weak coupling.
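The large-field effect can be checked directly on the exact Dobrushin constant. The following Python sketch is an illustration under stated assumptions, not the bound of the theorem: it takes an Ising model on a regular graph with single-site kernel proportional to exp(s(h + J Σ ω_j)), where the coupling J, the field h and the number of neighbours `degree` are hypothetical parameters, and it enumerates the neighbouring spins to compute the Dobrushin constant exactly.

```python
import math
from itertools import product


def ising_dobrushin_constant(J, h, degree):
    """Exact Dobrushin constant for an Ising model on a regular graph.

    Assumed model (illustration only): the single-site kernel is
    gamma_i(s | omega) proportional to exp(s * (h + J * sum of neighbours)),
    and every site has `degree` neighbours. The total-variation distance
    between two laws on {-1, +1} is the difference of the probabilities
    of +1, and P(+1) = (1 + tanh(local field)) / 2.
    """
    worst = 0.0
    # Enumerate the spins on the neighbours other than the flipped one.
    for others in product([-1, +1], repeat=degree - 1):
        rest = sum(others)
        up = math.tanh(h + J * (rest + 1))
        down = math.tanh(h + J * (rest - 1))
        worst = max(worst, abs(up - down) / 2)
    # Each of the `degree` neighbours contributes the same entry C_ij.
    return degree * worst
```

With four neighbours and J = 1, the constant exceeds 1 at h = 0 but drops far below 1 at h = 6, so Dobrushin uniqueness holds at strong coupling once the single-site measure is sufficiently concentrated.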

Two-layer models -Goodness of Gibbsianness
Let us now formulate our assumptions on a two-layer system over a graph $G$. To each vertex will be associated two local state spaces. A particular example will be given by the site-wise independent time-evolution of Section 2.4. So, in general let $S$ and additionally $S'$ be measurable (standard Borel) spaces. This implies in particular existence of all regular conditional probabilities. Again, no a priori metric will be used explicitly. We refer to $S$ as the initial (first-layer) spin space and to $S'$ as the image (second-layer) spin space. Let $K(d\sigma_i, d\eta_i)$ (a Borel probability measure on the product space $S \times S'$) be the joint a-priori measure for the two-layer model. We assume that $K$ can be written in the form Our initial model (probability measure on $\Omega$) is by definition a Gibbs distribution for the specification given in terms of the potential $\Phi$ according to (5), where we now put as an a-priori measure the marginal of $K$ on the first layer, that is $\alpha(d\sigma_i) \equiv \int_{S'} K(d\sigma_i, d\eta_i)$. It is important to note that we don't assume uniqueness of the Gibbs measure for this specification. In practice $\alpha$ might be given beforehand and $K$ is then obtained by specifying a transition kernel $K(d\eta_i\,|\,\sigma_i)$ from the first layer to the second layer. In the rest of this work we will (unless otherwise stated) denote by $\sigma_i \in S$ the local variable (spin) for the initial model and by $\eta_i \in S'$ the local variable (spin) for the image model.
Let $\mu(d\sigma)$ be a Gibbs measure for the first layer for the potential $\Phi$ and a priori measure $\alpha$. Our aim is then to study the conditional probabilities of the second-layer measure defined by
$$\mu'(d\eta) = \int \mu(d\sigma) \prod_{i\in G} K(d\eta_i\,|\,\sigma_i).$$
This form appears for instance in the study of a stochastic time evolution, starting from an initial measure $\mu$, where the kernel $K(d\eta_i\,|\,\sigma_i)$ depends on time and is applied independently over the spins (infinite-temperature dynamics). In case studies it has been observed that the map $\mu \mapsto \mu'$ may create an image measure that is not a Gibbs measure anymore. On the other hand, in all examples observed, Gibbsianness was preserved at short times where $K(d\eta_i\,|\,\sigma_i)$ is a small perturbation of $\delta_{\sigma_i}(d\eta_i)$. We aim here to give a criterion that implies this in all generality, not using any specifics of the model but only the relevant underlying structure. In particular we are not restricting ourselves to discrete spin spaces.
Our main result Theorem 2.5 is a criterion for the Gibbs property of the second-layer measure to hold, that is easily formulated and verified in concrete examples. Moreover, we give explicit bounds on the dependence of the conditional probabilities of the second-layer measure on the variation of the conditioning, in the sense of Definition 2.1.
We said that we will not use any a priori metric on the spaces S and S ′ ; indeed the natural metric that shall be used for continuity in this set-up shall be given by the variational distance of the conditional a priori measures in the first layer, conditional on the second layer.
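Definition 2.4. We call $\alpha_{\eta_i}(d\sigma_i) := K(d\sigma_i\,|\,\eta_i)$ the a priori measures in the first layer that are obtained by conditioning on second-layer configurations. We call $d'(\eta_i, \eta'_i) := \|\alpha_{\eta_i} - \alpha_{\eta'_i}\|$ the posterior (pseudo-)metric, associated to $K$ on the second-layer space.
$d'$ satisfies non-negativity and the triangle inequality, but we may have $d'(\eta_i, \eta'_i) = 0$ for $\eta_i \neq \eta'_i$ (which happens e.g. if $\sigma_i$ and $\eta_i$ are independent under $K$).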
In the language of statistics, $\alpha_{\eta_i}$ is the "posterior measure" depending on the observation $\eta_i$ in the second-layer single-spin space. Stated abstractly, the metric $d'$ is the pull-back metric of the map $\eta_i \mapsto \alpha_{\eta_i}(d\sigma_i)$ from single-site configurations in the second layer to single-site measures in the first layer. While this metric seems to be non-explicit, we will show in the rotator example how it can be estimated in terms of a more familiar metric (the Euclidean metric).
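For finite local state spaces the posterior metric can be computed directly. The Python sketch below is a minimal illustration under assumptions of our own choosing: the joint a-priori measure K is stored as a matrix `joint[s][e]`, and the example kernel is a fair bit observed through a channel that flips it with probability `eps`. The names `posterior`, `posterior_metric` and `noisy_channel` are hypothetical.

```python
def posterior(joint, eta):
    """Posterior law of sigma given the observation eta.

    `joint[s][e]` is the joint probability K(sigma = s, eta = e).
    """
    column = [row[eta] for row in joint]
    mass = sum(column)
    return [p / mass for p in column]


def posterior_metric(joint, eta, eta_prime):
    """Total-variation distance (one half of the L1 distance)
    between the posterior laws given eta and given eta_prime."""
    a = posterior(joint, eta)
    b = posterior(joint, eta_prime)
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def noisy_channel(eps):
    """Joint law of a fair bit sigma and its copy eta flipped with prob. eps."""
    return [[0.5 * (1 - eps), 0.5 * eps],
            [0.5 * eps, 0.5 * (1 - eps)]]
```

In this toy example the distance between the two observations equals |1 − 2 eps|. It vanishes at eps = 1/2, where σ and η are independent, which illustrates why d′ is in general only a pseudo-metric.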
It is a well-known heuristic, which has been made precise in special cases, that an investigation of the Gibbs property of the second-layer measure must be based on an analysis of the first layer conditioned on configurations in the second layer [4; 5; 8; 12]. So, our estimates will naturally contain quantities that reflect this aspect. Let us put We note that $\bar B_{ij}$ is a uniform bound on the deviations of the initial Hamiltonian taken with a priori measures conditioned on the image configurations. The entries $\bar C_{ij}$ are upper bounds on the conditional Dobrushin matrix for the corresponding constrained first-layer model. We warn the reader not to confuse $\mathrm{dev}_{\alpha_{\eta_i};i,j}(H_i)$ with $\mathrm{dev}_{\alpha;i,j}(H_i)$. While the second quantity may be big, and correspondingly the unconstrained first-layer system in a non-uniqueness regime, the first one might still be small, and correspondingly the constrained first-layer system in a uniqueness regime. This is e.g. the case for a time-evolution started at low temperature, for small times. Then we have the following theorem.
Theorem 2.5. Suppose that the first-layer system has an infinite-volume Gibbs measure $\mu = \lim_n \mu^{\bar\sigma}_{\Lambda_n}$ obtained for a boundary condition $\bar\sigma$ and along a suitable sequence of volumes $\Lambda_n$.
Suppose further that $\sup_i \sum_j \bar C_{ij} < 1$.
Then the family $\gamma'$ of finite-volume conditional distributions of the transformed system is a Gibbsian specification of goodness
Remark: A formula for $\gamma'$ is given below the statement of Proposition 4.7 in (63).
Note that the first-layer system may very well be in a phase transition regime. For arbitrarily large interaction $\Phi$, good concentration of the conditional measures $\alpha_{\eta_i}$ can still lead to a small $\bar B_{ij}$, which makes the $\bar C_{ij}$ sufficiently small for the theorem to hold. In short: uniform conditional Dobrushin uniqueness of the first layer implies Gibbsianness of the second layer, with explicit estimates.
The matrix $Q$ describing the spatial loss of memory of the variation of the conditioning depends on the decay properties of the $\Phi_A$'s and of the $\bar D_{kj}$'s. E.g. in the case of a finite range potential, $\bar D_{kj}$ dominates the decay of $Q_{ij}$. If we have $\sup_i \sum_j \bar C_{ij}\, e^{\lambda s(i,j)} < \infty$, where $s(i, j)$ is an arbitrary distance on $G$, (15) provides us with an exponential decay estimate of the form for some constant $C$.
Note that the summability property we impose on the initial potential (2) implies the finiteness of (15). In particular we have the following bound on the entries of the Q-matrix where M is the matrix given by

Short-time Gibbsianness for time-evolved rotator models
We begin this subsection more generally with an introduction of the following useful property we will impose on the initial interaction. For this purpose we now assume additionally a metric structure.
Definition 2.6. Let $(S, d)$ be a metric space. Denote by $L_{ij} = L_{ij}(\Phi)$ the smallest constants such that the $j$-variation of the Hamiltonian $H_i$ satisfies We say that $\Phi$ satisfies a Lipschitz property with constants $(L_{ij}(\Phi))_{(i,j)\in G\times G}$, if all of these constants are finite.
Consider now the rotator model on $G$ (a general graph with countable vertex set), with both first-layer and second-layer local spin spaces equal to $S^{q-1}$, the sphere in $q$-dimensional Euclidean space, with $q \ge 2$.
Take as (formal) Hamiltonian of the first-layer system in infinite-volume where we assume that $J_{ii} = 0$ for each $i \in G$. In the above we have used "$\cdot$" to denote the scalar product. Our model satisfies a Lipschitz property with constants $L_{ij}(\Phi) = 2|J_{ij}|$. For $G = \mathbb{Z}^d$ and $d \ge 3$ such a model has been proved to exhibit long-range order under suitable assumptions on the Fourier transform of $J$ (see [25]).
Let $K$ be given by where $\alpha_o$ is the equidistribution on $S^{q-1}$ and $k_t$ is the heat kernel on the sphere, i.e.
where $\Delta$ is the Laplace-Beltrami operator (see e.g. p. 38, eq. 54 of [2]) on the sphere and $\varphi$ is any test function. $k_t$ is also called the Gauss-Weierstrass kernel. For more background on the heat kernel on Riemannian manifolds, see the introduction of [24].
The time-evolved measure is given by It has the product over the equidistributions on the spheres as an infinite-time local limiting measure lim Convergence takes place exponentially fast on local observables, the decay rate given by the first eigenvalue of the Laplacian on the sphere (see also (115)).
Theorem 2.7. Denote by $d(\eta, \eta')$ the induced metric on the sphere $S^{q-1}$ (with $q \ge 2$) obtained by embedding the sphere into the Euclidean space $\mathbb{R}^q$.
Assume that Then $\mu_t$ is a Gibbs measure for some Gibbsian specification $A$ is the matrix whose entries are given by $A_{ij} = e^{|J_{ij}|} |J_{ij}|$ and $1$ is the identity matrix.
For the definition of Q ij (t) in (25) we have used the bound (16).
The proof of the theorem follows from three ingredients: 1) Theorem 2.5 which gives a continuity estimate in terms of the posterior metric d ′ , 2) a comparison result between d ′ and d, and 3) a telescoping argument over sites in the conditioning. Here is the comparison result.
Proposition 2.8. There is an estimate of the posterior metric d ′ associated to the measure K t of the form Remark: The proof of the proposition uses a coupling argument and a reflection principle for diffusions on the sphere under reflection at the equator. A refinement of the estimates on d ′ in terms of an expansion of Legendre polynomials is found in Proposition 5.3 in the Appendix.
Let us come back to the discussion of Theorem 2.7 and explain the form of the bounds giving an idea of the proof. It is straightforward to apply Theorem 2.5 to our model and obtain a result formulated in d ′ . However, a more natural metric we would prefer to use is d, and so we should use Proposition 2.8. What continuity estimates do we expect to gain from this? It is not difficult to see by telescoping and using the standard interpolation trick also employed in the proof of Theorem 2.3 that for the initial kernel We see that continuity can be measured in terms of d, due to the Lipschitz property of the initial Hamiltonian, and the spatial decay is provided by the decay of the couplings.
So, at small time $t$, we are aiming at a similar continuity estimate to hold which is uniform in $t$ as $t$ goes to zero. Now, while estimating $d'$ against $d$ we have accumulated a nasty factor $1/\sqrt{t}$ that blows up when time $t$ goes to zero. We note that this is not just an artifact of Proposition 2.8, but the posterior metric between two points on the sphere indeed blows up like $1/\sqrt{t}$, as can be seen from the proof. At first sight this does not seem to be a problem in the definition of $Q_{ij}(t)$, because the off-diagonal entries of the matrix $\bar D_{ij}(t)$ are suppressed by the same factor proportional to $\sqrt{t}$ that appears in (16). This suppression follows from a bound on the corresponding Dobrushin matrix of this order. Unfortunately the diagonal terms of $\bar D(t)$ give rise to a blow-up for sites $i$ and $j$ that are within the range of the potential. As is clear from the proof, this blow-up is understandable since so far we did not employ any continuity properties of the initial Hamiltonian w.r.t. the Euclidean metric. Let us mention without details that the blow-up of $\bar D(t)$ really occurs in a system of two sites when the Hamiltonian is a step function. This is shown by a computation and indicates that continuity of the Hamiltonian is needed. Now, to disentangle these local effects from the global effects treated so far, we use in the third step a telescoping argument over the conditioning. Exploiting the Lipschitz property (17) we obtain the second term in the minimum in (24), which puts a time-independent ceiling on the blow-up for small times. This solves the blow-up problem.

Gibbsianness for local approximations
As another consequence from the general theorem we prove that any sufficiently fine local coarse graining preserves the Gibbs property. Here the fineness of the coarse graining has to be compared with the scale in the local state spaces on which the initial Hamiltonian is varying.
We thus need a bit more structure, namely let $(S, d)$ now be a metric space. Let a decomposition of $S$ be given of the form $S = \bigcup_{\eta_i \in S'} S_{\eta_i}$. Here $S'$ may be a finite or infinite set. Put $T(s) := \eta_i$ for $S_{\eta_i} \ni s$. This defines a deterministic transformation on $S$, called the fuzzy map. We assume that $\alpha(S_{\eta_i}) > 0$ for all $\eta_i$. For each $\eta_i \in S'$ the corresponding conditional a priori measure on $S_{\eta_i}$ is given by
$$\alpha_{\eta_i}(d\sigma_i) = \frac{1_{S_{\eta_i}}(\sigma_i)\, \alpha(d\sigma_i)}{\alpha(S_{\eta_i})}.$$
Then our result is the following.
Theorem 2.9. Assume the interaction $\Phi$ has the Lipschitz property (17). Suppose further that Then, for any initial Gibbs measure $\mu$ of the potential $\Phi$ with an arbitrary a priori measure $\alpha$, the transformed measure $T(\mu)$ is Gibbs for a specification $\gamma'$ of goodness $(Q, d_0)$. Here $d_0$ is the discrete metric and $Q_{ij}$ is given in formula (15) where we have to substitute
Remark 1: Answering a question of Aernout van Enter, this provides a class of examples where $S$ and $S'$ are different (one may be continuous, the other not), the initial measure may be in the phase transition regime, and the image measure will be Gibbs. To think of an even more concrete example, let us take the rotator model (18). Divide the sphere Then the corresponding discretized model on the country level is still Gibbs whenever there is no country with diameter bigger than $\big(\sup_i \sum_{j\in G} e^{|J_{ij}|} |J_{ij}|\big)^{-1}$.
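The country picture of Remark 1 can be made quantitative in the simplest case q = 2, where the sphere is the unit circle. The Python sketch below is an illustration under assumptions: it reads the diameter threshold as the inverse of sup_i Σ_j e^{|J_ij|}|J_ij|, takes a constant coupling J to each of `degree` neighbours, and divides the circle into equal arcs measured in the Euclidean (chord) distance. All function names and parameters are hypothetical.

```python
import math


def arc_diameter(n_cells):
    """Euclidean (chord) diameter of an arc of angle 2*pi/n_cells on the unit circle."""
    angle = 2 * math.pi / n_cells
    # For arcs covering at least a half circle the diameter is the full diameter 2.
    return 2.0 if angle >= math.pi else 2 * math.sin(angle / 2)


def fuzzy_map(theta, n_cells):
    """Index of the arc (the 'country') containing the angle theta."""
    width = 2 * math.pi / n_cells
    return int((theta % (2 * math.pi)) / width) % n_cells


def fineness_threshold(J, degree):
    """Assumed threshold: inverse of sup_i sum_j exp(|J_ij|) * |J_ij| for a
    constant coupling J between each site and its `degree` neighbours."""
    return 1.0 / (degree * math.exp(abs(J)) * abs(J))


def cells_needed(J, degree):
    """Smallest number of equal arcs whose diameter does not exceed the threshold."""
    n = 2
    while arc_diameter(n) > fineness_threshold(J, degree):
        n += 1
    return n
```

Under these assumptions, a nearest-neighbour model on the square lattice with J = 1 has threshold 1/(4e), and 69 equal arcs are the coarsest equal division of the circle that stays below it.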
Remark 2: Theorem 2.9 can be seen as a stability result for Gibbs measures. This is interesting, since topologically speaking "most measures are not Gibbs measures" (see Section 4.5.6 of [11]): although it is true that translation-invariant Gibbs measures for finite range potentials on a compact metric local spin space on the lattice are dense in the space of translation-invariant probability measures w.r.t. the weak topology, they nevertheless form a thin set. Israel has proved that the set of all translation-invariant Gibbs measures is a set of first category, that is, a countable union of sets which are nowhere dense (i.e. whose closure has empty interior) in all translation-invariant measures. Now, our result implies that, for any $\Phi$ with $\sup_i \sum_j L_{ij}(\Phi) < \infty$ and any fixed $\mu \in \mathcal{G}(\Phi)$, the following is true. There exists a $\rho_0 > 0$ such that the set of all $T\mu$, where $T$ runs over all local coarse-grainings with fineness not bigger than $\rho_0$, is contained in the Gibbs measures. So we have shown that around any such Gibbs measure $\mu$ there is at least a "ball of local transforms" that lies in the Gibbs measures.
Remark 3: In fact, the theorem even holds when ρ is replaced by ρ ′ < ρ given by which takes advantage of possible concentration of the conditional measures.

Remark 4: Let us mention that we may very well apply our method also to other well-known examples of transforms of Gibbs measures that may potentially lead to renormalization group pathologies. This could be investigated in a future paper. For instance, the decimation transformation mapping a Gibbs measure on the lattice to its restriction to a sublattice can also be cast in this framework. Theorem 2.5 then implies the statement that the projected measure is always Gibbs if the interaction is sufficiently small in triple norm. The posterior metric for configurations on the projected lattice then becomes the discrete metric $d'(\eta_i, \eta'_i) = 1_{\eta_i \neq \eta'_i}$, and hence the matrix element $Q_{ij}$ becomes a bound on the Dobrushin interdependence matrix of the image system.

Proof of Theorem 2.3 and related results:
Proof of Theorem 2.3: We begin as in the proof of Proposition 8.8 of [1], estimating the variation of the single-site measure at a given site $i \in G$ when varying the boundary condition at some site $j \in i^c$. Fix $\zeta, \eta \in \Omega$ with $\zeta_{j^c} = \eta_{j^c}$ and put $u_0(\sigma_i) = -H_i(\sigma_i \zeta_{i^c})$ and $u_1(\sigma_i) = -H_i(\sigma_i \eta_{i^c})$. Taking the linear interpolation $u_t = t u_1 + (1-t) u_0$ of $u_1$ and $u_0$, it follows Setting $h_t = e^{u_t}/\alpha(e^{u_t})$ and $\lambda_t(d\sigma_i) = h_t(\sigma_i)\alpha(d\sigma_i)$ we note that $\lambda_0(d\sigma_i) = \gamma_i(d\sigma_i|\zeta)$ and $\lambda_1(d\sigma_i) = \gamma_i(d\sigma_i|\eta)$. Furthermore, set $\bar u = u_1 - u_0$ and observe that Next we bring the deviation into play by writing from which it follows by resubstituting the Hamiltonian that which also implies the desired estimate on the Dobrushin constant c.
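The interpolation step can be written out as follows. This is only a sketch reconstructed from the definitions of $u_t$, $h_t$ and $\lambda_t$ given in the proof, assuming that $\bar u$ denotes $u_1 - u_0$ (the symbol is an assumption), that $\bar u$ is bounded, and that $\lambda_t(f)$ stands for $\int f \, d\lambda_t$ with $f$ a bounded measurable function of $\sigma_i$; it is not quoted from the original displays.

```latex
\begin{align*}
\frac{d}{dt} h_t
  &= h_t \bigl( \bar u - \lambda_t(\bar u) \bigr), \\
\frac{d}{dt} \lambda_t(f)
  &= \lambda_t\bigl( f \, ( \bar u - \lambda_t(\bar u) ) \bigr), \\
\gamma_i(f|\eta) - \gamma_i(f|\zeta)
  &= \lambda_1(f) - \lambda_0(f)
   = \int_0^1 \lambda_t\bigl( f \, ( \bar u - \lambda_t(\bar u) ) \bigr) \, dt, \\
\bigl| \gamma_i(f|\eta) - \gamma_i(f|\zeta) \bigr|
  &\le \|f\|_\infty \int_0^1 \lambda_t\bigl( | \bar u - \lambda_t(\bar u) | \bigr) \, dt .
\end{align*}
```

The last integrand is the place where a deviation of $\bar u$ from its mean enters the estimate.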

□
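For a finite local spin space the entries of the Dobrushin interdependence matrix can also be evaluated by brute force. The following Python sketch does this for a hypothetical three-site Ising chain. The helper names, the chain and the coupling value are illustrative assumptions and do not come from the paper, and the variational distance is assumed to be normalized as $\sup_A |\mu(A) - \nu(A)|$, i.e. half the $\ell^1$ distance.

```python
import math
from itertools import product


def chain_energy(config, coupling):
    """Energy of a hypothetical 3-site Ising chain 0 - 1 - 2 (illustration only)."""
    return -coupling * (config[0] * config[1] + config[1] * config[2])


def single_site_kernel(i, config, spins, energy):
    """Single-site Gibbs distribution at site i given the other spins in config."""
    weights = []
    for s in spins:
        c = dict(config)
        c[i] = s
        weights.append(math.exp(-energy(c)))
    z = sum(weights)
    return [w / z for w in weights]


def dobrushin_entry(i, j, sites, spins, energy):
    """sup over boundary conditions differing only at j of the variational
    distance between the single-site kernels at i."""
    others = [k for k in sites if k not in (i, j)]
    best = 0.0
    for rest in product(spins, repeat=len(others)):
        base = dict(zip(others, rest))
        kernels = []
        for sj in spins:
            config = dict(base)
            config[j] = sj
            config[i] = spins[0]  # placeholder, overwritten inside the kernel
            kernels.append(single_site_kernel(i, config, spins, energy))
        for p, q in product(kernels, repeat=2):
            tv = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
            best = max(best, tv)
    return best


J = 0.4  # illustrative coupling
entry = dobrushin_entry(1, 0, [0, 1, 2], (-1, 1), lambda c: chain_energy(c, J))
```

For this toy chain the brute-force value agrees with the closed form $\tfrac12 \tanh(2J)$, which follows by hand from the two possible values of the remaining neighbour.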
One may criticize the use of oscillations in the exponent to compare the interpolating measures, which is done to arrive at easily computable expressions. Of course, one is always free to compute the exact Dobrushin matrix numerically. Note to this end that, writing F i for the Boltzmann weights at site i, i.e.
we can express the Dobrushin matrix simply as If we want to do slightly better, also the estimate can be used. The r.h.s. might be preferable in some models to the original expression for C ij involving the absolute value of a difference of densities, because it might lead to explicitly solvable integrals. To see the validity of (33), re-enter the proof and use the Schwarz inequality to notice that Sometimes it is useful to use a quadratic variation instead of the linear variation dev α;i,j to obtain an explicit bound, as we shall see in the proof of Theorem 2.7 below. To this end we introduce the following "quadratic variation".
Definition 3.1. For any bounded measurable function F on Ω and for any pair i ≠ j ∈ G the α; i, j-quadratic variation of F is defined by Clearly dev α;i,j (F ) ≤ std α;i,j (F ), and so we could bound the right-hand side of the inequality in Theorem 2.3 in terms of the quadratic variation; going directly into the proof, however, gives a better result, as in the following "quadratic version" of Theorem 2.3.
Proposition 3.2. The Dobrushin constant c is also bounded by Proof: Proceed as in the proof of Theorem 2.3 but use the Schwarz inequality on □
Along these lines it is easy to obtain the following.

The proof of Theorem 2.5 and related results
The purpose of this section is to give the proof of Theorem 2.5 outlined in Section 2.3 of the introduction. The main ingredient of the proof is to show the absence of phase transitions in some intermediate system and to exploit the consequences for the decay of spatial memory. Recall from Section 2.3 that our initial system was given by the Gibbs measure µ admitted by the specification γ obtained from the interaction Φ and an a-priori measure $\alpha(\cdot) = \int_{S'} K(\cdot, d\eta_i)$ described above. Thus for a given boundary condition σ̄ and any finite volume Λ ⊂ G we write γ Λ (·|σ̄) ∈ γ as We now introduce a double-layer system or joint system by coupling the initial system to a second system (with single-spin space S ′ ) through the site-wise joint measures K(dσ i , dη i ) on S × S ′ . Denote by γ̃ the specification of our new double-layer system, i.e. for a fixed boundary condition σ̄ ∈ Ω and a finite volume Λ ⊂ G, γ̃ Λ (·|σ̄) ∈ γ̃ is given by where K(dη i |σ i ) denotes the K-conditional distribution of the second spin given the value of the first. Such a specification is in general not Gibbs, but in our case, where we only have site-wise dependence between the two layers, it is known, for instance from [8] and references therein, that γ̃ is Gibbs.
For each non-empty subset Λ of G we denote by S Λ the collection of all non-empty finite subsets of Λ. We will write S instead of S G . For any fixed configuration σ ∈ Ω and any Λ ∈ S we define the finite-volume transformed distribution γ ′ Λ,σ as It is important to note that in the joint system considered above, conditionally on the σ's, the η's are independent. But taking the σ-average of the joint system creates dependence among the η's. Due to this dependence we now introduce finite-volume η-conditional distributions by freezing the η configuration in the definition of γ ′ Λ;σ except in some region ∆ ∈ S Λ . That is, for any Λ ∈ S with |Λ| ≥ 2 and ∆ ∈ S Λ we have The natural question that comes to mind is whether lim Λ↑G γ ′ ∆,Λ;σ (dη ∆ |η Λ\∆ ) exists for any fixed ∆ ∈ S Λ , σ̄ ∈ Ω and η̄ ∆ c ∈ (S ′ ) ∆ c . If this limit exists we will denote it by γ ′ ∆ (dη ∆ |η ∆ c ) (whose explicit form is given in (63)), and we denote by γ ′ the class of all these conditional distributions for finite ∆. We shall show under some regularity conditions that γ ′ is a Gibbs specification. For the sake of simplicity we will always restrict our analysis to the case where ∆ is a singleton. The analysis for general (but finite) ∆ can be carried out by appropriately decomposing the Hamiltonian H Λ (as given in (42)) and using the same arguments as in the singleton case. Generally, it is known (see for instance [1; 23] and references therein) that one can construct the finite-volume conditional distributions of a specification from the corresponding single-site ones. It is our aim to provide a sufficient condition for the conditional probabilities γ ′ i,Λ;σ (dη i |η Λ\i ) to have an infinite-volume limit. For this we introduce the decomposition of the Hamiltonian H Λ in the finite window Λ into its contributions coming from the sites in Λ \ i and from site i, for any i ∈ Λ, as follows: We clearly see from the definition of an interaction that the Hamiltonian H Λ\i is a function on the configuration space S i c .
For the infinite-volume transformed conditional distributions for some σ̄ ∈ S i c and η Λ ∈ (S ′ ) Λ .
The model is called restricted because we only consider the spins on the sublattice i c , and constrained because we have frozen the configuration in the second layer. The RCFLM (restricted constrained first-layer model) will, as we will see from the lemma below, provide us with a sufficient condition for the existence of an infinite-volume limit γ ′ i (dη i |η i c ) of the conditional probabilities γ ′ i,Λ;σ (dη i |η Λ\i ).
Remark: Note that the RCFLM $\mu^{\bar\sigma}_{\Lambda \setminus i}[\eta_{\Lambda \setminus i}]$ is the model on the sublattice i c corresponding to the a-priori measure α and the finite-volume Hamiltonians $\bar H_{\Lambda \setminus i}$ given by The conditions imposed on k imply that for any given η ∈ Ω ′ the RCFLM results from an absolutely summable interaction, which guarantees the Gibbsianness of the RCFLM.
We state a result concerning a representation of the finite-volume conditional distributions for the transformed system in terms of the RCFLM.
Lemma 4.2. Let Λ ∈ S with |Λ| ≥ 2. Then for any i ∈ Λ and any σ̄ ∈ Ω we have Proof: By using the decomposition of H Λ in (42) we can write γ ′ i,Λ;σ (dη i |η Λ\i ) as The claim of the lemma follows by multiplying the expression for γ ′ i,Λ;σ (dη i |η Λ\i ) above by and simplifying the resulting expression.

□
The above lemma can easily be extended to any finite subset Γ ∈ S Λ to obtain As an immediate consequence of the above lemma we state the following result concerning the existence of the infinite-volume kernel γ ′ i (dη i |η i c ).

Proposition 4.3.
Suppose Λ n is a sequence of finite subsets of G with Λ n → G as n tends to infinity, and σ̄ ∈ S i c and η ∈ (S ′ ) i c are such that the RCFLM $\mu^{\bar\sigma}_{\Lambda_n \setminus i}[\eta_{\Lambda_n \setminus i}]$ has an infinite-volume limit $\mu^{\bar\sigma}_{i^c}[\eta_{i^c}]$ (in the quasilocal topology, i.e. on uniform limits of local functions). Then the single-site finite-volume conditional distribution γ ′ i,Λn;σ (dη i |η Λn\i ) for the transformed system has an infinite-volume limit γ ′ i,σ (dη i |η i c ) given by For each η configuration the family of finite-volume conditional distributions $\mu^{\bar\sigma}_{\Lambda \setminus i}[\eta_{\Lambda \setminus i}]$ constitutes a quasilocal specification, which guarantees the hypothesis of the above proposition. The assertion of the proposition follows immediately from Lemma 4.2 and the choice of topology.
A similar statement is observed in the corresponding general mean-field set-up in [18] where also a sufficient condition for the existence of infinite-volume transformed kernels is given in terms of the uniqueness of global minimizers for a (constrained) rate function. We now state a result concerning an upper bound on Dobrushin's constant for the RCFLM.
Proposition 4.4. Let Dobrushin's interdependence matrix for the RCFLM on $i_o^c$, for some fixed site $i_o \in G$, be the matrix whose entries are given by for any pair $i, j \in i_o^c$, where we have denoted by $\mu^{\zeta}_i$ the single-site part of $\mu^{\zeta}_{\Lambda \setminus i_o}$. Then we have where α η i (dσ i ) = K(dσ i |η i ). Furthermore, defining the Dobrushin constant c ′ [η] for the RCFLM as we also have Additionally, if a notion of translation can be defined on G and the initial interaction is translation-invariant, then the last inequality is an equality.
Proof: The proof follows the same lines as the proof of Theorem 2.3 but here we use α η i = K(·|η i ) instead of α .

□
It is also not hard to deduce from Proposition 3.2 that Again, Lipschitzness of the initial Hamiltonian carries over nicely to yield the following. Corollary 4.5. Suppose that H i satisfies the Lipschitz condition (17). Then we have The claim of the corollary follows from Corollary 3.3.
We now proceed to prove Theorem 2.5, but before we do this we still need some results from which the proof will follow. As a first step we recall a known result from Dobrushin's uniqueness theory: an estimate of the distance between the unique Gibbs measure admitted by a specification satisfying Dobrushin's condition and a Gibbs measure corresponding to some other specification. This estimate controls the local variation between the two infinite-volume probability measures. The result, which we state in the proposition below, can be found for example in [1] as Theorem 8.20. Before we state it let us fix some notation. Suppose C(γ) is the Dobrushin interdependence matrix of a specification γ and C n (γ), n ≥ 0, the n'th power of C(γ); then we define the matrix $D(\gamma) = \sum_{n \ge 0} C^n(\gamma)$. Proposition 4.6. Let γ and γ̃ be any two specifications with γ satisfying Dobrushin's condition. Suppose that for each i ∈ G we have a measurable function b i on the standard Borel space Ω with the property that for all σ ∈ Ω. Then for µ ∈ G(γ) and µ̃ ∈ G(γ̃) we have for all quasilocal functions f .
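The matrix built from the powers of the interdependence matrix is a Neumann series, which converges entrywise whenever the row sums of C are uniformly below one. The Python sketch below sums the series for a hypothetical 2 x 2 interdependence matrix; the matrix entries, the function names and the truncation tolerance are illustrative assumptions, and the definition $D = \sum_{n \ge 0} C^n$ is taken from Theorem 8.20 of [1] as recalled here.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dobrushin_constant(c):
    """Largest row sum of a non-negative interdependence matrix."""
    return max(sum(row) for row in c)


def neumann_series(c, tol=1e-15, max_terms=10000):
    """Partial sums of sum_{n >= 0} C^n, stopped once C^n is entrywise below tol."""
    n = len(c)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = identity
    for _ in range(max_terms):
        power = mat_mul(power, c)
        if max(abs(x) for row in power for x in row) < tol:
            break
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    return total


a = 0.3  # illustrative off-diagonal entry
C = [[0.0, a], [a, 0.0]]
D = neumann_series(C)
```

For this example the series can be summed by hand, giving $D = (1 - a^2)^{-1} \begin{pmatrix} 1 & a \\ a & 1 \end{pmatrix}$, which is the inverse of $I - C$.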
Observe from Proposition 4.3 that if the RCFLM satisfies Dobrushin's condition uniformly in η, the infinite-volume single-site kernels γ ′ i (·|η i c ) exist for every η, independently of the boundary conditions used for the initial system. We will adapt the result in Proposition 4.6 to our present set-up to compare γ ′ i (·|η i c ) and γ ′ i (·|η̄ i c ) for any pair of configurations η, η̄ ∈ Ω ′ = (S ′ ) G . Further we denote by γ[η i c ] the specification of the RCFLM with full η i c configuration. Again we assume for the first-layer model that $\mu = \lim_n \mu^{\bar\sigma}_{\Lambda_n}$ as in the hypothesis of Theorem 2.5.
2. For any pair η i c , η̄ i c ∈ (S ′ ) i c we have for any j ≠ i that where the γ j [η j ](·|σ i c )'s are the single-site parts of the specification for the RCFLM for i ∈ G and η i c .

3. Given it follows that
4. Furthermore, for any k ≠ i it is the case that
5. And finally
Remark: In particular, we can write for any finite volume the corresponding relation for the finite-volume conditional distribution of the transformed system with full η-conditioning as in (58), i.e. if ∆ ∈ S then we have This form follows by starting from a finite volume Λ, say, which contains ∆ and decomposing the Hamiltonian H Λ as in (42), but this time with i replaced by ∆. Generally, one can construct the non-singleton parts of a specification from the single-site ones [1; 23].
Proof of Proposition 4.7: We will prove the assertions of the proposition in the following order: 4, 5, 1, 2 and 3.

4. Recalling that
we estimate for any pair of configurations σ and σ̄ that coincide except at site k that where we have used the fact that $|e^x - e^y| \le |x - y|\, e^{\max\{x,y\}}$.

5.
Take a measurable function ϕ : S ′ → R, with |ϕ| ≤ 1 and consider where we have set

By adding and subtracting
to the right-hand side of (65) and making use of the fact that ||ϕ|| ∞ ≤ 1 (after an application of the triangle inequality and Fubini's theorem), we obtain Note that we made no use of item 2 of the proposition to arrive at this bound.
1. The proof follows from a two-step limiting procedure. We fix an η-conditioning only in a finite volume Γ and construct the infinite-volume measure of the RCFLM by fixing a boundary condition on the first layer outside Λ (which we assume for simplicity to contain Γ) and letting Λ tend to infinity. Then we let Γ also tend to infinity, and recover the conditional probabilities by martingale convergence and uniform approximation of the infinite-volume RCFLM, with conditionings only in the volume Γ.
More precisely, it follows as in Lemma 4.2 that we have for finite-volume conditionings the representation On the r.h.s. we see a RCFLM µσ Λ\i [η Γ\i ] appearing with constrained measure α η i only in the volume Γ \ i, i.e.
As was shown above the RCFLM has an infinite-volume limit even when there is full η conditioning, so in this case where we have conditioning only in a finite region there is no issue with the existence of an infinite-volume limit. Hence, the conditional distribution γ ′ i,Λ,σ (dη i |η Γ\i ) has an infinite-volume limit γ ′ i,σ (dη i |η Γ\i ), for any arbitrary conditioning η Γ\i , since is a bounded quasilocal function in σ for each η i . Note that this conditional distribution still depends on the boundary conditionσ when the initial specification is in the phase transition regime. Let us denote the corresponding specification of the RCFLM with η-conditioning only in Γ \ i by γ[η Γ\i ]. It follows from the arguments leading to the proof of (62) that Observe further that By using the above inequality and the following facts: (1) for each η configuration the family of finite-volume conditional distributions for the RCFLM is a quasilocal specification, (2) the assumption that the RCFLM with full η-conditioning satisfies Dobrushin's condition uniformly in η and (3) the comparison criterion in Proposition 4.6 we obtain Note that in the above we have taken µ i c [η Γ\i ] and µ i c [η i c ] to be the infinite-volume Gibbs measures for the RCFLM with η-conditioning in Γ \ i and i c respectively.
Taking now the limit Γ ↑ G we get (58), by local convergence of the RCFLM in Γ to the full one, and by the backwards martingale convergence theorem (see p.472 [26]).

2.
The proof of assertion 2 uses the definition of the single-site part of the RCFLM and an arbitrary measurable function g : Ω → R with |g| ≤ 1 to define The rest of the proof follows by adding and subtracting the following quantity to the expression under the absolute value sign in (72), rearranging terms and simplifying appropriately.
3. It follows from (56) and (57) of Proposition 4.6 that since by definition of H i , h 2 is a quasilocal function on S i c . The rest of the proof of 3 follows from the bound on δ k (h 2 ) given in statement 4 of the Proposition.

□
Note from the proof of statement 5 of the above Proposition that the denominator in (66) can as well be µ i c [η i c ](h 2 ) if one adds and subtracts from the right hand side of (65) , as was the case in the above proof.
Either choice makes no difference, since in our estimate we do not make use of the actual integral of h 2 but only of its uniform norm. Having disposed of the results above, we now return to the proof of Theorem 2.5. Proof of Theorem 2.5: We divide the proof into two steps; namely, (1) we prove that the class γ ′ is a Gibbsian specification and (2) we justify the form of the goodness of γ ′ as given in the theorem. We start with the proof of the latter.
(v) We now show that γ ′ is quasilocal, i.e. we need to show that for any bounded quasilocal observable f and any configuration η for all finite volumes Γ. So for any bounded quasilocal observable f observe from (7) that But for any such Γ observe from the proof of (62) and (60) of Proposition 4.7 and the proof of assertion 2 of this theorem that
(vi) It follows from (63) that for any finite volume ∆, γ ′ ∆ is given by It follows from the hypothesis on the joint a-priori measure K that for any A ∈ F ′ ∆ and any η ∈ Ω ′ This proves the uniform non-nullness of γ ′ , since the interaction defining the Hamiltonian H ∆ is quasilocal.

□
It follows from the proof of Theorem 2.5 that for any finite volume Γ, whenever the RCFLM satisfies Dobrushin's condition uniformly in η, we have the following continuity estimates Thus, under the conditions on the joint a-priori measure K, γ ′ is a specification, and the RCFLM satisfying Dobrushin's condition uniformly in η is sufficient for γ ′ to be Gibbsian.
Next we present the proof of Theorem 2.9. Proof of Theorem 2.9: This theorem is an application of Theorem 2.5. The only quantities we have to worry about are the entries of the Dobrushin interdependence matrix C̄. It follows from the hypothesis of the theorem, namely the continuity property of the interaction, and the terms in the bound on c ′ [η] in Corollary 4.5 that where ρ η i := diam(S η i ).
To obtain the desired bound we have to evaluate all the quantities appearing in the above estimate on c ′ [η]. We start with the evaluation of the quantity Observe that in this set-up S = S ′ = S q−1 . Take a i = e q = (0, · · · , 0, 1), the qth canonical basis element of the q-dimensional Euclidean space. Then it is elementary to see that where we have set σ i = (σ 1 i , · · · , σ q i ). To compute the above integral we denote by Z q t the qth coordinate of a diffusion on the sphere started at Z q t=0 = 1 (the "north pole") and denote the corresponding expectation by E. Thus for any η i we have The first equality uses the rotation invariance of Brownian motion on the sphere, which allows us to choose η i = a i . To see the last equality, either use an explicit form of the qth component of the transition kernel k t (115) in polar coordinates and the orthogonality of the Legendre polynomials as in [2], or use that the generator of the diffusion Z q t is given by the u-dependent part of the Laplace-Beltrami operator on the sphere (see e.g. p. 38, eq. 54 of [2]), which reads Then it follows from the theory of stochastic differential equations (SDEs) (see e.g. Chapter 5 of [15]) that where B t is a one-dimensional standard Brownian motion. The solutions of this SDE satisfy the equation Solving the above differential equation with the initial condition Z q t=0 = 1 yields This concludes the justification of the second equality in (85).
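The exponential decay of the mean height of the diffusion started at the north pole can be illustrated by simulation. The Python sketch below assumes that the generator is one half of the Laplace-Beltrami operator on $S^{q-1}$, in which case the mean of the qth coordinate is $e^{-(q-1)t/2}$ because the coordinate functions are eigenfunctions of the Laplace-Beltrami operator with eigenvalue $-(q-1)$. If the normalization used in the text is the full Laplace-Beltrami operator instead, time has to be rescaled by a factor 2. The discretization (a tangential Gaussian step followed by projection onto the sphere), the function names and all parameter values are illustrative choices.

```python
import math
import random


def mean_height_on_sphere(q, t, n_steps=50, n_paths=2000, seed=0):
    """Monte Carlo estimate of E[Z_t] for the q-th coordinate of a Brownian
    motion on S^{q-1} started at the north pole e_q.

    Assumption: the generator is one half of the Laplace-Beltrami operator.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    scale = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = [0.0] * (q - 1) + [1.0]  # north pole
        for _ in range(n_steps):
            xi = [rng.gauss(0.0, 1.0) for _ in range(q)]
            radial = sum(a * b for a, b in zip(xi, x))
            # tangential Gaussian increment, then projection back to the sphere
            y = [a + scale * (b - radial * a) for a, b in zip(x, xi)]
            norm = math.sqrt(sum(v * v for v in y))
            x = [v / norm for v in y]
        total += x[-1]
    return total / n_paths


def exact_mean_height(q, t):
    """Closed form under the same normalization assumption."""
    return math.exp(-0.5 * (q - 1) * t)


estimate = mean_height_on_sphere(3, 0.5)
reference = exact_mean_height(3, 0.5)
```

The two numbers agree up to Monte Carlo and time-discretization error, both of which shrink as `n_paths` and `n_steps` grow.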
The last quantity we want to evaluate is $\sum_{A \supset \{i,j\}} \delta(\Phi_A)$. For the rotator model we are considering, this quantity equals 2|J ij |. So putting all the above together we get The above estimate on c ′ [η] is uniform in η. We denote by C̄(t) the matrix with entries With an abuse of notation, we will refer to this matrix as the Dobrushin interdependence matrix for the RCFLM in our current situation.
1. As we saw in the proof of Theorem 2.5, γ ′ t is a specification, and a sufficient condition for it to be Gibbsian is $\sup_{i \in G} \sum_{j \in G} \bar C_{ij}(t) < 1$, i.e. the RCFLM satisfies Dobrushin's condition uniformly in η. This concludes the proof of (1).
2. The proof follows from a two-step estimation procedure. The first step uses the continuity estimate on γ ′ i given by Theorem 2.5 and the second uses a telescoping argument. As our first step we adapt the continuity estimate on γ ′ i in Theorem 2.5 to the current set-up to obtain a continuity estimate for γ ′ i,t . We only have to worry about the goodness matrix Q(t) and the posterior metric d ′ . Here we take the entries of the goodness matrix Q(t) to be the bounds on Q ij given in (16), i.e.
Observe further from Proposition 2.8 that the posterior metric d ′ has an estimate where d is the Euclidean metric. Therefore, putting all the above together we arrive at the following continuity estimate for γ ′ i,t : The next estimate follows from a telescoping argument involving the sites in i c . The main result in this direction that we will employ in our proof is formulated in the lemma below.
Lemma 5.1. For each non-empty finite subset V 1 ⊂ i c we have the following estimate Note from the second term in the above bound that the conditionings coincide in the chosen finite volume V 1 . We proceed by applying Lemma 5.1 to obtain a similar bound for Successive application of Lemma 5.1 along a sequence of pairwise disjoint non-empty finite subsets V n such that ∪ n V n = i c yields the desired result.
where η̄ l−1≤ η l> is the configuration that coincides with η̄ on n −1 Λ {1, · · · , l − 1} and with η on G \ n −1 Λ {1, · · · , l − 1} ∪ {i, j}, and f (η i , η j |η̄ l−1≤ η l> ) is given by (93) if we appropriately replace i in (93) with {i, j}. The validity of (97) follows from the following considerations: Therefore by setting Let R be a rotation such that R η̄ j = η j and set σ ′ j = Rσ j . Then it follows from and similarly where $c_j = \sum_{k \in G} |J_{jk}|$. It follows from (98) and the rotation invariance of K t that The above estimate follows by applying the rotation R to the η̄ j in the r.h.s. of (98). Furthermore, it is not hard to deduce that Therefore it follows from (97) that Hence for any 1 ≤ l ≤ |Λ| we have Comparing (96) and (103), it is clearly seen for any 1 ≤ l ≤ |Λ| with j = n −1 Λ (l) that which proves the lemma.
To obtain the desired bound on the posterior metric as given in Proposition 2.8 we need to solve the diffusion equation on the sphere S q−1 generated by the Laplace-Beltrami operator. This bound is then used in Theorem 2.7 to replace the posterior metric d ′ in the goodness of the Gibbs specification of the corresponding transformed system (as provided by Theorem 2.5) with the Euclidean metric d. We employ stochastic differential equation (SDE) techniques to arrive at the bound of interest. It turns out that we only need the qth component of the diffusion to get the desired bound. Our main tool is a coupling of reflected diffusions on the sphere with the equator as the mirror. To obtain the desired bound we make use of the coupling time, which is the first time the qth component of the diffusion visits zero. This is why we focus attention on the qth component only. We start with the following lemma.
Lemma 5.2.
1. Denote by Z t the qth component of the diffusion on the sphere S q−1 for q ≥ 2, started at a value y := sin ϕ 0 with $\varphi_0 \in (0, \frac{\pi}{2})$. Then there is a coupling of Z t to a Brownian motion on the line, B t , such that the first passage time of Z t at zero, denoted by T 0 (Z q · ), is dominated from above by the first passage time
2. Consequently, independently of the dimension q − 1, there is the estimate where G is a standard normal variable.
where x = d(η j , η̄ j ) is the Euclidean distance between η j and η̄ j . It follows from Lemma 5.2 that, for any q ≥ 2, Using $P(0 \le G \le u) \le \frac{u}{\sqrt{2\pi}}$ by concavity and $\arcsin y \le \frac{\pi}{2}\, y$ for 0 ≤ y ≤ 1 we obtain Note that in both of the last estimates the constants were sharp.
in terms of Legendre polynomials P n (q, s) of degree n in dimension q (see Definition 5.4), and N (q, m) is also the dimension of the space of spherical harmonics of degree n in dimension q (see (115)).
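The two elementary inequalities invoked above, for the Gaussian probability and for the arcsine, can be checked numerically. The Python sketch below evaluates both on a grid; the grid sizes, the slack for rounding errors and the function names are illustrative choices. It also looks at the points where the bounds become tight, which is what sharpness of the constants refers to: small u for the Gaussian bound and y = 1 for the arcsine bound.

```python
import math


def prob_zero_to_u(u):
    """P(0 <= G <= u) for a standard normal G, via the error function."""
    return 0.5 * math.erf(u / math.sqrt(2.0))


def bounds_hold(n_grid=1000, u_max=5.0, slack=1e-12):
    """Check both inequalities on a grid, allowing a tiny rounding slack."""
    us = [u_max * k / n_grid for k in range(n_grid + 1)]
    ys = [k / n_grid for k in range(n_grid + 1)]
    gaussian_ok = all(
        prob_zero_to_u(u) <= u / math.sqrt(2.0 * math.pi) + slack for u in us
    )
    arcsin_ok = all(math.asin(y) <= 0.5 * math.pi * y + slack for y in ys)
    return gaussian_ok, arcsin_ok


# Ratio of the two sides of the Gaussian bound for a small argument:
# it approaches 1 as u -> 0, so the constant 1/sqrt(2*pi) cannot be improved.
small_u = 1e-3
gaussian_ratio = prob_zero_to_u(small_u) / (small_u / math.sqrt(2.0 * math.pi))
```

The Gaussian bound holds because the distribution function is concave on the positive half-line and therefore lies below its tangent at 0; the arcsine bound holds because arcsin is convex on [0, 1] and therefore lies below its chord.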
Before we prove Proposition 5.3 let us fix the following notations starting with the definition of q-dimensional Legendre polynomials (see [2]).
Definition 5.4. The Legendre polynomial P n (q, ·) of degree n in dimension q ≥ 2 is given by the Rodrigues formula where −1 ≤ s ≤ 1.
The last equation indicates that the Legendre polynomials are eigenfunctions for the eigenvalue problem for the qth component of the Laplace-Beltrami operator on the sphere S q−1 . Using the above fact and the separation of variables method we can write the transition (heat) kernel k t of the qth component Z q t of the diffusion on the sphere S q−1 as 2 du (which is the q-coordinate projection of the invariant surface measure on the sphere) over the interval [-1,1] is equal to one. We collect the following two lemmas from which the proof of Proposition 5.3 will follow.
Lemma 5.5. For the diffusion on a sphere there is an estimate of the posterior metric d ′ (η j , η̄ j ) at fixed t in terms of d(η j , η̄ j ), the induced metric on the sphere S q−1 obtained by embedding the sphere into Euclidean space, given by with the function Proof: The idea of the proof is to construct a coupling of two diffusions on the sphere starting at the points η j and η̄ j . By rotation invariance of such diffusions we may assume that η j and η̄ j are mirror images of each other under reflection at the equatorial plane. Then we construct a coupling by reflection [13] of the path started at η j with the equator as the mirror line, up to the time when the diffusion hits the equator. After that the two diffusions move on together. In this way the coupling time for the two diffusions is the same as the first time Z q t = 0 (the first passage time T 0 to level 0, given by T 0 := inf{t ≥ 0 : Z t = 0}) for either Z q 0 = z or Z q 0 = −z, where z = ε q · η j (here ε 1 , · · · , ε q constitute the canonical orthonormal basis of R q and " · " is the usual scalar product). We know from coupling theory [13] that $d'(\eta_j, \bar\eta_j) \le 2 P_{x/2}(T_0 > t) = F_{q,t}(x)$, where x = d(η j , η̄ j ) is the Euclidean distance between η j and η̄ j . Further, it follows from the reflection principle of Désiré André ([15], pp. 79-81 and [16], p. 293) that $P_{x/2}(T_0 \le t) = 2P$ The heuristic argument for the first equality in the above equation is as follows: the probability that the first passage time T 0 (to level 0 for a 1-dimensional diffusion starting at some initial point y > 0) is less than or equal to t is the sum of the probabilities of the events {T 0 ≤ t and Z q t < 0} and {T 0 ≤ t and Z q t > 0}. The probability of the first event is the same as the probability of the event that the 1-dimensional diffusion Z q t starting at y is below the level 0 at time t.
For the probability of the second event observe that after the diffusion reached level 0, it has equal probability to reach level −c below 0 or level c above 0 since the diffusion in our set-up is symmetric about 0. Hence the probability of the second event is the same as the first due to the symmetry of Z q t about 0. From here follows the first equality of the expression for F q,t in the lemma.
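The counting behind the reflection argument can be checked exactly in a discrete analogue. The Python sketch below uses a simple symmetric random walk on the integers instead of the diffusion of the lemma, which is a swapped-in toy model chosen because all probabilities are exact rationals. In the discrete setting the identity reads P(T_0 ≤ n) = 2 P(S_n < 0) + P(S_n = 0), and the atom at zero disappears in the continuous limit. Function names and the start values are illustrative.

```python
from fractions import Fraction


def walk_distribution(start, n_steps):
    """Exact joint law of (position, has visited 0) after n_steps steps of a
    simple symmetric random walk started at `start`."""
    half = Fraction(1, 2)
    dist = {(start, start == 0): Fraction(1)}
    for _ in range(n_steps):
        new_dist = {}
        for (pos, hit), prob in dist.items():
            for step in (-1, 1):
                new_pos = pos + step
                key = (new_pos, hit or new_pos == 0)
                new_dist[key] = new_dist.get(key, Fraction(0)) + prob * half
        dist = new_dist
    return dist


def reflection_identity(start, n_steps):
    """Return both sides of P(T_0 <= n) = 2 P(S_n < 0) + P(S_n = 0)."""
    dist = walk_distribution(start, n_steps)
    p_hit = sum((p for (_, hit), p in dist.items() if hit), Fraction(0))
    p_below = sum((p for (pos, _), p in dist.items() if pos < 0), Fraction(0))
    p_at_zero = sum((p for (pos, _), p in dist.items() if pos == 0), Fraction(0))
    return p_hit, 2 * p_below + p_at_zero


lhs, rhs = reflection_identity(2, 6)
```

Both sides are computed independently: the left side tracks whether the path has visited zero, the right side only looks at the end point.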
It follows from the orthogonality property of the Legendre polynomials that for each positive even integer n the integral

□
We have seen from the above proof that for positive even integers n the integral (over [−1, 0] and w.r.t. the invariant measure $(1 - s^2)^{\frac{q-3}{2}}\, ds$) of the Legendre polynomial of degree n is always equal to zero, as long as the dimension q ≥ 2. The integral for the corresponding odd degree case can also be computed explicitly and we formulate this explicit value of the integral as our next lemma.
Lemma 5.6. For any odd integer 2m + 1 (m = 0, 1, 2, ...) the integral of the Legendre polynomial P 2m+1 (q, ·) over the interval [−1, 0] is given by Note that for each m the above differentiations will always involve terms which are multiples of (1 − s 2 ). This implies that evaluating the above expression at s = −1 will always yield zero.
However, it follows from the binomial expansion of $(1 - s^2)^r$ (where $r = 2m + \frac{q-1}{2}$) that The rest of the proof follows from the observations that $(2m)! = 2^m\, m! \prod_{i=1}^{m} (2i - 1)$ and .

□
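The vanishing of the even-degree integrals and the factorial identity used in the proof can be checked with exact arithmetic in the special case q = 3. The Python sketch below assumes that for q = 3 the polynomials P n (3, ·) coincide with the classical Legendre polynomials and that the weight $(1 - s^2)^{(q-3)/2}$ reduces to 1; it builds the polynomials from Bonnet's recursion rather than from the Rodrigues formula of Definition 5.4. All function names are illustrative, and the check says nothing about other dimensions.

```python
from fractions import Fraction
from math import factorial


def legendre_coefficients(n):
    """Coefficients (lowest degree first) of the classical Legendre polynomial
    P_n, from Bonnet's recursion (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return prev
    for k in range(1, n):
        shifted = [Fraction(0)] + cur  # coefficients of x * P_k
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [((2 * k + 1) * a - k * b) / (k + 1)
               for a, b in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur


def integral_minus_one_to_zero(coeffs):
    """Exact integral over [-1, 0] of the polynomial with these coefficients."""
    # integral of s^d over [-1, 0] equals -(-1)^(d+1) / (d+1)
    return sum(c * Fraction(-((-1) ** (d + 1)), d + 1)
               for d, c in enumerate(coeffs))


def factorial_identity_holds(m):
    """Check (2m)! = 2^m * m! * prod_{i=1}^m (2i - 1)."""
    odd_product = 1
    for i in range(1, m + 1):
        odd_product *= 2 * i - 1
    return factorial(2 * m) == 2 ** m * factorial(m) * odd_product


even_integrals = [integral_minus_one_to_zero(legendre_coefficients(n))
                  for n in (2, 4, 6)]
odd_integrals = [integral_minus_one_to_zero(legendre_coefficients(n))
                 for n in (1, 3)]
```

In this special case the even-degree integrals vanish because P n is an even function orthogonal to the constants, while the odd-degree integrals are non-zero rationals.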
Proof of Proposition 5.3: The proof follows from Lemmas 5.5 and 5.6. □