Convergence to quasi-stationarity through Poincar\'e inequalities and Bakry-Emery criteria

This paper provides tools from the theory of functional inequalities to study quasi-stationarity for absorbed Markov processes. First, it is shown how a Poincar\'e inequality related to a suitable Doob transform entails exponential convergence of conditioned distributions to a quasi-stationary distribution in total variation and in $1$-Wasserstein distance. Special attention is paid to multi-dimensional diffusion processes, for which the aforementioned Poincar\'e inequality is implied by an easier-to-check Bakry-\'{E}mery condition depending on the right eigenvector of the sub-Markovian generator, which is not always known. Under additional assumptions on the potential, it is possible to bypass this lack of knowledge by showing that exponential quasi-ergodicity is entailed by the classical Bakry-\'{E}mery condition.


Notation
For a general metric space $(F, d)$:
• $\mathcal{M}_1(F)$: set of the probability measures defined on $F$.
• $\mathcal{P}_p(F)$: set of the probability measures $\mu$ defined on $F$ such that $\int_F d(x_0, x)^p \mu(dx) < +\infty$, where $x_0 \in F$ is arbitrary.
• $\mathcal{B}(F)$: set of the measurable bounded functions defined on $F$.
• $L^2(\mu)$: set of the functions $f$ such that $\int_F |f|^2 d\mu < +\infty$, endowed with the norm $\|\cdot\|_{L^2(\mu)} : f \mapsto \left(\int_F |f|^2 d\mu\right)^{1/2}$.
• For two probability measures $\mu$ and $\nu$, the notation $\mu \ll \nu$ means that there exists a density function $f$ such that $\mu(dx) = f(x)\nu(dx)$, and this density function will be denoted by $\frac{d\mu}{d\nu}$.
• For any positive measure $\mu$ and any measurable function $f$ such that $\mu(f) < +\infty$, denote by $f * \mu$ the probability measure defined by $f * \mu(dx) := \frac{f(x)\mu(dx)}{\mu(f)}$. (1)

Introduction
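Consider a time-homogeneous Markov process $(X_t)_{t \ge 0}$ defined on a metric state space $(E \cup \{\partial\}, d)$, where the element $\partial \notin E$ is a cemetery point for the process $X$, which means that $X_t = \partial$ for all $t \ge \tau_\partial$, where $\tau_\partial := \inf\{t \ge 0 : X_t = \partial\}$ is the hitting time of $\partial$. We associate to the process $(X_t)_{t \ge 0}$ a family of probability measures $(\mathbb{P}_x)_{x \in E}$ such that, for any $x \in E$, $\mathbb{P}_x(X_0 = x) = 1$. For any $\mu \in \mathcal{M}_1(E \cup \{\partial\})$, denote $\mathbb{P}_\mu := \int_E \mathbb{P}_x \mu(dx)$. Then, under $\mathbb{P}_\mu$, the law of $X_0$ is $\mu$. Finally, the expectations $\mathbb{E}_x$ and $\mathbb{E}_\mu$ are respectively associated to $\mathbb{P}_x$ and $\mathbb{P}_\mu$. Moreover, assume that, for any $x \in E$, $\mathbb{P}_x[\tau_\partial < +\infty] = 1$ and $\mathbb{P}_x[\tau_\partial > t] > 0$ for all $t \ge 0$.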
A natural notion to study when considering absorbed Markov processes is quasi-stationarity, which deals with the weak convergence of the probability measures $\mathbb{P}_\mu(X_t \in \cdot \mid \tau_\partial > t)$ as $t$ goes to infinity. It is well known that, if such convergence holds for a given initial law $\mu$, then the limiting probability measure $\alpha$ satisfies $\mathbb{P}_\alpha(X_t \in \cdot \mid \tau_\partial > t) = \alpha$ for all $t \ge 0$.
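Such a probability measure is called a quasi-stationary distribution and can be understood as an invariant measure for the semi-flow $(\varphi_t)_{t \ge 0}$ defined by $\varphi_t(\mu) := \mathbb{P}_\mu(X_t \in \cdot \mid \tau_\partial > t)$ for $\mu \in \mathcal{M}_1(E)$. For a general overview of this theory, we refer the reader to [12,23,31], where it is shown that, defining the sub-Markovian semi-group $(P_t)_{t \ge 0}$ as $P_t f(x) := \mathbb{E}_x[f(X_t)\mathbf{1}_{\tau_\partial > t}]$ for $f \in \mathcal{B}(E)$ and $x \in E$, a probability measure $\alpha \in \mathcal{M}_1(E)$ is a quasi-stationary distribution if and only if there exists $\lambda_0 > 0$ such that $\alpha P_t := \mathbb{P}_\alpha(X_t \in \cdot, \tau_\partial > t) = e^{-\lambda_0 t}\alpha$ for all $t \ge 0$.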
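The eigenvector characterization above can be illustrated on a toy example. The following is a minimal sketch, assuming a discrete-time chain on three states with a hypothetical sub-stochastic kernel `P` (neither is taken from the paper, which works in continuous time on general state spaces). Iterating the conditioned one-step map plays the role of the semi-flow, and its fixed point is a left eigenvector of `P` whose eigenvalue `theta` plays the role of $e^{-\lambda_0}$.

```python
# Toy discrete-time, finite-state analogue (an assumption of this sketch).

def step_conditioned(mu, P):
    """One step of the conditioned flow: mu -> law of X_1 given survival."""
    n = len(mu)
    nu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    mass = sum(nu)  # one-step survival probability under mu
    return [v / mass for v in nu], mass

def quasi_stationary(P, mu, n_iter=500):
    """Iterate the conditioned flow; returns the limit and the survival rate."""
    mass = 1.0
    for _ in range(n_iter):
        mu, mass = step_conditioned(mu, P)
    return mu, mass

# Hypothetical kernel: each row sums to less than 1, the deficit being
# the probability of absorption in one step.
P = [[0.5, 0.3, 0.0],
     [0.2, 0.5, 0.2],
     [0.0, 0.3, 0.6]]

alpha, theta = quasi_stationary(P, [1.0, 0.0, 0.0])
alpha_P = [sum(alpha[i] * P[i][j] for i in range(3)) for j in range(3)]
print(alpha, theta)
```

At convergence, `alpha_P` agrees with `theta * alpha` up to numerical error, which is the discrete counterpart of the relation $\alpha P_t = e^{-\lambda_0 t}\alpha$.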
In other words, quasi-stationary distributions are left eigenvectors for the operators $P_t$, associated to the eigenvalues $e^{-\lambda_0 t}$. Hence, quasi-stationarity can be dealt with through spectral methods, and existence and uniqueness of quasi-stationary distributions have been shown in this way for several processes, such as discrete-time Markov chains [13,28], birth-death processes [7,18,30] and diffusion processes [6,19,21,22,29]. More recently, other methods were developed in order to study quasi-stationarity. These methods aim to obtain exponential convergence towards quasi-stationary distributions for some processes and are based on well-known probabilistic tools coming from the framework without absorption, such as Doeblin's condition or Lyapunov functions (see [24] for an overview of these tools). In particular, in [8], necessary and sufficient conditions for the uniform-in-law exponential convergence in total variation are provided, where we recall that the total variation distance between two probability measures $\mu$, $\nu$ is defined by $\|\mu - \nu\|_{TV} := \sup_{f \in \mathcal{B}(E), \|f\|_\infty \le 1} |\mu(f) - \nu(f)|$. Since then, other papers have shown exponential convergence in total variation under weaker assumptions, allowing convergence that holds non-uniformly in the initial measure. In particular, we refer the reader to [9,32] for the study of absorbed Markov processes, and [4,11,16] for the study of general renormalized Feynman-Kac semi-groups.
For non-absorbed Markov processes, the rate of convergence towards invariant measures can also be studied through functional inequalities, such as Poincaré inequalities. A probability measure $\pi$ is said to satisfy a Poincaré inequality if there exists a constant $C > 0$ such that, for any $f \in D(E)$, $\mathrm{Var}_\pi(f) \le -C \int_E f L f \, d\pi$, (3) where $\mathrm{Var}_\pi(f) := \int_E (f - \pi(f))^2 d\pi$, $L$ is a generator which cancels $\pi$, and $D(E)$ is the set of the measurable functions $f$ such that $\int_E f L f \, d\pi$ is well-defined. We refer the reader to [3,27] for more on Poincaré inequalities.
The inequality (3) is actually equivalent to the exponential decay of the $\chi^2$-divergence between $\mu e^{tL}$ and $\pi$, the $\chi^2$-divergence being defined as follows: $\chi^2(\mu|\pi) := \int_E \left(\frac{d\mu}{d\pi} - 1\right)^2 d\pi$ if $\mu \ll \pi$, and $\chi^2(\mu|\pi) := +\infty$ otherwise. In particular, this implies an exponential decay of the total variation distance between $\mu e^{tL}$ and $\pi$ when the quantity $\chi^2(\mu|\pi)$ is finite. Some papers have already dealt with the use of Poincaré inequalities for quasi-stationarity, in particular for Markov processes living on discrete state spaces ([14,15]). However, the proofs provided in these papers strongly rely on the discreteness of the state space, and are therefore hardly applicable to processes living on continuous state spaces, such as diffusion processes. Our aim is therefore to show how to use such inequalities to get exponential convergence towards quasi-stationarity for such processes. In particular, the convergence in total variation will be studied, as well as the convergence in 1-Wasserstein distance, which is defined as $W_1(\mu, \nu) := \inf_{(X,Y) \in \Pi(\mu,\nu)} \mathbb{E}[d(X, Y)]$, where $\Pi(\mu, \nu)$ is the set of all the couplings $(X, Y)$ such that the law of $X$ (respectively $Y$) is $\mu$ (respectively $\nu$). We refer to Theorem 3 in Section 2 for the general statement and Corollary 1 for the convergence in 1-Wasserstein distance.
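The three distances used in this paper can be compared on a small example. The sketch below assumes measures supported on a finite grid of the real line; the function names and the numerical values are hypothetical. It uses the convention under which the total variation norm is the supremum over functions bounded by 1 (so it takes values in $[0, 2]$), and the classical fact that, on the real line, $W_1$ equals the integral of the absolute difference of the distribution functions. By the Cauchy-Schwarz inequality, the total variation distance is bounded by the square root of the $\chi^2$-divergence, which is the mechanism transferring a $\chi^2$ decay to a total variation decay.

```python
import math

def total_variation(mu, nu):
    # Convention sup_{|f| <= 1} |mu(f) - nu(f)|: the full L1 distance.
    return sum(abs(m - n) for m, n in zip(mu, nu))

def chi2(mu, pi):
    # chi^2(mu | pi) = sum_i pi_i (mu_i / pi_i - 1)^2, assuming pi_i > 0.
    return sum(p * (m / p - 1.0) ** 2 for m, p in zip(mu, pi))

def wasserstein1_line(mu, nu, points):
    # W_1 on the real line: integral of |F_mu - F_nu| (points sorted).
    total, cdf_gap = 0.0, 0.0
    for k in range(len(points) - 1):
        cdf_gap += mu[k] - nu[k]
        total += abs(cdf_gap) * (points[k + 1] - points[k])
    return total

points = [0.0, 1.0, 2.0, 3.0]      # hypothetical support
mu = [0.4, 0.3, 0.2, 0.1]          # hypothetical initial law
pi = [0.25, 0.25, 0.25, 0.25]      # hypothetical reference law

tv = total_variation(mu, pi)
c2 = chi2(mu, pi)
w1 = wasserstein1_line(mu, pi, points)
print(tv, math.sqrt(c2), w1)
```

On this example the total variation distance is 0.4, the $\chi^2$-divergence is 0.2 and the 1-Wasserstein distance is 0.5, up to floating-point rounding.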
In the third and last section, we will be more particularly interested in quasi-stationarity for diffusion processes $(X_t)_{t \ge 0}$ living on a domain $D \subset \mathbb{R}^d$, absorbed at the boundary $\partial D$, and satisfying on $D$ the stochastic differential equation $dX_t = dB_t - \frac{1}{2}\nabla V(X_t)dt$, where $(B_t)_{t \ge 0}$ is a $d$-dimensional Brownian motion and $V$ is a $C^2$-function on $\mathbb{R}^d$. In the non-absorbed framework, it is well known that the reversible probability measure $\gamma(dx) := Z^{-1}e^{-V(x)}dx$ ($Z$ is the normalization constant) satisfies a Poincaré inequality when the condition $\mathrm{Hess}\,V(x) \ge \kappa I_d$ for all $x \in \mathbb{R}^d$ (5), understood in the sense of quadratic forms, is satisfied for a given $\kappa > 0$. This last result is a consequence of the one shown by Bakry and Émery in [2], and the condition (5) is usually called the Bakry-Émery condition or curvature-dimension condition. In particular, under (5), the diffusion process $(X_t)_{t \ge 0}$ converges towards $\gamma$ in total variation and in 1-Wasserstein distance. Our goal is therefore to recover this convergence property in the quasi-stationary framework through a condition similar to (5). More precisely, the following result is obtained in Section 3: Theorem 1.
• Assume that there exists η positive on D, vanishing on ∂D such that γ(η 2 ) < +∞ and there exists λ 0 > 0 such that • and assume that there exists κ > 0 such that Then there exists C > 0 such that, for any µ ∈ M 1 (D) and t ≥ 0, where we recall that the notation f * µ is defined previously in (1) in Notation. Moreover, if D (1 + |x|) 2 e −V (x) dx < +∞, the inequality (6) holds in 1-Wasserstein distance for t large enough.
A more specific study will focus on multi-dimensional diffusion processes living on D = (0, +∞) d and absorbed when one component is 0. In this particular case, and assuming moreover that V can be expressed as where, for all i, V i are C 2 -functions, one has the following result : then there exists a quasi-stationary distribution α = α 1 ⊗ · · · ⊗ α d ∈ M 1 (D) and C d > 0 (depending on the dimension d) such that, for any µ ∈ M 1 (D) and t large enough, where η := dα dγ . If moreover µ = µ 1 ⊗ · · · ⊗ µ d , there exists a constant C > 0, which does not depend on d, such that, for t large enough, A particular attention will be paid on processes coming down from infinity, for which it will be shown that the rate of convergence κ provided by the Bakry-Émery condition (5) can actually be bettered (see Theorems 5 and 8).
Then, for any µ ∈ M 1 (E), there exists t µ such that, for any t ≥ t µ , where with some positive constants a, b.
Remark 1. For several processes, it is quite usual to have χ 2 (η * µ|η * α) = +∞ when the initial law is a Dirac measure δ x . In the most of the cases, considering a state x ∈ E, there exists a time t 0 > 0 such that Hence, using the property of semi-flow of (φ t ) t≥0 (i.e. φ t+s = φ t • φ s for all s, t ≥ 0), the previous theorem implies that there exists t φt 0 (δx) such that, for any More generally, the set of all the measures such that there exists t 0 ≥ 0 such that χ 2 (η * φ t 0 (µ)|η * µ) < +∞ is included in the domain of attraction of α, that is to say the set of the initial measures such that the weak convergence − α(f )| is actually stronger than the total variation distance. In particular, Theorem 3 implies that there exists C > 0 such that, for any µ ∈ M 1 (E) and t ≥ 0, Hence, this theorem allows to obtain a result analogous to the ones obtained by Champagnat and Villemonais in [8,9]. However, contrary to their results, the upper bound could be small if the initial measure is close enough to the quasi-stationary distribution α (it is even equal to 0 for µ = α). Moreover, Theorem 3 allows to obtain a convergence in 1-Wasserstein distance, as stated by the following corollary : Corollary 1. If the assumptions (P ) holds for for a given x 0 ∈ E, then, for any µ ∈ M 1 (E), there exists t µ such that, for any t ≥ t µ , Proof. By the dual formula for the 1-Wasserstein distance (for example see [33]), for any probability measures µ and ν in P 1 (E), one has where Thus, any function f belonging to C satisfies Hence, by Theorem 3, there exists t µ such that, for any t ≥ t µ ,

We now turn to the proof of Theorem 3.
Proof of Theorem 3. First, remark that if χ 2 (η * µ|η * α) = +∞, the inequality (8) is trivially satisfied. So, from now on, we will only consider initial measure such that The proof is divided into two steps.
First step: where we recall that λ 0 and η are such that, for any x ∈ E and t ≥ 0, Then, since α is a quasi-stationary distribution for (X t ) t≥0 , the probability measure β(dx) := η(x)α(dx) is an invariant measure for (P t ) t≥0 . Moreover, denoting byL the generator of (P t ) t≥0 , then, for any f ∈ D(L) such that f η ∈ D(L) and for any x ∈ E, In particular, the Poincaré inequality (7) can be written as follows : In other words, the inequality (7) is the Poincaré inequality for the Markovian semi-group (P t ) t≥0 . Then it is well-known that it is equivalent to : for any probability measure ν on E and t ≥ 0, Now, let us define, for any f ∈ B(E), t ≥ 0 and x ∈ E, Since α(ψ 2 /η) < ∞ by the third assumption, one has, for any measurable function such that |f | ≤ ψ, In particular, for any measurable function f such that |f | ≤ ψ, t ≥ 0 and ν ∈ M 1 (E) where the following equality is used : ∀ν 1 , ν 2 ∈ M 1 (E), 1 The choice of the value 0.9 is totally arbitrary, any value smaller than 1 is suitable for the proof.
Now let $\mu \in \mathcal{M}_1(E)$ be such that $\chi^2(\eta * \mu|\beta) < +\infty$. Recalling the notation one has the following lemma, whose proof is postponed to the end of this proof.
Then, using Lemma 1 and the inequality (10), one has for any t ≥ 0, In particular, there exists t µ ≥ 0 such that, for any t ≥ t µ , Hence, applying what we obtained at the first step, one has, for any t ≥ t µ , which concludes the proof.
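The Doob transform used in the first step can be checked numerically in a toy setting. The sketch below assumes a discrete-time chain with a hypothetical sub-stochastic kernel `P` on three states; the helper names are illustrative and not taken from the paper. With `theta` standing for $e^{-\lambda_0}$, the transformed kernel $Q(x, y) = P(x, y)\eta(y) / (\theta \eta(x))$ is Markovian, and the measure $\beta$ proportional to $\eta\alpha$ is invariant for it, which mirrors the role played by $\beta(dx) = \eta(x)\alpha(dx)$ in the proof.

```python
def left_eigen(P, n_iter=500):
    """Power iteration for the left Perron eigenvector (normalized to sum 1)."""
    n = len(P)
    alpha, theta = [1.0 / n] * n, 1.0
    for _ in range(n_iter):
        new = [sum(alpha[i] * P[i][j] for i in range(n)) for j in range(n)]
        theta = sum(new)
        alpha = [v / theta for v in new]
    return alpha, theta

def right_eigen(P, n_iter=500):
    """Power iteration for the right Perron eigenvector (normalized to max 1)."""
    n = len(P)
    eta, theta = [1.0] * n, 1.0
    for _ in range(n_iter):
        new = [sum(P[i][j] * eta[j] for j in range(n)) for i in range(n)]
        theta = max(new)
        eta = [v / theta for v in new]
    return eta, theta

# Hypothetical sub-stochastic kernel (rows sum to less than 1).
P = [[0.5, 0.3, 0.0],
     [0.2, 0.5, 0.2],
     [0.0, 0.3, 0.6]]
n = len(P)

alpha, theta = left_eigen(P)   # quasi-stationary distribution and survival rate
eta, _ = right_eigen(P)        # positive right eigenvector

# Doob transform: a genuine Markov kernel.
Q = [[P[i][j] * eta[j] / (theta * eta[i]) for j in range(n)] for i in range(n)]
row_sums = [sum(row) for row in Q]

# beta is eta * alpha, renormalized to a probability measure.
z = sum(eta[i] * alpha[i] for i in range(n))
beta = [eta[i] * alpha[i] / z for i in range(n)]
beta_Q = [sum(beta[i] * Q[i][j] for i in range(n)) for j in range(n)]
print(row_sums, beta, beta_Q)
```

Each entry of `row_sums` equals 1 and `beta_Q` coincides with `beta` up to numerical error, so a Poincaré inequality for `beta` is a statement about a non-absorbed chain, as in the argument above.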
Now, let us prove Lemma 1.
Proof of Lemma 1. For any $t \ge 0$, $\mu \in \mathcal{M}_1(E)$ and for any measurable function $f$, where the equality $P_t[\eta](x) = e^{-\lambda_0 t}\eta(x)$, for any $t \ge 0$ and $x \in E$, was used. As a result, contrary to the techniques using Lyapunov functions or minorization properties, the previous theorem provides in such cases a rate of convergence which does not explode in high dimension (as soon as the state space $E$ is the product space of one-dimensional spaces $E_i$).
Remark 4. In the same manner, subgeometrical convergences to quasi-stationarity can be proved replacing the conditions (P2) and (LS2) by weaker functional inequalities, such as Nash inequalities or weak Poincaré inequalities (see [20,26]). This method does not allow however to cover all the processes having this property of subgeometrical convergence (see for example [25] where the Doob transform is not ergodic).
Remark 5. As stated in Corollary 1, the previous method using the Doob transformP t allows to get convergence in 1-Wasserstein distance through a Poincaré inequality. A natural question is therefore if one can use the logarithmic Sobolev inequality to deal with the convergence in p-Wasserstein distance, which is defined by By the same methodology and using that, for any µ, ν ∈ M 1 (E), where H(µ|ν) := E log dµ dν dν (when µ ≪ ν) is the entropy, one obtains the one-sided estimate This estimate is unfortunately not sharp enough, since log(β(e f /η )) ≥ α(f ), and the convergence in W p for general p still remains an open question.
In practice, Theorem 3 is hard to use because explicit expressions of the quasi-stationary distribution α and the eigenfunction η (whose existence is not obvious in general) are rarely known, so the conditions (P2)-(P3) cannot be checked. In the following section, only diffusion processes will be dealt with, and easy-to-check assumptions will be given.

Bakry-Émery condition and quasi-stationarity : application to diffusion processes
In all that follows, the space $\mathbb{R}^d$ will be endowed with the $L^1$-distance $|x - y|_1 := \sum_{i=1}^d |x_i - y_i|$ (13) for any $x = (x_i)_{i=1,\dots,d}$ and $y = (y_i)_{i=1,\dots,d}$. In particular, this distance will be implicitly used for the definition of $W_1$. From now on, consider a diffusion process $(X_t)_{t \ge 0}$ on a domain $D \subset \mathbb{R}^d$ satisfying $dX_t = dB_t - \frac{1}{2}\nabla V(X_t)dt$ (14) with a $d$-dimensional Brownian motion $(B_t)_{t \ge 0}$ and $V \in C^2(\mathbb{R}^d)$. The process $(X_t)_{t \ge 0}$ is considered as absorbed at the boundary of $D$, denoted by $\partial D$. Denoting by $L$ the generator of $(X_t)_{t \ge 0}$, one has $Lf = \frac{1}{2}\Delta f - \frac{1}{2}\nabla V \cdot \nabla f$. The measure $\gamma(dx) := e^{-V(x)}dx$ is therefore a reversible measure associated to the underlying non-absorbed Markov process. Note that $\gamma$ is not necessarily a probability measure. In our case, we will only deal with potentials $V$ such that $\gamma(D) < +\infty$.

Proof of Theorem 1
In this subsection, we prove Theorem 1, stated in the introduction, which we recall below: Theorem 4. Let $(X_t)_{t \ge 0}$ be a process following (14) and such that $\gamma(D) < +\infty$.
(BE1) Assume that there exists a nonnegative function η defined on D ∪ ∂D, positive on D and vanishing on ∂D, which is an eigenvector for the generator L such that γ(η 2 ) < +∞.
Remark that the measure γ(dx) = e −V (x) dx is a reversible measure for the semi-group (P t ) t≥0 , which means that, for any f, g ∈ B(D), Then, for any t ≥ 0 and f measurable, that is to say that α := η * γ is the quasi-stationary distribution of (P t ) t≥0 .
In order to deal with the total variation distance, it is enough to take ψ = 1. For such a choice of ψ, one has Hence the condition (P3) of the Theorem 3 is satisfied for ψ = 1. So, by Theorem 3, one has The point (ii) of Theorem 4 is a straightforward consequence of Corollary 1.
Remark 6. Exponential decays like (16) and (17) hold also under weaker assumptions than (15), such as the two followings: • There exists c > 0 and R ≥ 0 such that for |x| > R, • There exists a ∈ (0, 1), c > 0 and R ≥ 0 such that for |x| > R, These two conditions actually appear in [1, Corollary 1.6] and imply (P2). In particular, the first condition is satisfied when V − 2 log(η) is convex. It will be shown later that, for diffusion processes on (0, +∞) d , the convexity of V implies the one of V −2 log(η) for a particular eigenfunction η (which is not unique a priori), so (18) is satisfied.

Brownian motion in a hypercube
Denote by C N = [−N, N ] d and let (B t ) t≥0 be a d-dimensional Brownian motion absorbed at the boundary of C N , which will be denoted by ∂. Then this Brownian motion admits a unique quasistationary distribution α Bm , whose the expression is Moreover, one can also compute the renormalized eigenvector η Bm associated to α Bm , which is In particular, for any (x 1 , . . . , x d ) ∈ C N and i, j = 1, . . . , d, Hence, the Bakry-Émery condition (15) holds for κ = ( π 2N ) 2 . So, there exists C > 0 such that, for any initial measure µ ∈ M 1 (C N ), Now, one has So, Note however that this Bakry-Émery coefficient κ is not optimal. In particular, denoting β Bm := η Bm * α Bm , if we use directly the Theorem 3, the Poincaré constant C P is equal to In the particular case of the Brownian motion in the hypercube C N , the sub-Markovian semi-group (P t ) t≥0 is symmetric with respect to the Lebesgue measure, so there exists a family of orthonormal eigenfunctions (η i ) i∈Z + , associated respectively to the eigenvalues e −λ i t . Then, defining and the family (η i ) i≥0 is orthonormal with respect to β Bm . So, it is easy to deduce that Then, by Theorem 3, there exists C > 0 such that Concerning the 1-Wasserstein distance, one can remark that the exponential decay in total variation distance (19) implies the one in W 1 . As a matter of fact, since we are studying a process living on the compact set [−N, N ] d , one has, for any µ, ν ∈ M 1 ([−N, N ] d ), This inequality allows actually to get a better estimate for the decay in 1-Wasserstein distance than the one provided by Corollary 1.
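The curvature bound of this example can be checked numerically in one coordinate. The sketch below rests on assumptions, since the explicit formulas are not legible in this section: it takes the standard Dirichlet ground state $\eta(x) = \cos(\pi x / 2N)$ on $[-N, N]$ for the generator $\frac{1}{2}\frac{d^2}{dx^2}$ of Brownian motion, with eigenvalue $\lambda_0 = \pi^2 / (8N^2)$, and picks $N = 2$ and a few sample points arbitrarily. Finite differences are used to verify the eigenvalue equation and the lower bound $-(\log \eta)'' \ge (\pi / 2N)^2$, which is the value of κ quoted above.

```python
import math

N = 2.0                                  # hypothetical half-width of the cube
a = math.pi / (2.0 * N)
lambda_0 = math.pi ** 2 / (8.0 * N ** 2) # eigenvalue for (1/2) d^2/dx^2

def eta(x):
    # Assumed one-coordinate eigenfunction, vanishing at -N and N.
    return math.cos(a * x)

def minus_log_eta(x):
    return -math.log(eta(x))

def second_derivative(f, x, h=1e-4):
    # Central finite-difference approximation of f''(x).
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

xs = [-1.5, -0.7, 0.0, 0.4, 1.2]         # sample points inside (-N, N)

# Residual of (1/2) eta'' + lambda_0 eta = 0.
eigen_residuals = [0.5 * second_derivative(eta, x) + lambda_0 * eta(x) for x in xs]

# Curvature -(log eta)''; analytically a^2 / cos^2(a x) >= a^2.
curvatures = [second_derivative(minus_log_eta, x) for x in xs]
print(eigen_residuals, curvatures, a ** 2)
```

The curvature equals $(\pi / 2N)^2$ at the center of the cube and increases towards the boundary, so the infimum is attained at $x = 0$.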

Ornstein-Uhlenbeck process
Ornstein-Uhlenbeck process satisfying the following stochastic differential equation x i , and the associated quasi-stationary distribution is expressed as follows : Moreover, for any x = (x 1 , . . . , x d ) and i, j = 1, . . . , d, So the Bakry-Émery condition (15) is satisfied for κ = 2λ. Hence, by Theorem 4, there exists C d > 0 such that, for any µ and t large enough, Note that the rate of convergence does not depend on the dimension d, but the constant C d explodes in high dimension. More precisely, after computations, one can show that, when d → +∞, In general, contrary to the two previous examples, the eigenfunction η cannot be explicitly given, so the assumptions of Theorem 4 cannot be checked in practice. In the following subsection, one will see how to bypass this problem for diffusion processes living on D = (0, +∞) d .

Diffusion processes on (0, ∞) d
In this section, we focus on diffusion processes living on (0, +∞) d and absorbed when one of their components reaches 0.

When d = 1
Take a one-dimensional diffusion process following living on D = (0, +∞) and absorbed at ∂ = 0, where V is a C 2 -function. Then, one gets the following proposition.
Proposition 1. Assume that V is convex on (0, +∞) and Then there exists an eigenfunction η such that log(η) is concave.
Proof. In [9], it is shown that, under the condition lim x→+∞ V ′ (x) = +∞, there exists a unique positive eigenfunction η such that with λ 0 > 0, and such that there exists C, θ > 0 such that, for any x ∈ D and t ≥ 0, where p > 1 and ϕ is a Lyapunov function for the generator L such that there exists D 0 ⊂ (0, +∞), C ′ > 0 and λ > 0 large enough such that For any x ≥ 0, h > 0 and t ≥ 0, where τ x is the hitting time of x by the process (X t ) t≥0 , and where the strong Markov property is used for the second equality. Considering the process (X t∧τx ) t≥0 absorbed at x, it is also a diffusion process coming down from infinity. So there exists also a positive function η x on (x, +∞) and a positive constant λ x such that, for any y > x, η x (y) = lim t→∞ e λxt P y (τ x > t).
The previous proposition actually tells us that, assuming V convex, the second derivative of V − 2 log(η) is greater than the one of V . In particular, Proposition 1 entails the following corollary: (20) and assume that there exists κ > 0 such that ∀x ∈ (0, +∞).
Then there exists a quasi-stationary distribution α, which is absolutely continuous with respect to γ, and a constant C > 0 such that, for any µ ∈ M 1 (D) and for t large enough, and where η := dα dγ .

One-dimensional processes coming down from infinity
Let (X t ) t≥0 be a solution of (20) coming down from infinity, which means that there exists a constant ρ > 0 such that sup x≥0 E x (e ρτ ∂ ) < +∞ (see [5] for alternative definitions). Quasi-stationarity for such processes have been already studied in [10], in particular (X t ) t≥0 absorbed at 0 admits a unique quasi-stationary distribution α absolutely continuous with respect to γ and an eigenfunction η, unique up to a multiplicative constant, satisfying the following relation (see [10,Theorem 4.1.]): where −λ 0 < 0 is the eigenvalue associated to α and η. Moreover, [10,Proposition 4.2.] states that η is proportional to the function In particular, log(η) is concave, whatever the convexity of the potential V . For these processes, one can state the following result : following (20) coming down from infinity such that Then there exists a constant C > 0 such that, for any µ ∈ M 1 (D) and for t large enough, and If moreover V ′ (x) > 0 for any x > 0, then the previous statement holds for Remark 7. In other words, this theorem states that the rate of convergence κ coming from the Bakry-Emery condition V ′′ ≥ κ can actually be improved replacing it byκ. Moreover, this entails that the exponential convergences (25) and (26) holds even if V is concave in a neighborhood of 0, as soon as the function x → V ′′ (x) + 8λ 0 e −V (x) is lower-bounded by a positive constant.
Proof of Theorem 5. The idea is simply to apply Theorem 4 and to compute the best κ satisfying knowing (24). First of all, for any x > 0, By the equality (24), Then, for any x > 0, Hence, for any x > 0, As a result, assumingκ : Now, assuming moreover V ′ (x) > 0 for any x > 0, and using that 1 which entails that which concludes the proof.
the underlying process (X t ) t≥0 satisfying (20) comes down from infinity, so Theorem 5 applies and the inequalities (25) and (26) hold forκ := inf x>0 V ′′ (x) + 8λ 0 e −V (x) + 8λ 2 . For this example, the eigenvalue −λ 0 is not explicitly known, but it is possible to compare it with the eigenvalue −λ OU associated to the one-dimensional absorbed Ornstein-Uhlenbeck process satisfying and such that, for any x > 0, where (P OU t ) t≥0 is the sub-Markovian semi-group associated to (X OU t ) t≥0 . This eigenvalue is explicitly known : Likewise one has, for any x > 0, Hence, since V ′ (x) ≥ x + 1 for any x > 0, one deduces from the theorem of comparison [17, Theorem 1.1, Chapter VI] that, for any x > 0, λ 0 ≥ λ OU = 1.
As a result, one has a lower-bound for λ 0 and one can chooseκ as
By what was shown previously, for any i = 1, . . . , d, log(η i ) is concave. As a result, one can state the following result, which is the multi-dimensional version of Corollary 2, already stated in the Introduction.
Previously, it was seen, with the two examples of Subsection 3.2, that the constant C d could explode when the dimension d goes to infinity. However, it is possible to improve this result when the initial measure µ is the tensorial product of d probability measures on (0, +∞). In this case, since (27) is assumed, the one-dimensional processes (X i ) i=1,...,d are mutually independent. Moreover, since {X t = 0} = i=1,...,d {X i t = 0}, then for any t ≥ 0 and µ 1 , . . . , µ d ∈ M 1 ((0, +∞)), where τ i ∂ := inf{t ≥ 0 : X i t = 0}. Then, one obtains the following theorem, which was also stated previously in the Introduction.
Theorem 7. Suppose that the assumptions of Theorem 6 hold. Then there exists a constant $C > 0$, which does not depend on the dimension, such that, for any $\mu_1, \dots, \mu_d \in \mathcal{M}_1((0, +\infty))$, and for $t$ large enough, Proof. The first result comes from the inequalities which can be shown using the equality $\frac{1}{2}\|\mu - \nu\|_{TV} = \inf_{(X,Y) \in \Pi(\mu,\nu)} \mathbb{P}(X \ne Y)$, for all $\mu, \nu \in \mathcal{M}_1(D)$, and the result is deduced from the one obtained for $d = 1$. In the same way, by the definition of $W_1$ and recalling that $W_1$ is defined through the $L^1$-distance defined in (13), one has which implies the second inequality in the statement of Theorem 7.
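The dimension-free constant rests on a tensorization property of the total variation distance. Since the displayed inequalities are not legible here, the sketch below assumes that they are of the classical subadditive type: the total variation distance between two product measures is at most the sum of the distances between their marginals, which follows from the coupling characterization by coupling each coordinate independently. The marginals are hypothetical two-point laws chosen for illustration.

```python
from itertools import product

def total_variation(mu, nu):
    # Convention sup_{|f| <= 1} |mu(f) - nu(f)|: the full L1 distance.
    return sum(abs(m - n) for m, n in zip(mu, nu))

def tensor(marginals):
    """Law of independent coordinates, flattened in lexicographic order."""
    joint = []
    for idx in product(*[range(len(m)) for m in marginals]):
        p = 1.0
        for coord, i in enumerate(idx):
            p *= marginals[coord][i]
        joint.append(p)
    return joint

# Hypothetical marginals for d = 3 independent coordinates.
mus = [[0.6, 0.4], [0.5, 0.5], [0.7, 0.3]]
nus = [[0.5, 0.5], [0.4, 0.6], [0.6, 0.4]]

lhs = total_variation(tensor(mus), tensor(nus))
rhs = sum(total_variation(m, n) for m, n in zip(mus, nus))
print(lhs, rhs)
```

Here `rhs` is the sum of three one-dimensional distances, so a bound on each coordinate with a common rate yields a bound for the product measure whose prefactor grows at most linearly in $d$ rather than exponentially.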
Obviously, one can also state a result similar to Theorem 5 for multi-dimensional diffusion processes coming down from infinity: