Polynomial rate of convergence to the Yaglom limit for Brownian motion with drift

This paper deals with the rate of convergence in 1-Wasserstein distance of the marginal law of a Brownian motion with drift conditioned not to have reached 0 towards the Yaglom limit of the process. In particular, it is shown that, for a wide class of initial measures including probability measures with compact support, the Wasserstein distance decays asymptotically as $1/t$. Likewise, this speed of convergence is recovered for the convergence of marginal laws conditioned not to be absorbed up to a horizon time towards the Bessel-3 process, when the horizon time tends to infinity.


Notation
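For a general set $F$, we will use the following notation:
• $\mathcal{M}_1(F)$ denotes the set of probability measures on $F$.
• For any family of probability measures $(\mu_t)_{t \geq 0}$ and $\mu \in \mathcal{M}_1(F)$, the notation $\mu_t \xrightarrow[t \to \infty]{\mathcal{L}} \mu$ refers to the weak convergence of $(\mu_t)_{t \geq 0}$ towards $\mu$, that is: for any continuous and bounded measurable function $f$, $\int_F f \, d\mu_t \to \int_F f \, d\mu$ as $t \to \infty$.
• For any $(x, L) \in \mathbb{R}_+ \times (0, +\infty)$, denote by $\mathrm{Lip}_{x,L}(\mathbb{R}_+)$ the set of the $L$-Lipschitz functions $f$ defined on $\mathbb{R}_+$ such that $f(0) = x$.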

Introduction
Denote by $(X_t)_{t \geq 0}$ a Brownian motion with constant drift, that is, the process defined as
$$X_t := X_0 + B_t - rt, \quad \forall t \geq 0, \tag{1.1}$$
where $X_0$ is a random variable on $\mathbb{R}$, $(B_t)_{t \geq 0}$ is a one-dimensional Brownian motion starting from 0 and independent of $X_0$, and $r > 0$. Denote by $(\mathbb{P}_x)_{x \in \mathbb{R}_+}$ a family of probability measures such that, for any $x > 0$, $\mathbb{P}_x(X_0 = x) = 1$. Then, for a given $\mu \in \mathcal{M}_1(\mathbb{R}_+ \setminus \{0\})$, the probability measure $\mathbb{P}_\mu := \int_{(0,+\infty)} \mathbb{P}_x \, \mu(dx)$ is such that $\mathbb{P}_\mu(X_0 \in \cdot) = \mu$. Denote by $\mathbb{E}_x$ and $\mathbb{E}_\mu$ the associated expectations.
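Denote by $\tau_0$ the hitting time of 0 by $(X_t)_{t \geq 0}$, i.e.
$$\tau_0 := \inf\{t \geq 0 : X_t = 0\}.$$
This paper deals with quasi-stationarity for the process $(X_t)_{t \geq 0}$, that is, the study of the asymptotic behavior of the Markov process $(X_t)_{t \geq 0}$ conditioned not to reach 0.
Recall that a probability measure $\alpha$ on $(0, +\infty)$ is a quasi-stationary distribution (QSD) if $\mathbb{P}_\alpha(X_t \in \cdot \mid \tau_0 > t) = \alpha$ for any $t \geq 0$. It is well known (see for example [3] or [9]) that being a QSD is equivalent to the following property: there exists $\mu \in \mathcal{M}_1((0, +\infty))$ such that
$$\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t) \xrightarrow[t \to \infty]{\mathcal{L}} \alpha.$$
This paper will more specifically deal with the weak convergence of $\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t)$ towards the so-called Yaglom limit, denoted by $\alpha_{Yaglom}$, which is defined as the unique QSD such that, for any $x > 0$,
$$\mathbb{P}_x(X_t \in \cdot \mid \tau_0 > t) \xrightarrow[t \to \infty]{\mathcal{L}} \alpha_{Yaglom}. \tag{1.2}$$
It is well known that such a probability measure exists for the Brownian motion with drift (see Section 2). More precisely, the goal of this paper is to study the speed of convergence of the conditional probability measure $\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t)$, for some initial measures $\mu$, towards the Yaglom limit when $t$ goes to infinity.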
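In order to quantify the weak convergence (1.2), it is possible to use several distances on $\mathcal{M}_1((0, +\infty))$. One of them is the total variation distance, defined, up to a normalizing constant which plays no role in what follows, as
$$\|\mu - \nu\|_{TV} := \sup_{A} |\mu(A) - \nu(A)|, \tag{1.3}$$
where the supremum is taken over the Borel subsets $A$ of $(0, +\infty)$. In particular, the convergence towards 0 of the total variation distance between the conditional probability $\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t)$ and $\alpha_{Yaglom}$ implies the weak convergence (1.2). Another distance which can be used to quantify weak convergence is the 1-Wasserstein distance, defined as
$$W_1(\mu, \nu) := \sup_{f} \left| \int f \, d\mu - \int f \, d\nu \right|,$$
where the supremum is taken over the 1-Lipschitz functions $f$ on $\mathbb{R}_+$. For this distance, the decay towards 0 implies the weak convergence and the convergence of the first moment.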
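As an illustration of the 1-Wasserstein distance, here is a minimal Python sketch. It relies on the classical fact that, on the real line, $W_1(\mu, \nu)$ equals the integral of the absolute difference of the two cumulative distribution functions. The helper name `wasserstein1_from_cdfs`, the grid and the two exponential laws are hypothetical choices made for the example and do not come from the paper.

```python
import numpy as np

def wasserstein1_from_cdfs(grid, cdf_mu, cdf_nu):
    """Approximate W_1 on the real line as the integral of |F_mu - F_nu| (trapezoidal rule)."""
    gap = np.abs(cdf_mu - cdf_nu)
    return float(np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(grid)))

# Hypothetical example: exponential laws with rates 1 and 2, truncated to [0, 60].
grid = np.linspace(0.0, 60.0, 60001)
cdf_rate1 = 1.0 - np.exp(-grid)
cdf_rate2 = 1.0 - np.exp(-2.0 * grid)
w1 = wasserstein1_from_cdfs(grid, cdf_rate1, cdf_rate2)
print(w1)  # approximately 0.5, the difference of the two means
```

Since one of the two laws is stochastically smaller than the other, the distance reduces here to the difference of the means, which makes the example easy to check by hand.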
For most absorbed Markov processes, it is usually expected that the distance $\|\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t) - \alpha_{Yaglom}\|_{TV}$ decreases exponentially fast. In particular, it is well known (see for example [1,2,13]) that some conditions, based on Doeblin-type conditions or Lyapunov functions, entail an exponential decay of the total variation distance. However, the Brownian motion with drift does not satisfy such conditions. It even seems that an exponential speed could be too fast for some initial laws and that the expected rate of convergence is rather $1/t$. In particular, it was shown by Polak and Rolski in [12] that the $L^1$ distance between the density function of $\mathbb{P}_x[X_t \in \cdot \mid \tau_0 > t]$ and that of the Yaglom limit is equivalent, when $t$ goes to infinity, to $c/t$, where $c > 0$ is a constant independent of the initial state $x$. In [11], Palmowski and Vlasiou showed that, for a Lévy process satisfying some assumptions (see (SN) and (SP) in [11]) and taking, as initial measure, the invariant measure of the associated reflected process, the absolute difference between the density function of the conditional probability and that of the Yaglom limit also decreases as $1/t$ when $t$ goes to infinity.
The aim of this note is to recover this speed of convergence for a large class of initial measures, thereby improving the result of Polak and Rolski, and using the Wasserstein distance $W_1$ to quantify the convergence. In particular, it will be shown that, if the initial measure $\mu$ has a compact support, there exist $c_\mu, C_\mu \in (0, \infty)$ such that one has asymptotically
$$\frac{c_\mu}{t} \leq W_1\left(\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t), \alpha_{Yaglom}\right) \leq \frac{C_\mu}{t}. \tag{1.4}$$
The asymptotic inequalities (1.4) actually hold for a wider class of initial measures, which will be spelled out later in Theorem 3.1, and the same result can also be stated with the total variation distance in place of the Wasserstein distance.
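The $1/t$ decay can be explored numerically for a Dirac initial measure. The sketch below is only an illustration under stated assumptions, not the method of the paper: it assumes the standard density $r^2 y e^{-ry}$ for the Yaglom limit and the classical formula for the density of $X_t$ on $\{\tau_0 > t\}$ obtained from the reflection principle and Girsanov's theorem, which is proportional in $y$ to $e^{-ry - y^2/(2t)} \sinh(xy/t)$. The function name `scaled_w1_distance`, the truncation level `y_max` and the grid size `n` are hypothetical.

```python
import numpy as np

def scaled_w1_distance(x, t, r, y_max=60.0, n=60001):
    """Return t * W_1 between the law of X_t given tau_0 > t under P_x and the Yaglom limit."""
    y = np.linspace(0.0, y_max, n)
    dy = np.diff(y)
    # Unnormalised density of X_t on {tau_0 > t} (assumed standard formula),
    # with every factor that does not depend on y dropped.
    dens = np.exp(-r * y - y**2 / (2.0 * t)) * np.sinh(x * y / t)
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (dens[1:] + dens[:-1]) * dy)))
    cdf /= cdf[-1]  # conditioning on survival amounts to normalising
    # CDF of the assumed Yaglom density r^2 * y * exp(-r * y).
    yaglom_cdf = 1.0 - (1.0 + r * y) * np.exp(-r * y)
    gap = np.abs(cdf - yaglom_cdf)
    # On the real line, W_1 is the integral of |F - G|.
    return t * float(np.sum(0.5 * (gap[1:] + gap[:-1]) * dy))

for t in (50.0, 100.0, 200.0, 400.0):
    print(t, scaled_w1_distance(x=1.0, t=t, r=1.0))
```

If the distance does decay as $1/t$, the printed values should stabilise as $t$ grows. The grid is truncated at `y_max`, so the sketch is only meaningful when $e^{-r \cdot \texttt{y\_max}}$ is negligible and when $xy/t$ stays small enough for `sinh` not to overflow.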
Section 2 begins by giving some useful and well-known generalities on quasi-stationarity for the Brownian motion with drift absorbed at 0. Then, the results of this note are presented more precisely in Section 3, one of which is an important lemma on the asymptotic behavior of the Bessel-3 process. Finally, this note ends with the polynomial convergence of the conditional law $\mathbb{P}_\mu[X_s \in \cdot \mid \tau_0 > t]$ towards the marginal law at time $s$ of a Bessel-3 process, when $t$ goes to infinity.

Preliminaries on the quasi-stationarity for a Brownian motion with drift
Quasi-stationarity for the Brownian motion with drift absorbed at 0 has been studied by Martinez and San Martin in [8]. In their paper, the authors showed that there exist infinitely many quasi-stationary distributions, one of which is the Yaglom limit $\alpha_{Yaglom}$.
Moreover, the density function of the Yaglom limit is explicitly given:
$$\alpha_{Yaglom}(dx) = r^2 x e^{-rx} \, dx.$$
In another paper ([7]), Martinez, Picco and San Martin are interested in the domain of attraction of the Yaglom limit, that is, the set of the initial laws for which the convergence (1.2) holds. In particular, they showed that, when the initial law $\mu$ admits a density function $\rho$ with respect to the Lebesgue measure, the conditional probability measure $\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t)$ converges to $\alpha_{Yaglom}$ under an assumption on $\rho$ referred to as (2.1).
Since $\alpha_{Yaglom}$ is a quasi-stationary distribution, it is well known (see [3,9]) that there exists $\lambda_0 > 0$ such that, for any $t \geq 0$,
$$\mathbb{P}_{\alpha_{Yaglom}}(\tau_0 > t) = e^{-\lambda_0 t}.$$
ECP 25 (2020), paper 35.
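For a Brownian motion with drift $r$, one has
$$\lambda_0 = \frac{r^2}{2}.$$
Moreover, $\lambda_0$ is an eigenvalue for the infinitesimal generator of $(X_t)_{t \geq 0}$, which is
$$\mathcal{L} f = \frac{1}{2} f'' - r f',$$
and one can associate to $\lambda_0$ an eigenfunction $\eta$, in the sense that $\mathcal{L} \eta = -\lambda_0 \eta$, which is unique up to a multiplicative constant and proportional to the function $x \mapsto x e^{rx}$. For example, one can choose
$$\eta(x) = x e^{rx}.$$
From these definitions, the so-called Q-process can be defined as the Markov process whose semi-group $(Q_t)_{t \geq 0}$ is defined by
$$Q_t f(x) := \frac{e^{\lambda_0 t}}{\eta(x)} \mathbb{E}_x\left[\eta(X_t) f(X_t) 1_{\tau_0 > t}\right].$$
For any positive measure $\mu$ supported on $(0, +\infty)$, one uses the notation

This Q-process is actually obtained from a Doob transform of the sub-Markovian semi-group $(P_t)_{t \geq 0}$ defined by $P_t f(x) := \mathbb{E}_x[f(X_t) 1_{\tau_0 > t}]$. It corresponds to the process conditioned "never" to be absorbed, in the following sense: the family of probability measures $(\mathbb{Q}_x)_{x > 0}$ defined as
$$\mathbb{Q}_x(A) := \lim_{T \to \infty} \mathbb{P}_x(A \mid \tau_0 > T), \quad \forall A \in \sigma(X_s, s \leq t), \ \forall t \geq 0,$$
is well-defined and, for any $t \geq 0$ and any $f$ measurable,
$$\mathbb{E}^{\mathbb{Q}_x}[f(X_t)] = Q_t f(x).$$
The Q-process, denoted by $(Y_t)_{t \geq 0}$ in what follows, is a Bessel-3 process, which is a diffusion process following
$$dY_t = dB_t + \frac{1}{Y_t} \, dt.$$
Note that the Q-process does not depend on the drift $r > 0$ and one gets an explicit formula for the density function of $\mathbb{P}_x(Y_t \in \cdot)$ (for any $x > 0$ and $t > 0$), which is
$$\mathbb{P}_x(Y_t \in dy) = \sqrt{\frac{2}{\pi t}} \, \frac{y}{x} \sinh\left(\frac{xy}{t}\right) e^{-\frac{x^2 + y^2}{2t}} \, dy. \tag{2.3}$$
Moreover, the measure $\gamma(dx) = x^2 \, dx$ is an invariant measure for the Bessel-3 process.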
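The objects of this section can be sanity-checked numerically. The following Python sketch is an illustration only and rests on assumptions that are standard but stated here explicitly: the generator $\frac{1}{2} f'' - r f'$, the value $\lambda_0 = r^2/2$, the eigenfunction $\eta(x) = x e^{rx}$, and the Bessel-3 transition density written as a difference of two Gaussian kernels multiplied by $y/x$, which is equivalent to the $\sinh$ form. The drift value, the step `h` and the grids are hypothetical choices.

```python
import numpy as np

r = 1.3                # hypothetical drift parameter
lam0 = r**2 / 2.0      # assumed value of lambda_0

def eta(x):
    """Assumed eigenfunction x * exp(r x)."""
    return x * np.exp(r * x)

def generator(f, x, h=1e-3):
    """Central finite-difference approximation of (1/2) f'' - r f'."""
    second = (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2
    first = (f(x + h) - f(x - h)) / (2.0 * h)
    return 0.5 * second - r * first

def bessel3_density(t, x, y):
    """Assumed transition density of the Bessel-3 process from x to y at time t."""
    gauss = np.exp(-(y - x)**2 / (2.0 * t)) - np.exp(-(y + x)**2 / (2.0 * t))
    return (y / x) * gauss / np.sqrt(2.0 * np.pi * t)

# Relative residual of the eigenvalue equation L eta = -lambda_0 eta.
xs = np.linspace(0.5, 3.0, 6)
residual = float(np.max(np.abs(generator(eta, xs) + lam0 * eta(xs)) / eta(xs)))

# Total mass of the Bessel-3 transition density (trapezoidal rule).
y = np.linspace(0.0, 40.0, 40001)
p = bessel3_density(0.7, 1.5, y)
mass = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(y)))
print(residual, mass)
```

A small residual and a mass close to 1 are consistent with the eigenvalue relation and with the fact that the Q-process is conservative, in contrast with the sub-Markovian semi-group it is built from.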

Polynomial convergence in Wasserstein distance
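The main result of this paper is now clearly stated:

Theorem 3.1. For any $\mu \in \mathcal{M}_1((0, +\infty))$ such that
$$\int_0^\infty x^3 e^{rx} \mu(dx) < +\infty, \tag{3.1}$$
one has
$$0 < \liminf_{t \to \infty} t \, W_1\left(\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t), \alpha_{Yaglom}\right) \leq \limsup_{t \to \infty} t \, W_1\left(\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t), \alpha_{Yaglom}\right) < +\infty. \tag{3.2}$$

Before proving the theorem, let us make a few remarks about its statement:

Remark 3.2.
In this paper, we will only focus on the Wasserstein distance, but it is also possible to obtain the same result (3.2) for other distances. In particular, one gets the same statement for Theorem 3.1 with the total variation distance, as defined in (1.3), or with the Kolmogorov distance between $\mu$ and $\nu$, defined as $\sup_{x \geq 0} |\mu([0, x]) - \nu([0, x])|$. In the same way, the convergence in Kolmogorov distance implies the weak convergence of measures, but in a weaker way than the total variation distance or the Wasserstein distance.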

Remark 3.3.
Concerning the domain of attraction of $\alpha_{Yaglom}$, the integrability assumption (3.1) on the initial measure is slightly stronger than the assumption (2.1) written previously. As a matter of fact, (3.1) holds when the density function $\rho$ of the initial measure satisfies

but does not hold when one has equality instead. Note in particular that, taking $\mu = \alpha_{Yaglom}$, (3.1) is not satisfied, and neither is (3.2). More generally, the speed of convergence for initial measures which do not satisfy (3.1) remains an open question.

Remark 3.4.
As written in the introduction, this speed of convergence was already found by Polak and Rolski in [12]. More precisely, the authors showed that there exists $c > 0$ such that, for any $x > 0$, the $L^1$ distance between the density function of $\mathbb{P}_x(X_t \in \cdot \mid \tau_0 > t)$ and that of $\alpha_{Yaglom}$ is equivalent to $c/t$ when $t$ goes to infinity. In addition to the question concerning the choice of the distance, the main question is whether this result holds for initial laws other than Dirac measures. The proof of Polak and Rolski relies on the asymptotic expansion of the density function of the sub-Markovian semi-group $P_t f(x) = \mathbb{E}_x[f(X_t) 1_{\tau_0 > t}]$, which is obtained from the series expansion (3.7), written further. However, when integrating this expansion over a probability measure satisfying (3.1) (for example a probability measure admitting a density function decaying as $x \mapsto e^{-r'x}$, with $r' > r$), the use of Fubini's theorem does not seem to be well justified, so that the asymptotic expansion of the density function for a general initial distribution satisfying (3.1) is not obvious.

Asymptotic behavior for the Bessel-3 process
We will now proceed to the proof of Theorem 3.1.
To do so, the main strategy is to use the Q-process as a Doob transform of the sub-Markovian semi-group $P_t f(x) = \mathbb{E}_x[f(X_t) 1_{\tau_0 > t}]$. In particular, it is well known that the asymptotic behavior of this Doob transform is closely linked to that of the conditional probability measure $\mathbb{P}_\mu[X_t \in \cdot \mid \tau_0 > t]$, for some $\mu \in \mathcal{M}_1((0, +\infty))$, as was shown for example in [5,4,10] in the context of absorbed Markov processes, or in [6] for exploding Feynman-Kac semi-groups.
Hence, before proving Theorem 3.1, the following lemma will first be stated and proved:

Lemma 3.5. For any measurable function $f$ such that

for any $t \geq 0$ and for any probability measure $\mu$ supported on $(0, +\infty)$ satisfying
By the explicit formula of the density function (2.3), dy.
As a result, for any $t \geq 0$, $g'(z) \, dz$, $\forall y > 0$, $\forall t \geq 0$. However, denoting $h : z \mapsto \frac{z^2}{2}$ and using that $g'(z) = 1 - e^{-z} \leq z = h'(z)$ for any $z \geq 0$, As a result, for any $y \in \mathbb{R}_+$ and $t \geq 0$, Thus, for any $t \geq 0$, As a result, integrating over a probability measure $\mu(dx)$ supported on $(0, +\infty)$, which is the first part of the lemma.

Now, assume moreover that $f$ is positive. Then, since
$$\frac{xy}{t} - \sinh\left(\frac{xy}{t}\right) e^{-\frac{x^2 + y^2}{2t}} \geq 0$$
for any $x, y > 0$ (this is proved by (3.4)), and for any probability measure $\mu$ supported on $(0, +\infty)$,

The function $y \mapsto \sinh\left(\frac{xy}{t}\right) e^{-\frac{x^2 + y^2}{2t}}$ can be expressed as a series expansion and one has
$$\frac{xy}{t} - \sinh\left(\frac{xy}{t}\right) e^{-\frac{x^2 + y^2}{2t}} = \frac{xy(x^2 + y^2)}{2t^2} + \sum_{n \geq 3} \frac{a_n(x, y)}{t^n},$$
where, for any $n \geq 3$, $a_n(x, y) :$ Hence, for any $x, y > 0$, Thus, using (3.5), by Lebesgue's theorem, one shows that if $\mu$ is a probability measure supported on $(0, \infty)$ satisfying $\int_0^\infty x^2 \mu(dx) < +\infty$, then one has
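The two analytic facts used in this proof, namely the nonnegativity of $\frac{xy}{t} - \sinh\left(\frac{xy}{t}\right) e^{-\frac{x^2+y^2}{2t}}$ and its leading term $\frac{xy(x^2+y^2)}{2t^2}$, can be checked numerically. The sketch below is a hedged illustration only: the function name `defect`, the grid ranges and the sample point $(x, y) = (1, 2)$ are hypothetical, and a numerical check on a finite grid is of course not a proof.

```python
import numpy as np

def defect(x, y, t):
    """Return xy/t - sinh(xy/t) * exp(-(x^2 + y^2) / (2t))."""
    return x * y / t - np.sinh(x * y / t) * np.exp(-(x**2 + y**2) / (2.0 * t))

# Sign check on a hypothetical grid of (x, y, t) values.
xs, ys, ts = np.meshgrid(np.linspace(0.1, 5.0, 30),
                         np.linspace(0.1, 5.0, 30),
                         np.linspace(0.5, 50.0, 30),
                         indexing="ij")
min_defect = float(defect(xs, ys, ts).min())

# Leading-order check: t^2 * defect should approach x*y*(x^2 + y^2)/2 for large t.
x, y = 1.0, 2.0
leading = x * y * (x**2 + y**2) / 2.0
ratios = [float(t**2 * defect(x, y, t) / leading) for t in (1e1, 1e2, 1e3)]
print(min_defect, ratios)
```

The minimum over the grid is expected to be nonnegative, and the ratios are expected to approach 1 as $t$ grows, the remaining gap being of order $1/t$ in agreement with the terms $a_n(x, y)/t^n$, $n \geq 3$, of the expansion.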

Proof of Theorem 3.1
Theorem 3.1 will now be proved. Let $\mu$ be a probability measure supported on $(0, +\infty)$ Note that the above-mentioned condition is exactly the condition (3.1). Also note that this condition implies that $\int_0^\infty \eta(x) \mu(dx) < +\infty$.
The first step is to prove that there exist $t_\mu$ and $C_\mu < +\infty$ such that, for any $t \geq t_\mu$, which will imply that $\limsup_{t \to \infty} t \times W_1(\mathbb{P}_\mu(X_t \in \cdot \mid \tau_0 > t), \alpha_{Yaglom}) < +\infty$.
As a result, using (3.10), for any $t \geq C_\mu + 1$, which concludes the first step.

Polynomial convergence to the Bessel-3 process
Now, let us state the following theorem:

Theorem 4.1. There exists $s_0 > 0$ such that, for any $\mu \in \mathcal{M}_1((0, +\infty))$ satisfying $\int_0^\infty x^4 e^{rx} \mu(dx) < +\infty$ and for any $s \geq s_0$,

Remark 4.2. Note that, in this theorem, the integrability assumption on the initial measure is slightly stronger than (3.1). This is due to the use of the 1-Wasserstein distance. For the total variation distance, the condition (3.1) is sufficient to obtain the same statement as Theorem 4.1.