Cut-off phenomenon for Ornstein-Uhlenbeck processes driven by Lévy processes

In this paper, we study the cut-off phenomenon, under the total variation distance, of d-dimensional Ornstein-Uhlenbeck processes driven by Lévy processes. That is to say, under the total variation distance, there is an abrupt convergence of the aforementioned processes to their equilibrium, i.e. the limiting distribution. Although the limiting distribution is not explicit, its distributional properties allow us to deduce that a profile function always exists in the reversible cases and may exist in the non-reversible cases under suitable conditions on the limiting distribution. The cut-off phenomena for the average and superposition processes are also determined.


Introduction
The term cut-off phenomenon was introduced by Aldous and Diaconis [1] in the early eighties to describe the phenomenon of drastic convergence to equilibrium of Markov chain models related to card shuffling. Although the cut-off phenomenon has mostly been discussed in the literature for Markov chains with finite state space, it makes perfect sense in the general context of Markov processes possessing a limiting distribution. To be more precise, let us consider a parametrised family of stochastic processes (X^{(ǫ)}, ǫ > 0) such that for each ǫ > 0, the process X^{(ǫ)} possesses a limiting distribution, here denoted by µ^{(ǫ)}. Roughly speaking, the cut-off phenomenon refers to the following asymptotic behaviour: as ǫ decreases, a suitable distance between the law of X^{(ǫ)}_t and the corresponding limiting distribution µ^{(ǫ)} converges to a step function centered at deterministic times t^{cut}_ǫ. In other words, the function ǫ → t^{cut}_ǫ is such that the distance is asymptotically maximal for times smaller than t^{cut}_ǫ − o(t^{cut}_ǫ) and asymptotically zero for times larger than t^{cut}_ǫ + o(t^{cut}_ǫ). Limiting distributions of stochastic processes are an important feature in probability theory and mathematical physics, and typically they are not easy to describe explicitly. The cut-off phenomenon can be used to simulate such limiting distributions. More precisely, the cut-off time t^{cut}_ǫ determines the number of steps needed to converge to the limiting distribution within an acceptable error (this will be discussed in more detail below). The cut-off phenomenon can also be used to understand more complicated phenomena which are important in mathematical physics, such as metastability (see for instance Barrera et al. [5,6]).
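In symbols, the behaviour just described can be summarised as follows (a sketch; here M denotes the maximal value of the chosen distance, with M = 1 for the normalised total variation distance):

```latex
\lim_{\epsilon \to 0} d^{(\epsilon)}\big((1-\delta)\, t^{\mathrm{cut}}_{\epsilon}\big) = M
\qquad \text{and} \qquad
\lim_{\epsilon \to 0} d^{(\epsilon)}\big((1+\delta)\, t^{\mathrm{cut}}_{\epsilon}\big) = 0,
\qquad \text{for every fixed } \delta \in (0,1).
```

This is the weakest, window-free formulation; the window and profile refinements discussed below sharpen the scale at which the transition occurs.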
For an introduction to the subject in the Markov chain setting, we refer to Diaconis [15], Martínez and Ycart [28] and the monograph of Levin et al. [26]. We also refer to Saloff-Coste [30] for a review of random walks where such a phenomenon appears. Chen and Saloff-Coste [14] considered the cut-off phenomenon for some ergodic Markov processes such as Brownian motions on compact Riemannian manifolds and random walks on k-regular expander graphs.
Lachaud [24] and Barrera [4] considered the case of Ornstein-Uhlenbeck processes driven by a Brownian motion and more recently Barrera and Jara (see [7,8]) studied the case of random dynamical systems with coercive vector fields.
The authors in [7,8] considered Langevin dynamics described by a stochastic differential equation (SDE for short) of the form (1.1), where ǫ > 0 can be considered as the amplitude of the noise, F is a strongly coercive vector field with a unique asymptotically stable attractor at 0 satisfying an exponential growth condition, and (B_t, t ≥ 0) is a standard Brownian motion in R^d. Under the above assumptions, such SDEs converge to their equilibrium and also exhibit, under the total variation distance, the cut-off phenomenon. Moreover, in the unidimensional case and assuming that the dynamics are given by a multi-well potential, the cut-off phenomenon allows one to describe the densities of the quasi-stationary measures associated with the metastability phenomena (see Section 5 in [7] for further details).
In this manuscript, we are interested in the cut-off phenomenon of Ornstein-Uhlenbeck processes driven by Lévy processes (or OUL processes). The latter are defined as the unique strong solution of the SDE (1.2), where Q is a d × d real matrix whose eigenvalues have positive real parts, ξ = (ξ_t, t ≥ 0) denotes a d-dimensional Lévy process and ǫ > 0. This class of processes is also known as processes of Ornstein-Uhlenbeck type, according to the terminology of Sato and Yamazato [33], or as the Lévy driven case of a generalised Ornstein-Uhlenbeck process, according to Kevei [21] and the references therein. The study of this particular case is relevant for the understanding of the cut-off and metastability phenomena of a wide class of SDEs driven by Lévy noises in R^d; moreover, it also shows the complexity that the addition of a Poisson jump structure brings to the problem with respect to the Brownian case. For instance, replacing the Brownian component by a Lévy process in (1.1), and since the vector field is strongly coercive, a natural approach to deduce the cut-off phenomenon for the family of SDEs is via a linearisation technique (similar to [7,8]), and hence the Ornstein-Uhlenbeck case is needed. Actually, this is the strategy used by the authors, together with M. Högele, in a forthcoming manuscript [10] where the non-linear case is studied. Moreover, the jump structure may imply that positive integer moments, and even exponential moments, need not exist, a property which is strongly used in [7,8]. This implies that the strategy of the proof may change substantially with respect to the Brownian case, and new techniques and couplings are needed. On the other hand, in the absence of a Brownian component, the transition functions of the SDEs may be quite irregular (this will be discussed below); therefore some conditions on the jump structure of the Lévy process are needed.
We also note that in the Brownian case the transition functions and the limiting distribution are Gaussian, and hence good bounds on the total variation distance between them can be obtained. This property is strongly used in the papers [7,8], and is lost when a Poisson jump structure is added to the noise.
Under the assumption that the jump structure of the Lévy process ξ has a finite log-moment, as well as some regularity conditions that we will specify below, we obtain the cut-off phenomenon for the family of OUL processes (X^{(ǫ)}, ǫ > 0), where X^{(ǫ)} satisfies (1.2). Both conditions allow us to cover a wide class of Lévy processes which includes, in particular, stable processes and Brownian motion. Moreover, we also provide conditions on the limiting distribution for having the so-called profile cut-off, i.e. when the abrupt convergence is described by a deterministic function. For instance, this condition is fulfilled for symmetric distributions in the unidimensional case.
It is important to note that in [8] a necessary and sufficient condition is obtained for profile cut-off, something that seems not so easy to deduce when a Poisson jump structure is added to the noise. Indeed, the authors in [8] provided a characterisation of the profile cut-off using the invariance of the standard Gaussian distribution under orthogonal matrices (see Lemma A.2 in [8]) and the upper and lower semi-continuity property for the total variation distance between Gaussian distributions (see Lemma A.6 in [8]). To tackle this issue, we impose an "invariance" type condition and provide an alternative proof that does not rely on explicit computations of the total variation distance.
Finally, we are also interested in the cut-off phenomenon for the superposition and the average processes of OUL. For the superposition process of OUL, profile cut-off is also obtained and the profile function is given in terms of a self-decomposable distribution. Motivated by the work of Lachaud [24], we study the cut-off phenomenon for the average process of OUL under the assumption that the driving Lévy process is stable, and prove that there is profile cut-off with an explicit profile function, cut-off time and window cut-off.

Preliminaries and main results
2.1. Cut-off phenomenon. Before we introduce the concept of cut-off formally, let us recall the notion of the total variation distance, which will be our reference distance between probability distributions. Given two probability measures P and Q defined on the same measurable space (Ω, F), the total variation distance between P and Q is given by d_TV(P, Q) := sup_{A ∈ F} |P(A) − Q(A)|. For simplicity, in the case of two random variables X and Y defined on the same probability space (Ω, F, P), we use the notation d_TV(X, Y) := d_TV(L(X), L(Y)) for their total variation distance, where L(X) and L(Y) denote the law under P of the random variables X and Y, respectively. For a complete account of the total variation distance (normalised or not), we refer to Chapter 2 of the monograph of Kulik [23].
According to Barrera and Ycart [11], the cut-off phenomenon can be expressed at three increasingly sharp levels.
Observe that the cut-off times and the windows cut-off are deterministic and both may depend on the starting state of the process. Implicitly, the distance d^{(ǫ)}(t) may also depend on the starting state. Moreover, the cut-off times and windows cut-off may not be unique but, up to an equivalence relation, they are (see [28] for further details). On the other hand, there are not too many examples where the profile can be determined explicitly, especially under the total variation distance. Actually, explicit profiles are usually out of reach and normally only window cut-off can be hoped for.
The cut-off phenomenon, under the total variation distance, is naturally associated with a switching phenomenon, i.e. all/nothing or 1/0 behaviour, but the term has the drawback of being used with other meanings in statistical mechanics and theoretical physics. Alternative names are threshold phenomenon and abrupt convergence (see Barrera et al. [6]). The cut-off phenomenon can also be interpreted in terms of mixing times (see for instance Lubetzky and Sly [27] and/or Chapter 18 of [26]) or hitting times (see for instance [28]). Both interpretations are equivalent for the total variation distance and the separation distance, see Barrera and Ycart [11] and the references therein.
Let us exemplify the relevance of the cut-off phenomenon with the following simple and well-known example. Imagine that we would like to sample from a probability distribution by a Markov chain Monte Carlo method using an ergodic Markov chain that has the desired distribution as its limiting distribution. Usually, it is not so difficult to construct such a Markov chain with the given properties. The more difficult problem is to determine how many steps are needed to converge to the limiting distribution within an acceptable error. The cut-off phenomenon, in this setting, implies that there exists an asymptotically optimal sufficient running time, here denoted by T_ǫ, which is asymptotically equivalent to the cut-off time t_ǫ. Moreover, if there is a cut-off time t_ǫ with window size w_ǫ, then one gets the more precise result that the optimal running time T_ǫ should satisfy |T_ǫ − t_ǫ| = O(w_ǫ) as ǫ goes to 0. The crucial point here is that, if there is a cut-off, these relations hold for any desired fixed admissible error size whereas, if there is no cut-off, the optimal sufficient running time T_ǫ depends greatly on the desired admissible error size. For further details we refer to [14].
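The abrupt transition can be checked numerically in the explicitly solvable Brownian case. The sketch below is an illustration, not part of the paper's arguments: it takes the one-dimensional reversible dynamics dX_t = −γX_t dt + √ǫ dB_t with γ = 1, for which X^{(ǫ)}_t is Gaussian with mean e^{−γt}x_0 and variance ǫ(1 − e^{−2γt})/(2γ), and evaluates the total variation distance to the Gaussian equilibrium on a fine grid; the cut-off time here is t_ǫ = ln(1/ǫ)/(2γ).

```python
import numpy as np

def tv_gauss(m1, s1, m2, s2, n=200001):
    """Numerical total variation distance between N(m1, s1^2) and N(m2, s2^2)."""
    lo = min(m1, m2) - 8 * max(s1, s2)
    hi = max(m1, m2) + 8 * max(s1, s2)
    x, dx = np.linspace(lo, hi, n, retstep=True)
    p = np.exp(-(x - m1) ** 2 / (2 * s1 ** 2)) / (s1 * np.sqrt(2 * np.pi))
    q = np.exp(-(x - m2) ** 2 / (2 * s2 ** 2)) / (s2 * np.sqrt(2 * np.pi))
    return 0.5 * np.sum(np.abs(p - q)) * dx

def d_eps(t, eps, gamma=1.0, x0=1.0):
    """TV distance between the law of X_t and the Gaussian equilibrium."""
    mean_t = np.exp(-gamma * t) * x0
    sd_t = np.sqrt(eps * (1 - np.exp(-2 * gamma * t)) / (2 * gamma))
    sd_inf = np.sqrt(eps / (2 * gamma))
    return tv_gauss(mean_t, sd_t, 0.0, sd_inf)

eps = 1e-8
t_cut = np.log(1 / eps) / 2          # cut-off time for gamma = 1
before = d_eps(0.5 * t_cut, eps)     # well before the cut-off time
after = d_eps(2.0 * t_cut, eps)      # well after the cut-off time
```

For ǫ = 10⁻⁸ the distance is essentially 1 at half the cut-off time and essentially 0 at twice the cut-off time: precisely the step behaviour described above.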

The Ornstein-Uhlenbeck process and its invariant distribution.
Similarly to the diffusive case, OUL processes have been widely studied since they appear in many areas of applied probability. This family of processes appears as a natural continuous-time generalisation of random recurrence equations, as shown by de Haan and Karandikar [19], and has applications in mathematical finance (see for instance Klüppelberg et al. [22] and Yor [36]), risk theory (see for instance Gjessing and Paulsen [18]), mathematical physics (see for instance Garbaczewski and Olkiewicz [17]) and random dynamical systems (see for instance Friedman [16]). From the distributional point of view, they have attracted a lot of attention since the limiting distribution, whenever it exists, satisfies an operator self-decomposability property which, in the unidimensional case, turns out to be the so-called self-decomposability property; see for instance Sato and Yamazato [33] for further details. Actually, any operator self-decomposable distribution can be obtained as the equilibrium distribution of an OUL process; see for instance Sato and Yamazato [33] and Sato [31,32] for the self-decomposable case.
Let d ≥ 1 and let ξ = (ξ_t, t ≥ 0) be an R^d-valued Lévy process, that is to say, a càdlàg process with independent and stationary increments, whose law, starting from x ∈ R^d, is denoted by P_x, with the understanding that P_0 = P. We also let |·| and ⟨·,·⟩ be the Euclidean norm and the standard inner product in R^d, respectively.
It is well known that the law of any Lévy process is characterised by its one-time transition probabilities. In particular, for all z ∈ R^d, we have E[e^{i⟨z, ξ_t⟩}] = e^{tψ(z)}, t ≥ 0, where the characteristic exponent ψ satisfies the so-called Lévy-Khintchine formula
ψ(z) = i⟨a, z⟩ − (1/2)⟨z, Σz⟩ + ∫_{R^d\{0}} (e^{i⟨z,x⟩} − 1 − i⟨z, x⟩ 1_{{|x| ≤ 1}}) ν(dx),
where a ∈ R^d, Σ is a d × d symmetric non-negative definite matrix and ν is a measure on R^d\{0} satisfying the integrability condition ∫_{R^d\{0}} (1 ∧ |x|²) ν(dx) < ∞. We take ǫ > 0 and recall that the associated OUL process X^{(ǫ)} is the unique strong solution of the linear SDE (2.1) in R^d, where Q is a d × d real matrix whose eigenvalues have positive real parts. For simplicity, we denote this class of matrices by M_+(d). If the matrix Q is symmetric, we say that the process (2.1) is reversible, since the vector field F(x) = Qx can be written as the gradient of the quadratic form V(x) = x^T Q x/2, which can be thought of as a potential energy. Here, x^T denotes the transpose of the vector x. When the matrix Q is non-symmetric, we say that the process (2.1) is non-reversible. It is important to note that when we perturb a symmetric matrix, it typically becomes non-symmetric; in other words, roughly speaking, most of the matrices in M_+(d) are non-symmetric.
The SDE (2.1) can be rewritten in integral form. We denote by P_{x_0} its law starting from x_0. It is known (see for instance Theorem 3.1 in Sato and Yamazato [33]) that the latter is a homogeneous Markov process with transition function P^{(ǫ)}, where Q^T denotes the matrix transpose of Q.
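For orientation, the variation-of-constants formula makes the Markov structure explicit. The display below is a sketch: we write the noise amplitude as √ǫ, matching the scaling parameter mentioned in the proof strategy at the end of Section 2, so the exact constant in front of the integral should be read from (2.1):

```latex
X^{(\epsilon)}_t \;=\; e^{-tQ} x_0 \;+\; \sqrt{\epsilon} \int_0^t e^{-(t-s)Q}\, \mathrm{d}\xi_s,
\qquad t \ge 0.
```

Hence the law of X^{(ǫ)}_t is the convolution of a point mass at e^{−tQ}x_0 with the law of the stochastic integral, which explains why the transition function is determined by Q, ǫ and the characteristics (a, Σ, ν) of ξ.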
By straightforward computations, we deduce the form of the transition function P^{(ǫ)}, where e^{sQ}B = {y ∈ R^d : y = e^{sQ}x, x ∈ B} (see for instance Theorem 3.1 in [33]). Moreover, according to Masuda [29], the process has a transition density which is infinitely differentiable and bounded (i.e. belongs to C^∞_b) if rank(Σ) = d or if the Lévy measure ν satisfies Orey-Masuda's condition, that is to say, if there exist constants α ∈ (0, 2) and c > 0 such that the corresponding lower bound on ν holds. A weaker assumption on the Lévy measure ν for the transition density to be infinitely differentiable and bounded appears in Bodnarchuk and Kulyk [12,13]. Indeed, according to Theorem 1 in [13], if (2.3) holds, where S^{d−1} denotes the unit sphere in R^d, then the transition density belongs to C^∞_b. Actually, in the one-dimensional case, condition (2.4) below, as r goes to 0, turns out to be necessary and sufficient for the transition density to be in C^∞_b (see Theorem 1 in [12]). We also point out that the previous condition differs from the so-called Kallenberg condition, i.e.
as r goes to 0, which is a necessary condition for the density of the Lévy process ξ to be in C^∞_b (see Section 5 in Kallenberg [20]). Indeed, the Lévy measure ν = Σ_{n≥1} n δ_{1/n!} satisfies (2.4) but not Kallenberg's condition. Actually, for this example we have the following unexpected behaviour: the distribution of the Lévy process ξ is singular, but the transition density of its associated OUL process is in C^∞_b (see Example 1 in [13]). In other words, the drift given by the dynamics (2.1) may provide enough regularity to the transitions even if the distribution of the noise is singular.
It seems that we cannot expect a condition weaker than (2.3) for the transition density to be in C^∞_b since, according to Bodnarchuk and Kulyk [13], a condition of this type is necessary for the existence of a bounded continuous density. Before we introduce the invariant distribution of the process X^{(ǫ)}, we recall the notion of operator self-decomposability of a distribution on R^d. Let Q ∈ M_+(d); then an infinitely divisible distribution µ on R^d is called Q-self-decomposable if, for each t ≥ 0, there exists a probability distribution η_{t,Q} such that µ̂(z) = µ̂(e^{−tQ^T} z) η̂_{t,Q}(z) for all z ∈ R^d, where µ̂ and η̂_{t,Q} denote the characteristic functions, or Fourier transforms, of µ and η_{t,Q}, respectively. An infinitely divisible distribution µ on R^d which is Q-self-decomposable for some Q ∈ M_+(d) is called operator self-decomposable. If d = 1, then the operator self-decomposability property reduces to self-decomposability. It is important to note that the support of any Q-self-decomposable distribution is unbounded, except for delta distributions (see for instance Corollary 24.4 in Sato [31]).
In the sequel, we assume that the Lévy process ξ satisfies the following log-moment condition: E[log(1 ∨ |ξ_1|)] < ∞, where a ∨ b denotes the maximum of the numbers a and b, which is equivalent to ∫_{D^c_1} log|z| ν(dz) < ∞, where D_r := {z ∈ R^d : |z| ≤ r}, for r ≥ 0, and D^c_r := R^d \ D_r (see Theorem 25.3 in Sato [31]). The log-moment condition (2.7) is necessary and sufficient for the existence of a stationary distribution for the process X^{(ǫ)}, here denoted by µ^{(ǫ)}; see for instance Theorems 4.1 and 4.2 in [33] (or Theorem 17.5 in [31] for the case Q = qI_d with q > 0, where I_d denotes the identity matrix). Moreover, the distribution µ^{(ǫ)} is Q-self-decomposable. In fact, Theorem 4.1 in [33] determines the class of all Q-self-decomposable distributions as the class of all possible invariant distributions of OUL processes. According to Yamazato [35], if µ^{(ǫ)} is non-degenerate then µ^{(ǫ)} is absolutely continuous with respect to the Lebesgue measure on R^d. However, to the best of our knowledge, not much information about the regularity of the density can be found in the literature. In the one-dimensional case, the distribution µ^{(ǫ)} is self-decomposable and, if it is non-degenerate, then it is absolutely continuous with respect to the Lebesgue measure and its density is increasing on (−∞, ℘) and decreasing on (℘, ∞), where ℘ ∈ R is known as the mode (see for instance Theorem 53.1 in [31]).
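As an illustration of how the log-moment condition is checked in practice, the sketch below (an illustration with a hypothetical normalising constant equal to 1) verifies numerically that a one-sided α-stable-type Lévy measure ν(dx) = x^{−1−α} dx on (0, ∞) has finite log-moment: the tail integral ∫_1^∞ log x · x^{−1−α} dx equals 1/α² (substitute u = log x), hence it is finite for every α ∈ (0, 2).

```python
import numpy as np
from scipy.integrate import quad

alpha = 1.5  # stability index in (0, 2)

# tail log-moment of the Levy measure nu(dx) = x^(-1-alpha) dx on (0, infty)
log_moment, abserr = quad(lambda x: np.log(x) * x ** (-1 - alpha), 1.0, np.inf)

analytic = 1 / alpha ** 2  # closed form via the substitution u = log(x)
```

The same computation with the full stable Lévy measure (two-sided, with constants) only changes the prefactor, so the finiteness conclusion is unchanged.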
Computing explicitly the density of the invariant distribution µ^{(ǫ)} is rather complicated, even in the one-dimensional case, but in some specific examples something can be said about it. For instance, if the Lévy process ξ admits a continuous density and µ^{(ǫ)} has a smooth density, then the latter can be determined explicitly; see Remark 2.3 in [29]. In the case when ξ is a subordinator with finite jump measure (i.e. ν(0, ∞) < ∞), the asymptotic behaviour of its density near 0 can be established (see for instance Theorem 53.6 in [31]).

Main results.
Recall that d^{(ǫ)}(t) denotes the total variation distance between the distribution of X^{(ǫ)}_t and its invariant distribution µ^{(ǫ)}, that is to say, d^{(ǫ)}(t) := d_TV(X^{(ǫ)}_t, X^{(ǫ)}_∞), where X^{(ǫ)}_∞ denotes the limit distribution of X^{(ǫ)} whose law is given by µ^{(ǫ)}. We also introduce the Lévy process ξ^♮ given by ξ^♮_t := ξ_t − at, for t ≥ 0, and its associated exponential functional I^♮ = (I^♮_t, t ≥ 0), defined by I^♮_t := ∫_0^t e^{−sQ} dξ^♮_s. We observe that its limiting distribution, here denoted by I^♮_∞, is well-defined under the log-moment condition (2.7). We also denote by µ^♮_t the distribution of I^♮_t, for t ≥ 0, and by µ^♮_∞ the distribution of I^♮_∞. For the sequel, we assume that hypothesis (H) below holds for any t > 0, where µ̂^♮_t denotes the characteristic function of µ^♮_t, D^c_R := R^d \ D_R and t_0(R) is positive and goes to ∞ as R increases. The integrability condition on µ̂^♮_t implies that I^♮_t, for t > 0, has a continuous density f_t(x) that goes to 0 as |x| goes to ∞ (see for instance Proposition 28.1 in [31]). If the log-moment condition (2.7) also holds, then I^♮_∞ also has a continuous density f_∞(x) that goes to 0 as |x| goes to ∞. It is not so difficult to deduce that the same property holds for the densities of X^{(1)}_t and its invariant distribution µ^{(1)}. Indeed, the latter holds since I^♮_t and X^{(1)}_t differ only in the drift term, so the moduli of their characteristic functions coincide; here |z| denotes the modulus of z ∈ C. For simplicity, we use the same notation for the Euclidean norm and for the modulus of complex numbers. The second condition in (H) guarantees the convergence, under the total variation distance, of I^♮_t towards I^♮_∞ as t increases.
Recall that the matrix Q belongs to M_+(d), i.e. its eigenvalues have positive real parts. The following lemma, whose general proof can be found in Barrera and Jara [8] (see Lemma B.1) and which we state here for the sake of completeness, provides the asymptotic behaviour of e^{−tQ} for large t. Such asymptotic behaviour is necessary for determining the cut-off times.
The numbers {λ ± θ k , k = 1, . . . , m} are eigenvalues of the matrix Q and the vectors {v k , k = 1, . . . , m} are elements of the Jordan decomposition of Q. For a better understanding of the asymptotic behaviour of e −tQ and its role, we provide the proof of Lemma 2.2 in the case when all the eigenvalues of Q are positive real numbers, see Proposition A.4 in the Appendix. Now, we present the main result of this paper using the same notation as in the previous lemma.
Recall that the cut-off times t_ǫ and the windows cut-off w_ǫ depend on the initial condition. We omit such dependence for simplicity of exposition. It is also important to note that when the initial condition is x_0 = 0, the distance d^{(ǫ)}(t) does not exhibit an abrupt transition, that is to say, the cut-off phenomenon does not occur. Moreover, Lemma 2.2 does not hold if x_0 = 0. For that reason, we assume that x_0 ≠ 0. We also observe that the profile function G_{x_0}, when it exists, depends on the initial condition x_0 and is given in terms of the total variation distance between two Q-self-decomposable distributions that do not depend on the drift term of the underlying Lévy process, since that part is deterministic. The previous observation is very interesting since most of the examples in the literature that exhibit profile cut-off (mainly for Markov chains, such as the random walk on the hypercube) are given in terms of the Gauss error function. To the best of our knowledge, the only examples that exhibit profile cut-off and do not fulfil the previous property are the top-to-random shuffle and the transposition shuffle, for which the important statistic is the number of fixed points, which behaves like a Poisson random variable; see Lacoin [25] and the references therein.
We also point out that part (i) of our main result includes the case when Q has real eigenvalues, which is easier to understand. Indeed, denote by γ_1 ≤ γ_2 ≤ · · · ≤ γ_d the eigenvalues of Q. If Q is a symmetric matrix (i.e. the process (2.1) is reversible) with distinct eigenvalues, then we can characterise γ, ℓ and v explicitly in a very simple way using the celebrated Spectral Theorem, with the understanding that if τ(x_0) = d, then the second term on the right-hand side of the corresponding identity equals 0. If Q is still symmetric but some of the eigenvalues may repeat, the values of γ, ℓ and v can also be determined using the matrix diagonalisation method. For more details see Proposition A.4, part (i) in the Appendix.
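The spectral description above can be checked numerically. The sketch below uses a hypothetical symmetric matrix Q with distinct positive eigenvalues: for large t, the drift term e^{−tQ}x_0 is dominated by the slowest mode, i.e. e^{−tQ}x_0 ≈ e^{−γt}⟨x_0, v⟩v, where γ is the smallest eigenvalue and v a corresponding unit eigenvector with ⟨x_0, v⟩ ≠ 0.

```python
import numpy as np
from scipy.linalg import expm

# hypothetical symmetric Q in M_+(2) with distinct positive eigenvalues
Q = np.array([[2.0, 1.0],
              [1.0, 3.0]])
evals, evecs = np.linalg.eigh(Q)      # eigenvalues in ascending order
gamma, v = evals[0], evecs[:, 0]      # slowest decay rate and its eigenvector
x0 = np.array([1.0, 1.0])             # initial condition with <x0, v> != 0

t = 10.0
decayed = expm(-t * Q) @ x0                        # full matrix exponential
leading = np.exp(-gamma * t) * np.dot(x0, v) * v   # slowest-mode approximation
```

The remaining modes decay like e^{−γ_2 t} with γ_2 > γ, so the relative error of the approximation vanishes exponentially in t.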
In the particular case when d = 1 and conditions (2.7) and (H) are satisfied, we always have profile cut-off for the OUL process. Moreover, we have that µ^{(1)} and I^♮_∞ are self-decomposable (see Theorem 17.5 in [31]).
For a general Q ∈ M_+(d), and since part (ii) of Theorem 2.3 only implies window cut-off for the family of OUL processes (X^{(ǫ)}, ǫ > 0), a natural question arises: are there cases where profile cut-off exists for general Q? There is an affirmative answer to this question, which depends on the invariance property (2.9). Assume that the log-moment condition (2.7), hypothesis (H) and the invariance property (2.9) hold, and take 0 < ǫ ≪ 1 such that t_ǫ and w_ǫ are defined as before; then there is profile cut-off for the family of OUL processes (X^{(ǫ)}, ǫ > 0).
We point out that the invariance property (2.9) is satisfied when the limiting distribution µ^♮_∞ is isotropic, i.e. invariant under orthogonal transformations. Examples of isotropic distributions which are also self-decomposable are the standard Gaussian and isotropic stable distributions. The strategy to deduce our main result is as follows. We first introduce an auxiliary metric which approximates the metric d^{(ǫ)} quite well. The idea of using this auxiliary metric comes from the Brownian setting, where the distributions of X^{(ǫ)}_t and X^{(ǫ)}_∞ are known and everything is much easier to control thanks to the good estimates that one can deduce under the total variation distance. More precisely, both distributions are Gaussian and the convergence of the auxiliary metric depends on the speed of convergence, under the total variation distance, of the mean of X^{(ǫ)}_t towards the mean of X^{(ǫ)}_∞. In the Lévy case such estimates are not available (even in the stable case) and hence a new approach is needed. Indeed, we use Scheffé's Lemma together with Lemma 2.2 to study the behaviour of the auxiliary metric. This is where the cut-off times appear, by analysing the drift term e^{−Qt}x_0 and using the scaling parameter √ǫ that appears in the noise. Then, to control the error term, we use the Fourier inversion of the characteristic functions of X^{(ǫ)}_t and its invariant measure, together with Masuda-type estimates as those used in [29]. The remainder of this paper is organised as follows. In Section 3, the statements of the cut-off phenomenon for the superposition and average processes are established. Section 4 provides a few examples where assumption (H) is fulfilled. Section 5 is devoted to the proofs of the results of this paper. Finally, in the Appendix some tools that are used throughout the paper are established.

Further results
3.1. Superposition process. A simple and nice way to model observational processes that show significant dependence over long time periods is by means of superpositions of independent processes with short-range dependence. In this setting, superpositions of independent OU type processes have provided flexible and analytically tractable parametric models; see for instance Barndorff-Nielsen [3] and the references therein.
On the other hand, the superposition of independent OU type processes that we consider here can be associated with an example of a cylindrical OU process, which is defined in terms of an infinite-dimensional Langevin equation; see Section 7 in Applebaum [2] for further details. As noted in [2], infinite-dimensional processes arise naturally in mathematical modelling through noise that is described as a "superposition" of independent real-valued Lévy processes.
In the sequel, we take the parameter ǫ > 0 and introduce a sequence (ξ^{(j)}, j ≥ 1) of independent real-valued Lévy processes which are not necessarily equally distributed. For each j ≥ 1, we assume that ξ^{(j)} has characteristics (a_j, σ_j, π_j) with a_j ∈ R and σ_j ≥ 0. Similarly as before, for each j ≥ 1, we also consider the associated OUL process X^{(ǫ,j)} defined by the SDE (3.1), where γ_j > 0. Let m = (m_j, j ≥ 1) be a sequence of positive real numbers such that Σ_{j≥1} m_j = 1 and define the superposition process χ^{(ǫ)} := (χ^{(ǫ)}_t, t ≥ 0) as the m-weighted sum of the processes X^{(ǫ,j)}. We now introduce a series of assumptions that guarantee that the superposition process χ^{(ǫ)} is well-defined. We first assume that the initial configuration x := (x_j : j ≥ 1) is m-integrable. The next conditions guarantee that the drift and Gaussian terms of (3.1) are well-defined, and some additional conditions on the Lévy measures (π_j : j ≥ 1) are needed for the jump structure of the process. Under these assumptions, the process (χ^{(ǫ)}_t, t ≥ 0) is well-defined and has a limit distribution µ^{(ǫ,m)} which is independent of the initial configuration x. Moreover, µ^{(ǫ,m)} is self-decomposable. Since µ^{(ǫ,m)} is a self-decomposable distribution on R, it is either degenerate or absolutely continuous with respect to the Lebesgue measure. As we mentioned before, from Theorem 53.1 in [31], its density, when it exists, is unimodal. From Proposition 2.4 in [29] or Proposition 24.19 in [31], it follows that µ^{(ǫ,m)} is non-degenerate if and only if the limit distribution of the process X^{(ǫ,j)} is non-degenerate for some j ≥ 1.
For the main result in this section, we assume that the friction coefficients are uniformly bounded away from 0. In other words, we assume uniform coercivity as follows (3.6) there exists γ > 0 such that γ j ≥ γ for any j ≥ 1.
For every ǫ > 0 and t ≥ 0, define d^{(ǫ,m)}(t) as the total variation distance between χ^{(ǫ)}_t and χ^{(ǫ)}_∞, where χ^{(ǫ)}_∞ denotes the limiting distribution of χ^{(ǫ)} whose law is given by µ^{(ǫ,m)}.

3.2. Average process. Finally, we study the cut-off phenomenon for the average of OUL processes when the driving process is a stable Lévy process. In the diffusive case, Lachaud [24] observed that the average process satisfies window cut-off with the same cut-off and window times as the sample of OU processes. The previous observation is quite surprising since the sample process comprises a huge number of processes. As we will see below, the average process of OUL not only possesses cut-off and window cut-off but also has profile cut-off. Let us consider a sequence (ǫ_n, n ≥ 1) of strictly positive real numbers converging to 0 as n increases. In what follows, we assume that the process ξ is a real-valued stable Lévy process with a linear drift a ∈ R, that is to say, its characteristic exponent ψ_α is as given below, where α ∈ (0, 1) ∪ (1, 2], c > 0 and β ∈ [−1, 1], or α = 1 and β = 0, and in the latter case we understand β tan(πα/2) = 0. We also consider the sequence of OUL processes ((X^{(ǫ_n)}_t, t ≥ 0), n ≥ 1) such that for each n ≥ 1, X^{(ǫ_n)} is defined as the unique strong solution of (2.1) with γ := Q > 0 and initial condition x_0. Let (X^{(ǫ_n),1}_t, . . . , X^{(ǫ_n),n}_t, t ≥ 0) be a sample of n independent copies of X^{(ǫ_n)}. For simplicity of exposition, we denote by (ξ^{(j)}, 1 ≤ j ≤ n) the sequence of independent copies of the stable Lévy process ξ which drive the above sample of OUL processes.
For each n ≥ 1, we define the average process A^{(n)} as the uniform average of the sample, A^{(n)}_t := n^{−1} Σ_{j=1}^n X^{(ǫ_n),j}_t, for t ≥ 0. It is not so difficult to deduce that the uniform average process A^{(n)} satisfies a linear SDE of the same type, driven by the averaged noise L^{(n)} := n^{−1} Σ_{j=1}^n ξ^{(j)}, and it is straightforward to deduce the characteristic exponent of L^{(n)}. Since the stable Lévy process ξ satisfies the log-moment condition (2.7) for α ∈ (0, 2), the average process A^{(n)} has a limiting distribution, which we denote by A^{(n)}_∞. On the other hand, it is well known that the limiting distribution also exists when ξ is a Brownian motion with drift, i.e. when α = 2. In any case, the characteristic exponent of A^{(n)}_∞ can be written down explicitly. For each n ≥ 1, we also define the total variation distance between A^{(n)}_t and its limiting distribution by d^{(n)}(t) := d_TV(A^{(n)}_t, A^{(n)}_∞). It is important to note that the assumption that ξ is a stable Lévy process with drift is crucial in our arguments. Indeed, the dimension of the sample and the cut-off times t_n in the distance d^{(n)} are so strongly related that, without the scaling property, it seems very difficult to deduce any limiting behaviour of d^{(n)}(t_n). To be more precise, the weak limit of ∫_0^{t_n} e^{−γ(t_n−s)} dL^{(n)}_s needs to be well understood under the total variation distance.
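The stability of the averaged noise can be made explicit. Assuming the standard parametrisation of the stable exponent, ψ_α(λ) = iaλ − c|λ|^α(1 − iβ sgn(λ) tan(πα/2)) (a sketch; the paper's exact display of ψ_α should be used instead), independence of the copies ξ^{(j)} gives

```latex
\psi_{L^{(n)}}(\lambda)
\;=\; n\, \psi_{\alpha}\!\left(\frac{\lambda}{n}\right)
\;=\; i a \lambda \;-\; c\, n^{1-\alpha} |\lambda|^{\alpha}
   \Big( 1 - i \beta\, \mathrm{sgn}(\lambda) \tan\!\big(\tfrac{\pi\alpha}{2}\big) \Big),
```

so L^{(n)} is again α-stable with the same drift a and scale c n^{1−α}. This is the scaling property through which the sample size n enters the analysis of d^{(n)}(t_n).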

Smoothness
In this section, we provide a few examples where condition (H) is satisfied. Moreover, in all examples presented below, the marginal distribution of the OUL process X^{(ǫ)} (and of the superposition process χ^{(ǫ)}), for any ǫ > 0, has a density in C_b or C^∞_b. Implicitly, the invariant distribution µ^{(ǫ)} (similarly for µ^{(ǫ,m)}) and the random variable I^♮_∞ (similarly for I^{♮,m}_∞) have densities belonging to C_b or C^∞_b. For simplicity, we use the notation ℜ(z) and ℑ(z) for the real and imaginary parts of any complex number z.
1. The first example that we consider here is the case when Σ is positive definite, i.e. the Lévy process ξ contains a d-dimensional Brownian component. In other words, the matrix Σ has full rank and, implicitly, for any t > 0, X^{(1)}_t and I^♮_t have densities belonging to C_b^∞ (see Masuda [29]). Both distributions possess a Gaussian component, described by the covariance matrix ∫_0^t e^{−sQ} Σ e^{−sQ^T} ds, implying the integrability of the map λ → |λ|^k |µ̂^♮_t(λ)| for any nonnegative integer k, and implicitly the smoothness of the densities of X^{(1)}_t and I^♮_t. Moreover, if the log-moment condition (2.7) holds, then the same holds true for the limiting distributions µ^{(1)} and I^♮_∞, where the Gaussian component is described by the covariance matrix ∫_0^∞ e^{−sQ} Σ e^{−sQ^T} ds. In order to deduce the second part of condition (H), we first observe that the characteristic exponent ψ of the Lévy process ξ satisfies ℜ(ψ(λ)) ≤ −(1/2)⟨λ, Σλ⟩, which tends to −∞ as |λ| → ∞.
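The bound on ℜ(ψ) is an immediate consequence of the Lévy–Khintchine formula; a sketch of the computation, assuming the standard triplet (a, Σ, ν):

```latex
\Re(\psi(\lambda))
\;=\; -\tfrac{1}{2}\langle \lambda, \Sigma \lambda\rangle
  + \int_{\mathbb{R}^d} \bigl(\cos\langle \lambda, z\rangle - 1\bigr)\,\nu(\mathrm{d}z)
\;\le\; -\tfrac{1}{2}\langle \lambda, \Sigma \lambda\rangle
\;\le\; -\tfrac{\sigma_{\min}}{2}\,|\lambda|^{2},
```

where σ_min > 0 denotes the smallest eigenvalue of Σ; the jump integral is nonpositive since cos(x) − 1 ≤ 0.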
2. The second case that we consider here is very similar to the previous example and includes the so-called family of stable Lévy processes. Indeed, we suppose that there exists α ∈ (0, 2) such that lim inf_{|λ|→∞} −ℜ(ψ(λ))/|λ|^α > 0. In other words, there exist constants C > 0 and L > 0 such that ℜ(ψ(λ)) ≤ −C|λ|^α for any |λ| ≥ L. Hence, for any t > 0, we define L_t := c_4^{−1} L e^{c_2 t} and deduce the corresponding bound. The previous integral is clearly finite and, implicitly, the smoothness of the densities of X^{(1)}_t and I^♮_t is obtained. If the log-moment condition (2.7) holds, then the same holds true for the limiting distributions µ^{(1)} and I^♮_∞ and, implicitly, the smoothness of their densities.
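The finiteness of integrals of the form ∫ |λ|^k e^{−C′|λ|^α} dλ, which drives the smoothness argument in this example, can be checked by a short polar-coordinate computation (here C′ > 0 is a generic constant, not one of the constants of Lemma 2.2):

```latex
\int_{\mathbb{R}^d} |\lambda|^{k}\, e^{-C'|\lambda|^{\alpha}}\,\mathrm{d}\lambda
\;=\; \omega_{d}\int_{0}^{\infty} r^{\,k+d-1}\, e^{-C' r^{\alpha}}\,\mathrm{d}r
\;=\; \frac{\omega_{d}}{\alpha}\,(C')^{-\frac{k+d}{\alpha}}\,
  \Gamma\!\Bigl(\frac{k+d}{\alpha}\Bigr) \;<\; \infty,
```

where ω_d denotes the surface measure of the unit sphere in R^d, and the last equality follows from the substitution u = C′ r^α.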
Using exactly the same arguments but with R > c_4^{−1}L, we deduce the second part of condition (H). 3. Our third case imposes an Orey–Masuda or Kallenberg–Bodnarchuk–Kulik type condition on the jump structure of the Lévy process ξ. To be more precise, let us assume that there exists a radial non-negative function κ : R^d → [0, ∞) satisfying: i) as a function of the radius, i.e. κ̄(r) = κ(v) if |v| = r > 0, κ̄ is non-decreasing; ii) for any β > 0, the stated condition holds for any v ∈ R^d with |v| ≥ 1.
In order to prove that, for any t > 0, µ^♮_t possesses a smooth density, we first derive a bound valid for any λ ∈ R^d. Recalling that 1 − cos(x) ≥ 2π^{−2} x² for |x| ≤ π, we deduce, for any λ ∈ R^d, a lower bound for −ℜ(log µ̂^♮_t(λ)) in terms of the integral of 2⟨z, e^{−sQ^T}λ⟩²/π² against ν(dz) ds over the region where |⟨z, e^{−sQ^T}λ⟩| ≤ π.
Next, we take L_t = Lπ c_4^{−1} e^{c_2 t} and observe that if |λ| ≥ L_t and 0 ≤ s ≤ t, then |e^{−sQ^T}λ| ≥ π. Hence, using (4.3) together with (4.1) and the fact that κ̄ is non-decreasing, we deduce a bound which is integrable by our hypothesis. Similarly as in the previous cases, the latter implies that the densities of X^{(1)}_t and I^♮_t, for t ≥ 0, belong to C_b. Moreover, if the log-moment condition (2.7) holds, then the same holds true for the limiting distributions µ^{(1)} and I^♮_∞. Using exactly the same arguments but with R > c_4^{−1}π and t_0(R) := c_2^{−1} ln(c_4 R/π) > 0, we deduce the corresponding bound for any t ≥ t_0(R), which, after a change of variables and using our assumptions on κ, allows us to deduce the second part of condition (H).

5.1. Preliminaries. From the so-called Lévy–Itô decomposition, we can express the Lévy process (ξ_t, t ≥ 0) as the sum of two independent Lévy processes; in other words, ξ is the sum of a Brownian component B (with drift) and a pure jump Lévy process L = (L_t, t ≥ 0) in R^d which is independent of B. The latter implies that we can rewrite the solution of the SDE (2.1) accordingly. For t ≥ 0, we identify the corresponding terms and, for simplicity, we write C_t := (I − e^{−tQ})Q^{−1}a, for t ≥ 0, so that X^{(ǫ)} can be written in this decomposed form. Assuming that ξ satisfies the log-moment condition (2.7), X^{(ǫ)}_t converges in distribution to X^{(ǫ)}_∞ as t goes to ∞. We recall that the law of X^{(ǫ)}_∞ is given by µ^{(ǫ)} and observe that it can be written as a convolution involving the law of I^♮_∞, which denotes the limiting distribution of I^♮_t as t increases. We also recall that I^♮_∞ is Q-self-decomposable and, if it is non-degenerate, then its distribution is absolutely continuous with respect to the Lebesgue measure on R^d (see Yamazato [35]). Bearing all this in mind, in order to study how the process X^{(ǫ)} converges to its equilibrium distribution µ^{(ǫ)} under the total variation distance as t increases, we define the auxiliary metric (5.1) and introduce the error term (5.2), which does not depend on ǫ.
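The identity defining C_t comes from the variation-of-constants formula; a sketch, assuming (2.1) has the Ornstein–Uhlenbeck form dX_t = −QX_t dt + dξ_s with the drift a carried by ξ:

```latex
X^{(\epsilon)}_t \;=\; e^{-tQ}x_0 + \int_0^t e^{-(t-s)Q}\,\mathrm{d}\xi_s,
\qquad
\int_0^t e^{-(t-s)Q}\,a\,\mathrm{d}s
\;=\; \bigl(I - e^{-tQ}\bigr)Q^{-1}a \;=:\; C_t,
```

where the second identity follows from ∫_0^t e^{−(t−s)Q} ds = ∫_0^t e^{−uQ} du = Q^{−1}(I − e^{−tQ}) and the fact that Q^{−1} commutes with e^{−tQ}. (The precise ǫ-scaling of the noise in (2.1) is not reproduced here and is omitted from the sketch.)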
Lemma 5.1. For any ǫ > 0 and t > 0, inequality (5.3) holds. Proof. We first use the triangle inequality to deduce (5.4). On the one hand, from Lemma A.1 part (i), we can rewrite the first term. On the other hand, since X^{(ǫ)}_∞ can be written as a convolution, we apply Lemma A.1 part (iii) to bound the remaining term. Putting all the pieces together in inequality (5.4) allows us to get the first inequality. Similarly, we obtain the reverse bound, where the first and the last identities follow from Lemma A.1 (parts (i), (ii) and (iii)), and the second inequality follows from the triangle inequality. The desired result now follows from both inequalities.
Therefore, our approach for proving the main result consists in determining the cut-off phenomenon (window and/or profile, respectively) for the auxiliary distance (5.1) and in showing that the error term (5.2) vanishes as t goes to infinity. The latter, together with inequality (5.3), implies the cut-off phenomenon for the distance d^{(ǫ)}, as we will see in Subsection 5.3.

Auxiliary metric.
For a better understanding of our arguments, we study separately the behaviour of the auxiliary metric D (ǫ) . In order to do so, we use the asymptotic behaviour in Lemma 2.2.
Proposition 5.2. Let Q ∈ M^+(d). Using the same notation as in Lemma 2.2, we let 0 < ǫ ≪ 1 be such that the stated condition holds, where the limit is as specified; ii) if θ_k ≠ 0 for some k ∈ {1, . . . , m}, then for each c ∈ R the limit superior exists, is different from the null vector, and moreover the stated conclusion holds. Proof. We first prove part (i). Let us define the remainder R^{(ǫ)}(t). Recalling the definition of the auxiliary metric in (5.1) and using the triangle inequality, we deduce an upper bound; we then apply Lemma A.1 part (iii) in order to deduce D^{(ǫ)}(t) ≤ R^{(ǫ)}(t) + D̃^{(ǫ)}(t). On the other hand, using again the triangle inequality, we obtain a lower bound and, similarly as before, we apply Lemma A.1 part (iii) to deduce the reverse inequality. Putting all the pieces together, we get (5.5). Next, from Lemma A.1 part (i), we simplify the resulting expression. On the other hand, straightforward computations lead us to (5.6), valid for any c ∈ R, which implies, together with Lemma 2.2, the corresponding convergence for every c ∈ R. Therefore, Scheffé's Lemma allows us to deduce (5.7): lim_{ǫ→0} R^{(ǫ)}(t_ǫ + cw_ǫ) = 0 for any c ∈ R.
Since θ_k = 0 for every k ∈ {1, . . . , m}, Scheffé's Lemma together with relation (5.6) imply the convergence for every c ∈ R. Using inequality (5.5) and Scheffé's Lemma again, we deduce the claimed limit for every c ∈ R. Finally, we use Scheffé's Lemma once more to derive the limit in part (i). The proof of part (ii) follows from similar arguments to those used above, by taking a subsequence (t_{ǫ′} + cw_{ǫ′}, ǫ′ > 0) of the sequence (t_ǫ + cw_ǫ, ǫ > 0). Indeed, we first observe that inequality (5.5) and the limit (5.7) always hold. On the one hand, we also observe the corresponding identity. On the other hand, from Lemma 2.2 we deduce that the null vector is not in the basin of attraction, which is defined by (5.9): Bas := {v ∈ R^d : there exists a sequence (t_j) ↑ ∞ along which the limit holds}, and which turns out to be non-empty since the vectors {v_k, k = 1, . . . , m} are linearly independent. Next, let D̄ = lim sup_{ǫ→0} D^{(ǫ)}(t_ǫ + cw_ǫ) and take a subsequence (t_{ǫ′} + cw_{ǫ′}, ǫ′ > 0) along which the distance converges to D̄. By the Bolzano–Weierstrass Theorem, there exists a further subsequence (t_{ǫ″} + cw_{ǫ″}, ǫ″ > 0) of (t_{ǫ′} + cw_{ǫ′}, ǫ′ > 0) along which the required limit exists. The proof of part (ii) is now complete.
It is important to note that (5.8) rules out the existence of a profile function, so only window cut-off for the auxiliary distance D^{(ǫ)} can be hoped for. Indeed, from the previous proof, we have deduced the lower limit (5.10), where ṽ(x_0) ∈ Bas; similarly, we can obtain the upper limit (5.11), where v̂(x_0) ∈ Bas. Proof. We first observe that X_∞ has the same distribution as C_∞ + I^♮_∞. From the triangle inequality, we obtain a first bound. From our assumptions, I^♮_∞ has a continuous density and, since lim_{t→∞} C_t = C_∞, an application of Scheffé's Lemma allows us to deduce the convergence, which is equivalent, according to Lemma A.1 part (i), to the corresponding statement for the shifted laws. From our hypothesis, I^♮_t has a continuous density f_t(x) that goes to 0 as |x| goes to ∞. Recalling the form of µ̂^♮_t(λ) for any t > 0 and λ ∈ R^d, we also deduce that, under our assumptions, I^♮_∞ has a continuous density f_∞(x) that goes to 0 as |x| goes to ∞. By the Fourier inversion formula, we know, for Lebesgue-almost every x ∈ R^d, the inversion identity; therefore, for Lebesgue-almost every x ∈ R^d, the corresponding pointwise bound holds. In other words, the proof will be complete once we deduce (5.12). In order to do so, we take R > 0 and introduce a strictly positive constant t_0(R) that only depends on R, and consider the decomposition (5.13). Since I^♮_t converges in distribution to I^♮_∞ as t goes to infinity, µ̂^♮_t(·) converges uniformly on compact sets to µ̂^♮_∞(·) as t goes to infinity. Then, for any R > 0, the first term in (5.13) vanishes. On the other hand, for the second term in (5.13), we get a bound, for any R > 0, that holds for any t > t_0(R). The latter inequality, together with our assumption (H), implies that this bound vanishes as R increases, and implicitly we obtain (5.12). The proof is now complete.
At this stage, we have all the tools to prove Theorem 2.3.
Proof of Theorem 2.3. We first prove part (i). From Lemma 5.1, we have (5.14), and from Proposition 5.3 we know that lim_{ǫ→0} R(t_ǫ + cw_ǫ) = 0 for any c ∈ R. On the other hand, from Proposition 5.2 part (i), we also know the limit of the auxiliary distance. Putting all the pieces together in inequality (5.14), the desired result is obtained. Now, we prove part (ii). Recall from the equalities (5.10) and (5.11) that the limits are attained at ṽ(x_0), v̂(x_0) ∈ Bas, where Bas denotes the basin of attraction defined in (5.9). On the other hand, since lim_{ǫ→0} R(t_ǫ + cw_ǫ) = 0 for any c ∈ R (see Proposition 5.3), inequality (5.14) allows us to deduce the corresponding bounds. Using Scheffé's Lemma, the following limit is obtained. Finally, recalling that ṽ(x_0) ≠ 0 and using Lemma A.3, we get the claimed behaviour, which implies the statement in part (ii) of Theorem 2.3. This completes the proof.
Proof of Corollary 2.4. From the same arguments as in the proof of Theorem 2.3 part (ii), we deduce the analogous limits, where Bas is defined in (5.9). Since the modulus of Σ_{k=1}^{m} e^{iθ_k t} v_k is constant in t, we obtain that |ṽ(x_0)| = |v̂(x_0)|, and using the invariance property (2.9) we deduce the result.

Proofs for the superposition process.
Proof of Lemma 3.1. Let us first fix ǫ > 0. From the Lévy–Itô decomposition, for each j ≥ 1, we can write ξ^{(j)} in terms of a standard Brownian motion B^{(j)} = (B^{(j)}_t : t ≥ 0) and a pure jump Lévy process L^{(j)} = (L^{(j)}_t : t ≥ 0) which is independent of B^{(j)}. Therefore, for each j ≥ 1 and any t ≥ 0, we deduce the corresponding decomposition. In other words, for each t ≥ 0, the relevant random variables
are well-defined. The finiteness of the first two terms is clear. Indeed, from condition (3.2), we obtain the first bound, and for the second term we observe the corresponding estimate from the first condition of (3.3). For the continuous local martingale term M := (M_t, t ≥ 0), we use its quadratic variation to deduce that M_t is well-defined if and only if the associated series is finite. The latter holds if the second condition in (3.3) is satisfied, implying that M_t is well-defined for any t ≥ 0. Finally, we analyse the pure jump term. In order to deduce that the r.v. N_t, which is infinitely divisible, is well-defined for any t ≥ 0, we need to verify that its characteristic function is well-defined. In other words, we need to verify that Σ_{j=1}^∞ ∫_0^t ψ_j(e^{−γ_j s} z m_j) ds exists for z ∈ R, where ψ_j denotes the characteristic exponent of L^{(j)}, for j ≥ 1. In order to do so, we first observe the identity for any B ∈ B(R). Similar computations to those used in the proof of Theorem 17.5 in [31] allow us to deduce the two inequalities, where the left-hand sides are finite by assumptions (3.4) and (3.5). It is important to note that none of our bounds depend on t, implying that χ^{(ǫ)}_t converges in distribution, as t goes to infinity, to χ^{(ǫ)}_∞, whose law does not depend on the initial configuration x_0. Moreover, from its structure, it is not difficult to deduce the decomposition of χ^{(ǫ)}_∞. It remains to prove that R^{(m)}(t) goes to 0 as t goes to infinity. Using the triangle inequality and Lemma A.1 part (i), we deduce the required bound. Since µ^{♮,m}_t satisfies condition (H) for any t > 0, we have that I^{♮,m}_∞ has a continuous density, and the conclusion follows as before.

Proof of Theorem 3.3.
Proof of Theorem 3.3. Let us consider the sequence (ξ^{(j)}, j ≥ 1) of independent copies of the stable process ξ with drift a. Therefore, for n ≥ 1 and j ∈ {1, . . . , n}, we can write the solution explicitly in terms of ξ^{(j)}_t − at, for t ≥ 0. For simplicity of exposition, for each j ≥ 1, we denote the corresponding stochastic integral by Y^{(j)}. In other words, the average process A^{(n)} admits the resulting representation. Observe that the processes (Y^{(j)}, j ≥ 1) defined above are clearly independent and identically distributed. Moreover, it is not difficult to deduce that, for each t > 0, the distribution of Y^{(j)}_t is strictly stable with characteristic exponent ψ_{t,α}(z) = −(c(1 − e^{−αγt})/(αγ)) |z|^α (1 − iβ tan(πα/2) sgn(z)), for z ∈ R.
Moreover, for each j ≥ 1, the limit Y^{(j)}_∞ exists and is also a stable distribution, with characteristic exponent ψ_{∞,α}(z) = −(c/(αγ)) |z|^α (1 − iβ tan(πα/2) sgn(z)). Next, for each t > 0, we define the auxiliary metric D^{(n)}(t) and the error term R^{(n)}(t). Similar reasonings to those used in Lemma 5.1 allow us to deduce the analogous inequality. On the other hand, by the scaling property of the total variation distance (see Lemma A.1 part (ii)), we can renormalise the distance. Moreover, since the sequence (Y^{(j)}, j ≥ 1) is independent and identically distributed, we have the identity in law for each t > 0, where =^d means identity in law or distribution. We observe that the latter identity in law also holds for t = ∞.
In other words, after using the scaling property of the total variation distance, we can rewrite the distance D^{(n)} and the error term R^{(n)} accordingly. Finally, we take the sequences t_n and w_n as in the statement and recall that S_α has a continuous unimodal density. Therefore, an application of Scheffé's Lemma allows us to deduce lim_{n→∞} D^{(n)}(t_n + cw_n) = ‖e^{−c} x_0 + S_α − S_α‖_TV and lim_{n→∞} R^{(n)}(t_n + cw_n) = 0.
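The role of the scaling property in this step can be sketched as follows; this is a heuristic reconstruction, since the displayed identities are not reproduced in this extraction. For a strictly α-stable random variable and i.i.d. copies Y^{(1)}, …, Y^{(n)}:

```latex
\sum_{j=1}^{n} Y^{(j)}_t \;\overset{d}{=}\; n^{1/\alpha}\, Y^{(1)}_t
\qquad\Longrightarrow\qquad
\frac{1}{n}\sum_{j=1}^{n} Y^{(j)}_t \;\overset{d}{=}\; n^{\frac{1}{\alpha}-1}\, Y^{(1)}_t .
```

By the scaling invariance of the total variation distance (Lemma A.1 part (ii)), the common factor n^{1/α−1} can then be removed from both arguments of D^{(n)}, leaving a deterministic shift of order n^{1−1/α} e^{−γt} in front of the initial condition; choosing t_n so that this shift is of order one, and evaluating at t_n + cw_n, produces the profile ‖e^{−c} x_0 + S_α − S_α‖_TV.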
This completes the proof of our result.
Appendix A. Tools. The following section contains useful properties that help make this article more fluid. Since almost all proofs are straightforward, we leave most of the details to the interested reader, except for those that are less direct.
Lemma A.1. Let (Ω, F, P) be a probability space and let X, Y : Ω → R^d be random variables whose laws are absolutely continuous with respect to the Lebesgue measure on (R^d, B(R^d)). For any a, b ∈ R^d and c ∈ R \ {0}, properties (i)–(iii) hold. Proof. The proofs of (i)–(iii) follow from the Change of Variables Theorem and the characterisation of the total variation distance between two probability measures with densities, ‖X − Y‖_TV = (1/2) ∫_{R^d} |f_X(x) − f_Y(x)| dx, where f_X and f_Y are the densities of X and Y, respectively.
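The individual statements of (i)–(iii) are not reproduced in this extraction; a set of standard properties consistent with how the lemma is invoked in Section 5 (translation in part (i), scaling in part (ii), contraction under independent convolution in part (iii)) would read:

```latex
\text{(i)}\;\; \|(X+a) - (Y+b)\|_{TV} = \|X - (Y + b - a)\|_{TV},
\qquad
\text{(ii)}\;\; \|cX - cY\|_{TV} = \|X - Y\|_{TV},
```
```latex
\text{(iii)}\;\; \|(X+Z) - (Y+Z)\|_{TV} \le \|X - Y\|_{TV}
\quad\text{for any } Z \text{ independent of } X \text{ and } Y .
```

Property (iii) is the convolution contraction used repeatedly in the proofs of Lemma 5.1 and Proposition 5.2.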
Lemma A.2 (Convolution). Let (Ω, F , P) be a probability space and X 1 , X 2 , Y 1 , Y 2 , Z be r.v.'s defined on Ω and taking values in R d . i) Assume that X 1 and X 2 are independent, and that Y 1 and Y 2 are independent. Then ii) Assume that (X 1 , Y 1 ) is independent of Z. Then Proof. The idea of the proof follows from the fact that the distribution of the sum of two independent random variables corresponds to their convolution.
The following lemma is stated in [8] and its proof is straightforward; we leave the details to the interested reader. Lemma A.3. Let (Ω, F, P) be a probability space and let X : Ω → R^d be a random variable. Assume that L(X) is absolutely continuous with respect to the Lebesgue measure on (R^d, B(R^d)). Let (α_ǫ, ǫ > 0) be a function such that the stated limit holds. Proof. We first prove part (i). Without loss of generality, we assume that γ_1 ≤ · · · ≤ γ_d.
We define τ(x_0) := min{j ∈ {1, . . . , d} : y_j ≠ 0}, and take the limit as t increases in the previous identity to deduce the claim. For the proof of part (ii), we observe that, since Q is not symmetric, it is not always diagonalisable. Nevertheless, Q has a Jordan decomposition. In other words, there exist an invertible d-square matrix U and a d-square matrix J such that Q = UJU^{−1}. Since any Jordan block can be written as the sum of a diagonal matrix and a nilpotent matrix, the matrix exponential of a Jordan block can be computed explicitly for any t ≥ 0. We let x_0 ∈ R^d \ {0} and write y = U^{−1}x_0 = (y_1, . . . , y_d)^T ≠ 0. We define τ(y) = max{j ∈ {1, . . . , d} : y_j ≠ 0} and observe the corresponding identity. Next, we let r_0 = 0 and consider the partial sums r_j = Σ_{i=1}^{j} k_i for each j ∈ {1, . . . , m}.