Sample-path large deviations for a class of heavy-tailed Markov-additive processes

For a class of additive processes driven by the affine recursion $X_{n+1} = A_{n+1} X_n + B_{n+1}$, we develop a sample-path large deviations principle in the $M_1'$ topology on $D[0,1]$. We allow $B_n$ to have both signs and focus on the case where Kesten's condition holds on $A_1$, leading to heavy-tailed distributions. The most likely paths in our large deviations results are step functions with both positive and negative jumps.


Introduction
Let $\{X_n, n \ge 0\}$ be an affine recursion such that
\[ X_{n+1} = A_{n+1} X_n + B_{n+1} \qquad (1.1) \]
for a sequence of i.i.d. $\mathbb{R}^2$-valued random vectors $(A_n, B_n)$. The Markov chain driven by (1.1) has been studied extensively in the past several decades and continues to pose new research challenges. A classical result, which can be found in [21] and [25], shows that under certain assumptions (see Assumption 2.1 below) the Markov chain $\{X_n, n \ge 0\}$ has a unique stationary distribution $\pi$, for which we have
\[ \pi(x, \infty) \sim c_+ x^{-\alpha} \quad \text{and} \quad \pi(-\infty, -x) \sim c_- x^{-\alpha}, \qquad \text{as } x \to \infty, \qquad (1.2) \]
for some $c_-, c_+$ satisfying $c_- + c_+ > 0$; see the monograph [7] for a recent comprehensive account.
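To build intuition for the Kesten-type tail behaviour (1.2), the following minimal Python sketch simulates the recursion (1.1) for one specific choice of $(A_n, B_n)$ (lognormal $A_n$ and exponential $B_n$; an illustrative assumption, not the setting analyzed in this paper) and compares the empirical tail index of the chain with the exponent $\alpha$ solving $E[A_1^\alpha] = 1$.

    import numpy as np

    # Simulate the affine recursion X_{n+1} = A_{n+1} X_n + B_{n+1} and inspect the
    # tail of its (approximately) stationary distribution, cf. (1.1)-(1.2).
    # Illustrative assumptions: log A_n ~ Normal(mu, sigma^2) with mu < 0 (so the
    # chain is stable) but P(A_n > 1) > 0, and B_n ~ Exponential(1).
    rng = np.random.default_rng(0)
    n, mu, sigma = 1_000_000, -0.25, 0.5

    A = np.exp(rng.normal(mu, sigma, size=n))
    B = rng.exponential(1.0, size=n)

    X = np.empty(n)
    x = 0.0
    for i in range(n):          # iterate X_{i+1} = A_{i+1} X_i + B_{i+1}
        x = A[i] * x + B[i]
        X[i] = x
    X = X[n // 10:]             # discard burn-in

    # For lognormal A_n, Kesten's condition E[A_1^alpha] = 1 gives alpha = -2*mu/sigma^2.
    alpha = -2 * mu / sigma**2

    # Crude tail-index estimate: slope of log P(X > x) versus log x at high quantiles.
    qs = np.quantile(X, [0.99, 0.995, 0.999])
    tail = np.array([np.mean(X > q) for q in qs])
    slope, _ = np.polyfit(np.log(qs), np.log(tail), 1)
    print(f"alpha from E[A^alpha]=1: {alpha:.2f}; empirical tail index: {-slope:.2f}")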
Define $\bar X_n = \{\bar X_n(t), t \in [0,1]\}$ as in (1.3). The focus of the present paper is on sample-path large deviations of the additive process $\bar X_n$, assuming the invariant distribution of $X_n$ has a heavy tail as in (1.2). The study of additive processes of the form (1.3) is less well developed. Classical theory initiated by Donsker and Varadhan (see, for example, [16, 17]) provides powerful tools designed to study large deviations for additive functionals of light-tailed and geometrically ergodic Markov chains. More recent contributions in this area include [26, 27]. Analogs of these sample-path results in a heavy-tailed setting do not seem to be available.
A considerable body of theory has been developed to analyse exceedance probabilities for random walks with heavy-tailed step sizes. Let such a random walk be given by $\{\hat S_n, n \ge 0\}$. Early papers [33, 34] identified appropriate sequences $(x_n)$ for which
\[ P(\hat S_n / n > x_n) = n P(\hat S_1 > x_n)(1 + o(1)), \qquad \text{as } n \to \infty, \qquad (1.4) \]
holds, depending on the tail behavior of the distribution of $\hat S_1$. For a detailed description, we refer to e.g. [6], [15], and [20]. When (1.4) is valid, the so-called principle of a single big jump is said to hold. As a generalization of (1.4), a functional form has been derived in [24], where random walks with i.i.d. multi-dimensional regularly varying (cf. Definition 1.1 of [24]) step sizes are considered.
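The following minimal Monte Carlo sketch illustrates the single-big-jump heuristic behind (1.4) in its simplest form, namely that $P(\hat S_n > y) \approx n\, P(\text{step} > y)$ for a centered random walk with regularly varying steps and suitably large $y$; the Pareto step distribution and the specific values of $n$ and $y$ below are illustrative assumptions.

    import numpy as np

    # Compare P(S_n > y) with n * P(step > y) for a centered heavy-tailed random walk.
    # Illustrative assumptions: Pareto(alpha) steps supported on [1, inf), centered at their mean.
    rng = np.random.default_rng(1)
    alpha, n, y = 1.5, 100, 400.0
    reps, chunk = 500_000, 50_000

    mean_step = alpha / (alpha - 1)        # mean of Pareto(alpha) supported on [1, inf)
    hits = 0
    for _ in range(reps // chunk):
        steps = rng.pareto(alpha, size=(chunk, n)) + 1.0 - mean_step   # centered steps
        hits += np.count_nonzero(steps.sum(axis=1) > y)

    lhs = hits / reps                              # Monte Carlo estimate of P(S_n > y)
    rhs = n * (y + mean_step) ** (-alpha)          # n * P(step > y), exact for Pareto steps
    print(f"P(S_n > y) ~ {lhs:.3e},  n*P(step > y) ~ {rhs:.3e}")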
Several works have focused on the extension of (1.4) to more general processes where there is a certain dependence structure in the increments. Some key references are [18, 23, 30], where stable processes, modulated processes, and stochastic differential equations are considered. Extensions to additive processes of the form considered in this paper have been provided in [31, 32], which also consider more general examples of the driving recursion $X_{n+1} = f_{n+1}(X_n)$. The principle of a single big jump is still valid, but an additional constant can appear on the right-hand side of (1.4).
Extending the results of [31, 32] to the sample-path level poses several phenomenological and technical challenges. So far, all results cited center around the phenomenon where rare events are caused by a single big jump. However, not all rare events are caused by this relatively simple scenario; for early examples see [19, 38]. In a recent paper, [35] provides sample-path large deviations results for Lévy processes and random walks with regularly varying increments, which deal with a general class of rare events that can in particular be caused by multiple jumps. For further examples see [10]. However, the case studied here is considerably harder, as big jumps occur through a condensation phenomenon, namely the concatenation of many small jumps. In particular, a large value of the sample mean is not due to a single large value of the $A_n$ or $B_n$, but to large values of the products $A_1 \cdots A_n$; see also [8, 12]. When studying sample-path large deviations, this phenomenon poses nontrivial technical requirements. In particular, an appropriate topology needs to be considered.
Our approach to overcome these challenges is as follows. We first identify a sequence of regeneration times $r_n$, $n \ge 1$ (see [2]), and split the Markov chain into i.i.d. cycles. By aggregating the trajectory of $\bar X_n$ over regeneration cycles, we obtain a regenerative process with i.i.d. jump distributions and $r_n$, $n \ge 1$, as renewals. Under a set of assumptions originating from [21] and [25], we establish our first result, Theorem 3.1, which states that the "area" under a typical regeneration cycle, denoted by $R$ (see (3.2) below), has an asymptotic power law; to be precise, we obtain the tail estimates (1.5) for some constants $C_-, C_+$. This is related to a result of [12] for the case where $X_n \ge 0$.
Our argument is different, is developed in a two-sided setting, and can be extended to more general recursions; cf. [9]. Using the tail estimates (1.5), we present in Sections 3.2 and 3.3 large deviations results for $\bar X_n$ as in (1.3), which constitutes the second major step in our approach. We achieve this by introducing a new asymptotic equivalence concept (see Lemma 2.13 below), which, together with the decomposition into cycles, allows us to build a bridge between our problem and the one studied in [35]. In the latter paper, the Skorokhod $J_1$ topology is used. However, showing that the residual process (i.e., the contribution of the cycle ongoing at the endpoint of our interval) is negligible in its contribution to $P(\bar X_n \in E)$ is not straightforward, especially when the increments of $\bar X_n$ are dependent, as in the current setting. To overcome this, we switch to a slightly weaker topology, namely the $M_1'$ topology on $D[0,1]$ (as defined in [4], see also Section 3.2 below), and derive asymptotic estimates of events involving the "area" under the last ongoing cycle. This choice of topology is crucial, as it allows many light-tailed jumps occurring within a cycle to merge into a single heavy-tailed jump.
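To illustrate the cycle decomposition that drives Theorem 3.1, the sketch below splits a simulated trajectory of (1.1) into excursions and records the "area" accumulated over each excursion, mimicking the quantities $B$ and $R$ of Section 3.1. As a simplification (an assumption made here only for illustration), every return to the set $[-d, d]$ is treated as a cycle boundary, rather than using the genuine regeneration scheme of Remark 2.2.

    import numpy as np

    # Split the chain X_{n+1} = A_{n+1} X_n + B_{n+1} into excursions away from [-d, d]
    # and record the "area" (sum of the X_i) over each excursion.  Treating every return
    # to [-d, d] as a cycle boundary is only a proxy for the regeneration scheme used in
    # the paper.  Illustrative distributions: lognormal A_n, standard normal B_n.
    rng = np.random.default_rng(2)
    n, mu, sigma, d = 1_000_000, -0.25, 0.5, 1.0

    areas, area, x = [], 0.0, 0.0
    for _ in range(n):
        x = np.exp(rng.normal(mu, sigma)) * x + rng.normal()   # one step of the recursion
        area += x                                              # accumulate the running cycle area
        if abs(x) <= d:                                        # proxy for a regeneration epoch
            areas.append(area)
            area = 0.0

    R = np.abs(np.array(areas))
    # Crude check of the power-law tail of the cycle area, cf. (1.5) and Theorem 3.1.
    qs = np.quantile(R, [0.99, 0.999])
    tail = np.array([np.mean(R > q) for q in qs])
    slope, _ = np.polyfit(np.log(qs), np.log(tail), 1)
    print(f"number of cycles: {len(R)}; empirical tail index of the cycle area: {-slope:.2f}")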
Our main sample-path large deviations results are presented in Section 3. For the case where $B_n$ as in (1.1) is nonnegative, our result establishes the asymptotics (1.6). Precise details can be found in Section 3.2 below. At this point, we just mention that $C_j$ is a measure on $D[0,1]$ for each $j$, and $J^*$ denotes the minimum number of jumps that are required for a nondecreasing, piecewise linear function with drift $E B_1 / (1 - E A_1)$ to be in the set $E$. In Section 3.3 we develop a two-sided version of this result.
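As a reading aid, and only as a heuristic (the precise statement is (1.6); the normalization below is inferred from the rates $n^{-j(\alpha-1)}$ appearing in Section 6), the one-sided result can be summarized as
\[ P\big(\bar X_n \in E\big) \;\approx\; C_{J^*}(E)\, n^{-J^*(\alpha - 1)}, \qquad n \to \infty, \]
so that each of the $J^*$ required jumps contributes a factor of order $n^{-(\alpha-1)}$, the cost of one regeneration cycle with an exceptionally large area.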
While we restrict ourselves to the case of affine recursions in (1.1), the methods developed in this paper can be extended to more general recursions of the form $X_{n+1} = f_{n+1}(X_n)$, in which $f_n(z)/z \to A_n$ as $z \to \infty$; we refer to [9] for details. On the other hand, our methods require the assumption $A_n \ge 0$ in (1.1). If this assumption no longer holds, it is possible to have big jumps of opposite sign in the same regeneration cycle, which requires a topology weaker than $M_1'$. Functional central limit theorems allowing $A_n$ to have both signs were recently derived in [3] using the $M_2$ topology.

Proposition 2.5. Assume that one of the following conditions holds.
1. Let $B_1 \ge b$ a.s. for some $b > 0$. Moreover, there exist intervals, for some $a_1 < a_2$, $b_0$, $\delta > 0$, a $\sigma$-finite measure $\nu_0$ with $b_0$ in the support of $\nu_0$, and a constant $c_0 > 0$, such that for any Borel sets $D_1, D_2 \subseteq \mathbb{R}$, where $|\cdot|$ denotes the Lebesgue measure on $\mathbb{R}$.

2. There exist intervals, $\delta > 0$, a $\sigma$-finite measure $\nu_0$ with $a_0$ in the support of $\nu_0$, and a constant $c_0 > 0$ such that for any Borel sets $D_1, D_2 \subseteq \mathbb{R}$,

Then, for any $x_0 \in \mathbb{R}$, there exist $\epsilon = \epsilon(x_0)$, $\theta > 0$, and an open interval $E_0$ such that (2.3) holds. Our next result implies the geometric decay of $P(r_1 > k)$ as $k \to \infty$.
Lemma 2.6. Suppose that Assumptions 2.1 and 2.4 hold. Let $\{r_n\}_{n \ge 0}$ be the sequence of regeneration times associated with $C_0$. Let $E_1$ be a bounded set. There exists $t > 1$ such that

A useful change of measure
Another helpful tool in our analysis is the so-called $\alpha$-shifted change of measure (see e.g. [13, 14]). Let $\nu$ denote the distribution of $(\log A_n, B_n)$ and define the $\alpha$-shifted measure $\nu_\alpha$ by

For a stopping time $T$, let $D_\alpha^T$ be the dual change of measure such that, under $D_\alpha^T$, $(\log A_n, B_n)$ has law $\nu_\alpha$ for $n \le T$ and $\mathcal{L}(\log A_n, B_n) = \nu$ for $n > T$. (2.4) Let $P_\alpha$, $P_{D_\alpha^T}$, $E_\alpha$, and $E_{D_\alpha^T}$ denote probability and expectation w.r.t. the $\alpha$-shifted measure $\nu_\alpha$ and the dual change of measure $D_\alpha^T$, respectively. We then have the following result.
Result 2.7 (Lemma 5.3 of [14]). Let $T$ and $\tau$ be stopping times w.r.t. $\{X_n\}_{n \ge 0}$, let $g: \mathbb{R}^\infty \to [0, \infty]$ be a deterministic function, and let $g_n$ denote its projection onto the first $n+1$ coordinates, i.e., $g_n(x_0, \ldots, x_n) = g(x_0, \ldots, x_n, 0, 0, \ldots)$. Then

Remark 2.8. Note that, by the same argument, if a random variable $R$ is measurable w.r.t. the stopped $\sigma$-algebra $\mathcal{F}_T$, then

Our analysis relies on the fact that the Markov chain $X_n$ is closely related to a multiplicative random walk; see (2.5). Roughly speaking, the process $X_n$ resembles a perturbation of a multiplicative random walk, in an asymptotic sense (for details see [13, 14]). Hence, it is natural to consider the "discrepancy" process between $X_n$ and $\prod_{i=1}^n A_i$, which is defined in (2.6), where $S_n$ is as in (2.5). Under the $\alpha$-shifted measure, we have $E_\alpha[\log A_1] = E[A_1^\alpha \log A_1] > 0$ by Assumption 2.1 and Theorem 2.4.4 of [7]. Consequently, we have the following result.
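For orientation, this exponential tilt typically takes the following standard form (a sketch under Kesten's condition $E[A_1^\alpha] = 1$ from Assumption 2.1, not a restatement of the paper's precise definition): with $\nu$ the law of $(\log A_1, B_1)$,
\[ \nu_\alpha(ds, db) = e^{\alpha s}\, \nu(ds, db), \qquad \text{i.e.} \quad \frac{d\nu_\alpha}{d\nu}(\log A_1, B_1) = A_1^\alpha, \]
which is again a probability measure precisely because $E[A_1^\alpha] = 1$, and under which $E_\alpha[\log A_1] = E[A_1^\alpha \log A_1] > 0$, consistent with the positive drift noted above.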

M-convergence
We briefly review the notion of M-convergence [28, 35], and introduce a novel asymptotic equivalence concept. Let $(S, d)$ be a complete separable metric space, and let $\mathscr{S}$ be the Borel $\sigma$-algebra on $S$. Given a closed subset $C$ of $S$, let $S \setminus C$ be equipped with the relative topology as a subspace of $S$, and consider the associated sub-$\sigma$-algebra $\mathscr{S}_{S \setminus C} = \{E : E \subseteq S \setminus C,\ E \in \mathscr{S}\}$ on it. Define $C^r = \{x \in S : d(x, C) < r\}$ for $r > 0$, and let $\mathbb{M}(S \setminus C)$ be the class of measures defined on $\mathscr{S}_{S \setminus C}$ whose restrictions to $S \setminus C^r$ are finite for all $r > 0$. Topologize $\mathbb{M}(S \setminus C)$ with the sub-basis $\{\{\nu \in \mathbb{M}(S \setminus C) : \nu(f) \in G\} : f \in \mathcal{C}_{S \setminus C},\ G \subseteq \mathbb{R}_+ \text{ open}\}$, where $\mathcal{C}_{S \setminus C}$ is the set of real-valued, nonnegative, bounded, continuous functions whose support is bounded away from $C$ (i.e., $f$ vanishes on $C^r$ for some $r > 0$). The following characterization of M-convergence can be considered as a generalization of the classical notion of weak convergence of measures; see e.g. [5].
for all open $G \in \mathscr{S}_{S \setminus C}$ bounded away from $C$.
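For completeness, the characterization just referred to takes the following standard portmanteau form (a sketch of the criterion from [28, 35], stated here as an assumption about the exact form used in the paper): $\nu_n \to \nu$ in $\mathbb{M}(S \setminus C)$ if and only if
\[ \limsup_{n \to \infty} \nu_n(F) \le \nu(F) \quad \text{for all closed } F \in \mathscr{S}_{S \setminus C} \text{ bounded away from } C, \]
and
\[ \liminf_{n \to \infty} \nu_n(G) \ge \nu(G) \quad \text{for all open } G \in \mathscr{S}_{S \setminus C} \text{ bounded away from } C. \]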
We now introduce a new notion of equivalence between two families of random objects, which will prove to be useful in Section 6. Let $F^\delta = \{x \in S : d(x, F) \le \delta\}$ and $G^{-\delta} = ((G^c)^\delta)^c$. Note that when it comes to the fattening and shaving of sets, we denote open sets with a superscript and closed sets with a subscript.
Definition 2.11. Suppose that $X_n$ and $Y_n$ are random elements taking values in a complete separable metric space $(S, d)$. $Y_n$ is said to be asymptotically equivalent to $X_n$ with respect to $\epsilon_n$ and $C$ if, for each $\delta > 0$ and $\gamma > 0$,

Remark 2.12. Note that asymptotic equivalence w.r.t. $C$ implies asymptotic equivalence w.r.t. $C'$ if $C \subseteq C'$. In view of this, the strongest notion of asymptotic equivalence w.r.t. a given sequence $\epsilon_n$ is the one w.r.t. the empty set. In this case, the conditions for asymptotic equivalence reduce to a simple condition, required for any $\delta > 0$. That special case of asymptotic equivalence has been introduced and applied in [35]. In our context, this simple condition suffices for the case $B_1 \ge 0$ in Section 3.2; however, we have to work with a nonempty $C$ when we deal with general $B_1$ in Section 3.3.
The usefulness of this notion of equivalence comes from the following result.
Lemma 2.13. Suppose that $\epsilon_n^{-1} P(X_n \in \cdot) \to \nu(\cdot)$ in $\mathbb{M}(S \setminus C)$ for some sequence $\epsilon_n$ and a closed set $C$. If $Y_n$ is asymptotically equivalent to $X_n$ with respect to $\epsilon_n$ and $C$, then the law of $Y_n$ has the same normalized limit, i.e., $\epsilon_n^{-1} P(Y_n \in \cdot) \to \nu(\cdot)$ in $\mathbb{M}(S \setminus C)$.

Main results
This section is organized as follows. In Section 3.1, we analyze the tail estimates of the area under the first return time and the regeneration cycle, which are needed to derive the sample-path large deviations of $\bar X_n$. In Section 3.2 we derive such results in the case where $B_1 \ge 0$. The two-sided case is more involved and is treated in Section 3.3.

Tail estimates on the area under the first return time/regeneration cycle
Let $\tau_d$, defined in (3.1), denote the first return time of $X_n$ to the set $[-d, d]$, where $d$ is such that $[-d, d] \cap \operatorname{supp}(\pi) \neq \emptyset$. Recall that $\{r_n\}_{n \ge 0}$ is the sequence of regeneration times of $\{X_n\}_{n \ge 0}$. We denote the area under the first return time and the regeneration cycle by $B$ and $R$, respectively (see (3.2)).

Theorem 3.1. We have
When $B_1$ can take both signs, the situation is much more delicate, and we sketch how one can deal with this issue. One way is to derive sufficient conditions for the support of $Z$ under $P_\alpha$ to be the entire real line, from which strict positivity of both $C_+$ and $C_-$ can be inferred. Such a sufficient condition can be derived from a careful inspection of the proof of Theorem 2.5.5 (1) of [7] (which is a result due to [22]); for example, one obtains conditions under which the support of $Z$ is the whole real line.

One-sided large deviations
We first consider the case where $B_1$ is nonnegative. To deal with the dependence structure of the Markov chain within the regeneration cycle, we consider in this section the space $D = D[0,1]$, consisting of real-valued functions on $[0,1]$ that are right-continuous with left limits. We endow $D$ with the $M_1'$ topology. To describe this topology in detail, let $\xi \in D$ and define the extended completed graph $\Gamma'_\xi$ of $\xi$, where $\xi(0^-) = 0$. Define an order on the graph $\Gamma'_\xi$ by saying that $(x_1, t_1) \le (x_2, t_2)$ if either (i) $t_1 < t_2$ or (ii) $t_1 = t_2$ and $|\xi(t_1^-) - x_1| \le |\xi(t_2^-) - x_2|$. We say that a mapping $(u, s): [0,1] \to \Gamma'_\xi$ is a parametric representation of $\xi$ if $r \mapsto (u(r), s(r))$ is continuous and nondecreasing. Let $\Pi'(\xi)$ be the set of all parametric representations of $\xi \in D$. For any $\xi_1, \xi_2 \in D$, the $M_1'$ metric is defined in terms of these parametric representations; see the display below. For the rest of the paper, we consider the topology w.r.t. this metric, unless specified otherwise.
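For concreteness, the resulting metric takes the following standard form (a sketch following the construction in [4]; this exact expression is an assumption here):
\[ d_{M_1'}(\xi_1, \xi_2) \;=\; \inf_{(u_i, s_i) \in \Pi'(\xi_i),\ i = 1, 2} \Big( \|u_1 - u_2\|_\infty \vee \|s_1 - s_2\|_\infty \Big), \]
where $\|\cdot\|_\infty$ denotes the uniform norm on $[0, 1]$.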
For the one-sided large deviations result, we need the following elements. We say that a function $\xi \in D$ is piecewise constant if there exist finitely many time points $t_i$ such that $0 = t_0 < t_1 < \cdots < t_k \le 1$ and $\xi$ is constant on each interval $[t_{i-1}, t_i)$, where $\xi(0^-) = 0$. For each integer $j$, define $D_{<j} = \{\xi \in D : \xi \text{ piecewise constant and nondecreasing},\ |\mathrm{Disc}(\xi)| < j\}$.
For $z \in \mathbb{R}$ and each integer $j$, define the corresponding sets. For each constant $\gamma > 1$, let $\nu_\gamma(x, \infty) = x^{-\gamma}$, and let $\nu_\gamma^j$ denote its restriction, where $\alpha$ is as in Assumption 2.1 and the random variables $U_i$, $i \ge 1$, are independent and uniformly distributed on $[0, 1]$. For $E \subseteq D$ and $z \in \mathbb{R}$, define the associated quantities; with these in place, we state below the main theorem for the one-sided case. Recall $C_+$ defined in Theorem 3.1. As kindly pointed out by a referee, if $B_1 \ge 0$ a.s., then thanks to Assumption 2.1 we have $P(B_1 = 0) < 1$, and $C_+$ must be strictly positive, due to (2.12) in [21].

Two-sided large deviations
Similarly to Section 3.2, we need the following elements. Define the set of step functions with fewer than $j$ discontinuities as above. Let $C^z_{0,0}$ be the Dirac measure concentrated on the linear function $zt$. For each such pair of indices, where $U_i, V_i$ are independent and uniformly distributed on $[0, 1]$. 1. For each $j \ge 1$, where the summations are over all $(l, m)$ that belong to the set $I = J_\mu(E)$.

Proofs of Section 2
Proof of Proposition 2.5. Part 1) and Part 2) for the case $x_0 = 0$ are in [7, page 22]. Hence, we focus on showing Part 2) for the case $x_0 \neq 0$.
Note that for any Borel set $E$, (2.2) implies the required bound. Pick an $\epsilon > 0$ sufficiently small so that $E_0$ is nonempty and $\epsilon < |x_0| \wedge a_0$. Note that if $x \in B_\epsilon(x_0)$, $z \in E_0$, and $a \in (a_0 - \epsilon, a_0 + \epsilon)$, then $z \in E_0$ implies $z - ax \in I_2$; that is, $\mathbb{1}_{\{z \in E_0\}} \le \mathbb{1}_{\{z - ax \in I_2\}}$. Therefore, we have that for all $x \in B_\epsilon(x_0)$,

Proof of Lemma 2.6. By Theorem 15.2.6 of [29] and Result 2.3, any bounded set is $h$-geometrically regular with $h(x) = |x|^\delta + 1$, $\delta \in (0, \alpha)$. Thus, from the definition of $h$-geometric regularity (cf. page 373 of [29]), there exists $t > 1$ such that $\sup_{x \in E_1} E[\cdots] < \infty$. Note that for any $s \in (1, t)$, by Jensen's inequality, we get the analogous bound. From the regeneration scheme as described in Remark 2.2, we obtain the claim.

Proof of Lemma 2.9. The second statement follows from Theorem 2.1.3 of [7], since the random walk $-S_n$ has a negative drift under $P_\alpha$ and $E_\alpha[\log B_1] < \infty$. To prove the first statement, we begin by using a similar argument, invoking Assumption 2.1, to conclude that the random variable $\sum_{i=1}^\infty |B_i| e^{-S_i}$ is a.s. finite. Consequently, we can lower bound the process accordingly, and we see that, combining this with the fact that the set $[M, \infty)$ is attainable by $\{|X_n|\}_{n \ge 0}$ for sufficiently large $M$ (by Assumption 2.1), Theorem 8.3.6 of [29] completes the proof.
Proof of Lemma 2.13. Let $G$ be an open set bounded away from $C$, so that $G \subseteq (S \setminus C)^{-\gamma}$ for some $\gamma > 0$. For a given $\delta > 0$, due to the assumed asymptotic equivalence, $P(Y_n \in G)$ can be bounded from below in terms of $P(X_n \in G^{-\delta})$, and hence we arrive at the lower bound by taking $\delta \to 0$. Now, turning to the upper bound, consider a closed set $F$ bounded away from $C$, so that $F \subseteq (S \setminus C)^{-\gamma}$ for some $\gamma > 0$. Given a $\delta > 0$, by the asymptotic equivalence assumption, $P(Y_n \in F)$ can be bounded from above in terms of $P(X_n \in F^\delta)$, as long as $\delta$ is small enough so that $F^\delta$ is bounded away from $C$. Note that $\{F^\delta\}$ is a decreasing sequence of sets, $F = \bigcap_{\delta > 0} F^\delta$ (since $F$ is closed), and $\nu \in \mathbb{M}(S \setminus C)$ (and hence $\nu$ is a finite measure on $S \setminus C^r$ for some $r > 0$ such that $F^\delta \subseteq S \setminus C^r$ for some $\delta > 0$). Due to the continuity from above of finite measures, $\lim_{\delta \to 0} \nu(F^\delta) = \nu(F)$. Therefore, we arrive at the upper bound $\limsup_{n \to \infty} \epsilon_n^{-1} P(Y_n \in F) \le \nu(F)$ by taking $\delta \to 0$.

Proofs of Section 3.1
This section provides the proof of Theorem 3.1. Before turning to technical details, we briefly describe our strategy for proving the tail asymptotics of $B$; a similar idea is behind the proof for $R$. Fix $0 < \gamma < \beta < 1$. We can then write the decomposition (5.2) of $B$ into three terms. We will choose $\beta$ close enough to 1 and $\gamma$ far enough from 1 so that $\beta + \gamma > 1$, and we can find $\rho \in (\gamma, \beta)$ such that $\beta - \gamma + \rho > 1$. The proof of Theorem 3.1 (1) is based on the following steps.
• On the event $\{T(u^\beta) < \tau_d\}$, the first and the last term on the right-hand side (r.h.s.) of (5.2) are negligible in their contribution to the tail asymptotics. Proposition 5.1 below establishes these properties. Lemma 5.5 is useful in showing Proposition 5.1.
• In view of the previous bullet, the second term on the r.h.s. of (5.2) plays the key role in $P(B > u)$. Our analysis relies on the fact that the Markov chain $X_n$ resembles a multiplicative random walk in the corresponding regime. Proposition 5.2 below establishes the corresponding asymptotics. Lemmas 5.6 and 5.7 are helpful for proving Proposition 5.2.
Similarly, the proof of Theorem 3.1 (2) hinges on Propositions 5.3 and 5.4, which play similar roles to Propositions 5.1 and 5.2, respectively.
Next we prove Proposition 5.1. For this, we need the following lemma. Let $\{Y_n\}_{n \ge 0}$ be the $\mathbb{R}_+$-valued Markov chain defined by

Lemma 5.5. Suppose that Assumptions 2.1 and 2.4 hold. Let $L > 0$ be given, and let $\epsilon > 0$ be such that $\lfloor \alpha - \epsilon \rfloor \ge 1$. Then there exists a positive constant $c$ such that, for sufficiently large $x$, In particular, a corresponding bound holds.

Proof of Proposition 5.1. To begin with, note that the quantity in the first claim is bounded by a term which decays exponentially. It remains to show the second claim. Let $\rho$ be a number such that $\rho \in (\gamma, \beta)$ and $\beta - \gamma + \rho > 1$, and define $E_1(u)$ accordingly, where the second term in the last equation is bounded by $P(\tau_d > u^{1-\rho})$, and hence is of order $o(u^{-\alpha})$. It remains to analyze the first term, which is bounded by $P(T(u^\beta) < \tau_d, E_1(u))$. Our goal here is to show (5.7). To begin with, note that, on $E_1(u)$, where $K^\gamma_\beta(u) < \infty$ almost surely, we can define $\{Y'_n\}_{n \ge 0}$ as follows; we then have the corresponding bound. Since we have chosen $\beta$, $\gamma$, and $\rho$ in such a way that $\beta - \gamma + \rho > 1$, it remains to show that the second term on the r.h.s. is $O(u^{-\alpha(\rho - \gamma)})$. Recall the definition of $Y_n$ and $\tau$, and note that, from the strong Markov property, the relevant probability is controlled as $u \to \infty$. Recall Remark 2.8 and consider $T = \inf\{n \ge 1 : Y_n \ge u^\rho\}$. We obtain (5.9) to prove that the r.h.s. of (5.8) is $O(u^{-\alpha(\rho - \gamma)})$, and hence that $P(T(u^\beta) < \tau_d, E_1(u))$ is of the required order. To show (5.9), note that (5.10)-(5.13) hold for some $L > 0$, where (5.10) is from Fatou's lemma and Minkowski's inequality, (5.11) is from Remark 2.8 with $T = k$ and $R = |B_k|^\alpha \mathbb{1}_{\{k < \tau\}}$, (5.12) is from the fact that $\mathbb{1}_{\{k < \tau\}} \le \mathbb{1}_{\{k \le \tau\}}$ and $\mathbb{1}_{\{k \le \tau\}} \in m\mathcal{F}_{k-1}$, so that $\mathbb{1}_{\{k \le \tau\}}$ and $|B_k|^\alpha$ are independent, and (5.13) is from Markov's inequality. Using Lemma 5.5 above, we prove (5.9), which, in turn, proves (5.7). This concludes the proof of Proposition 5.1.

Set $G^+(u)$ and $G^-(u)$ (see (5.15)) and recall $C_\infty$ in (3.3). The following two lemmas are useful in proving Proposition 5.2.
Lemma 5.6. Suppose that Assumptions 2.1 and 2.4 hold. Under the measure $P_\alpha$, the stated convergences hold. Moreover, $G^+(u)$ and $G^-(u)$ are bounded in $u$ by some constants almost surely.
Proof of Proposition 5.2. We focus on deriving the first asymptotics, since the second one follows using similar arguments. Note the decomposition (5.16). We first consider (I.1). Applying the dual change of measure $D_\alpha^{T(u^\beta)}$ together with Result 2.7, we obtain the corresponding representation. Recall the expression for $Z_n$ given in (2.6). Note that (5.17) holds for all $n \ge 0$. Using Lemma 5.6, Lemma 5.7, the dominated convergence theorem, and the fact that $T(u^\beta) \to \infty$ as $u \to \infty$, we obtain the limit as $u \to \infty$. Analogously, we obtain a similar representation in which $G^-(u)$, defined in (5.15), appears. Using (5.16), (5.17), and (5.18), we prove the first asymptotics in Proposition 5.2. The second one can be shown analogously.
We need the following lemmas to prove Proposition 5.3. Let $Y_{n+1} = A_{n+1} Y_n + |B_{n+1}|$ and let $r$ be the first time that $Y_n$ regenerates.

Lemma 5.8. Suppose that Assumptions 2.1 and 2.4 hold. Let $\epsilon > 0$, and let $L > 0$ be such that $\lfloor \alpha - \epsilon \rfloor \ge 1$. Then there exists a positive constant $c$ such that, for sufficiently large $x$,

Lemma 5.9. Suppose that Assumptions 2.1 and 2.4 hold. We have that, where $X$ is the positive random variable such that $\log X_{T(u)} - \log u$ converges in distribution to $X$ as $u \to \infty$ under $P_\alpha$.
Proof of Proposition 5.3. By replacing $\tau_d$ with $r_1$, the proposition can be shown using almost identical arguments as in the proof of Proposition 5.1. Nonetheless, we need to show that
• $P(T(u^\beta) < r_1) \sim c\,u^{-\alpha\beta}$ for some constant $c$, and that
• the analogous negligibility estimate holds.
For this, we use Lemmas 5.8 and 5.9 above.
Proof of Proposition 5.4. Using Lemma 5.6, Lemma 5.7, the dominated convergence theorem, and the fact that $T(u^\beta) \to \infty$ as $u \to \infty$, one can prove the first asymptotics.
The second one follows by a similar analysis.
Next we provide proofs of all lemmas in this section. To show Lemma 5.5, we introduce a result on bounding functionals of passage times for Markov chains. Let $\{V_n\}_{n \ge 0}$ be an $\{\mathcal{F}_n\}$-adapted stochastic process taking values in an unbounded subset of $\mathbb{R}_+$. Let $\{U_n\}_{n \ge 0}$ be another $\{\mathcal{F}_n\}$-adapted stochastic process taking values in an unbounded subset of $\mathbb{R}_+$ such that $U_n$ is integrable for all $n \ge 0$. Let $\tau^V_b = \inf\{n \ge 0 : V_n \le b\}$ be the first time $V_n$ returns to the set $[0, b]$.
Result 5.10 (Theorem 2$'$ of [1]). Suppose there exist a positive number $d$ and functions $g$ and $h$ that are positive on $(b, \infty)$, and
• $f$ is convex for sufficiently large $x$,
• $\log f'$ is concave for sufficiently large $x$,
• there exists a positive constant $c_f$ such that
Then there exists a positive constant $c$ such that, for all $x \ge b$,

Proof of Lemma 5.5. We first apply Result 5.10 with $f(y) = y^{\alpha + L}$, $g(y) = c_2 y^{\underline\alpha}$, $h(y) = y^{\underline\alpha}$, where $\underline\alpha = \lfloor \alpha - \epsilon \rfloor$ and $c_2$ is a constant that we construct below. From the binomial formula, we see that there exists a positive constant $c_1$ that depends on the first $(\underline\alpha - 1)$ moments of $A_1$ and $B_1$ such that, on $\{Y_n \ge 1\}$, the corresponding drift bound holds. Using the fact that $0 < \underline\alpha < \alpha$ and that the moment generating function of $\log A_1$ is strictly convex on $[0, \alpha]$, we have $E[A_1^{\underline\alpha}] < 1$. Thus, there exist a sufficiently large constant $d'$ and a sufficiently small constant $c_2$ such that, on $\{Y_n > d'\}$, the drift condition holds. As mentioned at the beginning of the proof, we set $g(y) = c_2 y^{\underline\alpha} = c_2 h(y)$, so that the required inequality holds. Using Minkowski's inequality, we obtain a corresponding bound for some $c_4, c_5 > 0$. Along with (5.19), this implies that there exists a $c > 0$ such that $E[\tau^{\alpha + L} \mid Y_0 = x] \le c x^{\lfloor \alpha - \epsilon \rfloor}$ for sufficiently large $x$.
The following lemma is useful in proving Lemma 5.6. Define (5.20).

Lemma 5.11. Suppose that Assumptions 2.1 and 2.4 hold. Fix an arbitrary constant $v$ such that $|v| > 1$. For any $\beta + \gamma > 1$ and any $\epsilon > 0$ there exists a $u_0$ sufficiently large so that, for all $u \ge u_0$,

Proof of Lemma 5.6. We first prove the statements associated with $G^+(u)$.
Again from (3.3), along with the assumption that $(1 - \beta)\alpha < \gamma$, we get (5.30). In view of (5.28)-(5.30), we have the first claim. Combining this with (5.24), we obtain the convergence. Next we show boundedness of $G^+(u)$. Using (5.24), for $\epsilon > 0$, and by separately considering $v \le 1 + \epsilon$ and $v \ge 1 + \epsilon$, there exists $U(\epsilon)$ (independent of $v$) such that (5.31) holds. Thus the boundedness follows. Finally, we show the statements involving $G^-$. By the Markov property, it is sufficient to show that, for any arbitrary $\epsilon > 0$ and $v < -1$, (5.20) holds. We have this thanks to Lemma 5.11. The boundedness of $G^-(u)$ follows using similar arguments as in (5.31).

Remark 5.12. Using similar arguments as in the proof of Lemma 5.6, one can show the analogous statements. As a consequence of this result, we have the corresponding limit.

Proof of Lemma 5.7. Note the bound involving $Z^+$. Moreover, using Minkowski's inequality, we have the corresponding estimate. Finally, using Lemma 5.5 above, we have $E[r^{\alpha + L} \mid Y_0 = x] \le c x^{\lfloor \alpha - \epsilon \rfloor}$ for sufficiently large $x$.
Applying a change of measure argument, we obtain the corresponding representation. It remains to consider (II.1). Note that, given $\mathcal{F}_n$, $n \le T(u)$, the random variable $\log|X_{T(u)}| - \log u$ converges in distribution to some positive random variable $X$ (which is independent of $\mathcal{F}_n$, $n \le T(u)$) as $u \to \infty$, under the $\alpha$-shifted measure (cf. e.g. Theorem 3.8 of [13]). Hence we have the claimed limit. Moreover, using dominated convergence and the fact noted above, we complete the proof.

Proof of Lemma 5.11. To begin with, we write, for some $\delta > 0$, a decomposition into the terms (III.1) and (III.2). To bound (III.1), we have the following estimate. Combining this with Lemma 5.5, we conclude that there exist $c$ and $u_0$ such that the desired bound holds. Note that we can set $L = L(\delta, \alpha, \beta)$ to be arbitrarily large. Combining the estimates above for (III.1) and (III.2), we conclude the proof.

Proofs of Sections 3.2 and 3.3
Again, we briefly describe our proof strategy before diving into the technicalities.
where $\{r_i\}_{i \ge 0}$ is the sequence of regeneration times as in Remark 2.2. Thanks to Theorem 4.1 of [35] and Theorem 3.1 above, we are able to establish an asymptotic equivalence between $\bar X'_n$ and a random walk $\bar W_n$ that will be specified below. This allows us to provide a large deviations result for $\bar X'_n$, using Lemma 2.13.

Large deviations for heavy-tailed Markov-additive processes
In both the one-sided and the two-sided case, we will show that the residual process $\bar X_n - \bar X'_n$ is negligible in an asymptotic sense. We state here three lemmata that will play key roles in the proofs of Theorems 3.2 and 3.3. Let $\bar W_n = \{\bar W_n(t), t \in [0,1]\}$ be the random walk process built from the $X'_i$, where $X'_i$ is as in (6.1). We begin by stating an asymptotic equivalence between $\bar X'_n$ and $\bar W_n$, however w.r.t. the $J_1$ topology, which is stronger than the $M_1'$ topology introduced at the beginning of Section 3.2. Let $d_{J_1}$ denote the Skorokhod $J_1$ metric on $D$, defined in the usual way, where id denotes the identity mapping, $\|\cdot\|_\infty$ denotes the uniform norm, that is, $\|x\|_\infty = \sup_{t \in [0,1]} |x(t)|$, and $\Lambda$ denotes the set of all strictly increasing, continuous bijections from $[0,1]$ to itself. Moreover, for $j \ge 0$, define the corresponding sets. Consider the metric space $(D, d_{J_1})$. Suppose that Assumptions 2.1 and 2.4 hold. For any $j \ge 1$, the following holds.
1. If $B_1 \ge 0$ and $C_+$ as in Theorem 3.1 is strictly positive, then the stochastic process $\bar X'_n$ is asymptotically equivalent to $\bar W_n$ w.r.t. $n^{-j(\alpha - 1)}$ and $D^\mu_{j-1}$.
2. If $C_+$ and $C_-$ as in Theorem 3.1 satisfy $C_+ C_- > 0$, then the stochastic process $\bar X'_n$ is asymptotically equivalent to $\bar W_n$ w.r.t. $n^{-j(\alpha - 1)}$ and $D^\mu_{\ll j}$.

Proof. We only show part 2), since part 1) can be proved by a similar argument. By Lemma 2.13, it is sufficient to show (6.4) for any $\delta > 0$ and $\gamma > 0$. To prove (6.4), it is convenient to consider the space of paths on a longer time horizon $[0, 2]$. Let $\bar W_n$ denote the stochastic process $\{\bar W_n(t), t \in [0, 2]\}$ over the time horizon $[0, 2]$, and let $D^{\mu;[0,2]}_{\ll j}$ denote the space of step functions on $[0, 2]$ that corresponds to $D^\mu_{\ll j}$. To prove (6.5), we adopt the construction of a piecewise linear nondecreasing homeomorphism $\bar\lambda_n$ from [35, proof of Theorem 4.1]. Let $t_0 = 0$, let $t_i$ be the $i$-th jump time of $N(n\,\cdot)$, and let $t_L$ be the last jump time of $N(n\,\cdot)$. Let $L = (\lfloor n / E r_1 \rfloor - 1) \wedge N(n)$. Define $\bar\lambda_n$ in such a way that $\bar\lambda_n(t) = E r_1 N(nt)/n$ on $t_0, \ldots, t_L$, $\bar\lambda_n(1) = 1$, and $\bar\lambda_n$ is a piecewise linear interpolation in between. For such $\bar\lambda_n$, $\bar W_n(\bar\lambda_n(t)) = \bar X'_n(t)$ for all $t \in [0, t_L]$, and hence the first term is controlled. The second term can be bounded (with high probability) as follows. For an arbitrary $\epsilon > 0$, consider two cases: $\lfloor n / E r_1 - n\epsilon \rfloor < N(n) < \lfloor n / E r_1 \rfloor$, and the complementary case, which is treated separately. From (6.7) and (6.8), we see that on the event $\{\lfloor n / E r_1 - n\epsilon \rfloor < N(n) < \lfloor n / E r_1 \rfloor\}$ the required estimate holds. Using (6.6) and (6.9), we obtain (6.10). Thanks to Cramér's theorem, the second term in (6.10) decays geometrically. Moreover, using that $\bar\lambda_n(t) = E r_1 N(nt)/n$ on $t_0, \ldots, t_L$ (and linearly interpolated in between), we can write the last term in (6.10) as $P(\|N(n\,\cdot)/n - \cdot\,/E r_1\|_\infty > \delta)$, which converges to 0 in view of the functional law of large numbers for renewal processes (see e.g. Theorem 5.10 of [11]).
For the first term in (6.10), we have (see [35, page 21]) an estimate with some $c > 0$, where the intuition behind the asymptotics above is that, given that the rare event takes place, the random walk $\bar W_n$ must have $j$ big jumps and one of them has to occur in the time interval $[1 - \epsilon, 1 + \epsilon]$. Since the choice of $\epsilon > 0$ was arbitrary, (6.4) is proved by letting $\epsilon \to 0$.
The remainder of this section is split into two parts that deal with Theorems 3.2 and 3.3.

Proof of Theorem 3.2
We consider the case where $B_1$ is nonnegative. Let us give the "roadmap" of the proof of Theorem 3.2.
• In Corollary 6.5 below we establish a sample-path large deviations result for the aggregated process $\bar X'_n$ (see (6.1) above) by considering a suitably defined random walk and utilizing Theorem 4.1 of [35]. For the M-convergence in Corollary 6.5 we need Lemma 6.4 below.
• In Proposition 6.6 we show the asymptotic equivalence between the aggregated process $\bar X'_n$ and the original process $\bar X_n$. Again, one technical lemma, Lemma 6.7 below, is needed.
• Part 1) of Theorem 3.2 follows by combining Corollary 6.5 with Proposition 6.6. Part 2) is a direct consequence of part 1).

Lemma 6.4. For all $j \ge 0$ and all $z \in \mathbb{R}$, the set $D^z_j$ is closed w.r.t. $(D, d_{M_1'})$.

Recall that $C^z_j$ was defined in (3.6) for $z \in \mathbb{R}$.

Corollary 6.5. Suppose that Assumptions 2.1 and 2.4 hold. Moreover, let $B_1 \ge 0$ and let $C_+$ as in Theorem 3.1 be strictly positive. For any $j \ge 0$, as $n \to \infty$,

Proposition 6.6. Suppose that Assumptions 2.1 and 2.4 hold. If $B_1 \ge 0$ and $C_+$ as in Theorem 3.1 is strictly positive, then $\bar X_n$ is asymptotically equivalent to $\bar X'_n$ w.r.t.
Proof of Lemma 6.4. We give the proof for the case where $z = 0$; the proof for $z \neq 0$ follows using similar arguments. The statement is trivial for $D_0 = \{0\}$; we focus on the case where $j \ge 1$. Let $\xi_n$, $n \ge 1$, be a sequence such that $\xi_n \in D_j$ for all $n \ge 1$, and $\lim_{n \to \infty} d_{M_1'}(\xi_n, \xi) = 0$ for some $\xi \in D$. Our goal is to prove that $\xi \in D_j$. Note that by Lemma 6.3 above, for every $t \in \mathrm{Disc}(\xi)^c \cup \{1\}$, $\lim_{n \to \infty} \xi_n(t) = \xi(t)$. (6.13) We first show that $\xi$ has at most $j$ discontinuity points. Assume that $|\mathrm{Disc}(\xi)| \ge j + 1$.
In view of (6.14) and (6.15), by choosing $\epsilon < \epsilon'$ we conclude that $\xi_N$ has at least $j + 1$ discontinuity points, which contradicts the fact that $|\mathrm{Disc}(\xi_N)| \le j$. Thus we conclude that $\xi$ is constant between any two neighbouring discontinuity points. Similarly one can show that $\xi(t^+) - \xi(t^-) > 0$ for every $t \in \mathrm{Disc}(\xi)$.
The following lemma is essential in the proof of Proposition 6.6. Recall that $\bar X'_n$ was defined in (6.1). Define (6.16).

Lemma 6.7. Suppose that Assumptions 2.1 and 2.4 hold. Moreover, let $B_1 \ge 0$ and let $C_+$ as in Theorem 3.1 be strictly positive. The following holds for any $\delta > 0$, $\gamma > 0$, and $j \ge 0$.
1. First, we have that
2. Moreover, we have that

Proof of Proposition 6.6. To begin with, for $\epsilon > 0$, define

Proof of Lemma 6.7. Part 1): We start by showing the first equivalence. Defining the relevant processes, it remains to consider the first term in (6.20). Using Corollary 6.5, we obtain (6.21) for some $c > 0$ independent of $\epsilon$. Part 1) is proved using (6.20) and (6.21), and letting $\epsilon \to 0$.
In view of (i) and (ii), we have the desired bound; hence, we prove (6.34). Next we show that $P(\bar X_{i,n} \in \cdot)$ satisfies (6.35). To see this, assume that the opposite holds. Set $s_0 = 0$ and define the path $\zeta$ accordingly. Due to the assumption, we have $\zeta \in D_{\ll k}$ and $d(\xi, \zeta) \le \delta$, and hence $d(\xi, D_{\ll k}) \le \delta$. This contradicts $d(\xi, D_{\ll k}) > \delta$. Thus, we have proved (6.35). Using the fact that $P(r_1 > n\delta/2)$ decays exponentially, we are able to restrict ourselves to the case where $r_1 \le n\delta/2$. Let $(t_0, \ldots, t_k)$ be as in the r.h.s. of (6.35). Using the fact that, under the $M_1'$ topology, jumps with the same sign "merge" into one jump in case they are "close", we conclude that $\mathrm{sign}(\xi(t_i))\,\mathrm{sign}(\xi(t_{i-1})) = -1$ for $i \in \{1, \ldots, k\}$. Combining this with the corresponding probability bound, this holds for any $\beta \in (0, 1)$. Now, it remains to show that $P(T_k(u^\beta) < r_1) = O(u^{-(k - \epsilon)\alpha})$ as $u \to \infty$. We prove this by induction on $k$. For the base case we need to show the corresponding estimate for $P(T_1(u^\beta) < r_1)$, where the tail estimate in (6.39) is obtained by following the arguments in the proof of Lemma 5.11 and taking advantage of the additional assumption that $E|B_1|^m < \infty$ for every $m \in \mathbb{Z}_+$. Plugging (6.39) into (6.38) and using the dominated convergence theorem, we obtain that $u^{(2\beta - \gamma)\alpha} P(T_2(u^\beta) < K^\gamma_\beta(u)) = o(1)$. (6.40) In view of (6.34), (6.37), and (6.40), the induction step proceeds similarly, where the tail estimate in (6.42) is obtained by following the arguments in the proof of Lemma 5.11 and taking advantage of the additional assumption that $E|B_1|^m < \infty$ for every $m \in \mathbb{Z}_+$. Combining (6.41) and (6.42) with the fact that $|X_{T(u^\beta)} / u^\beta| \le 1$, we obtain that $P(T_{k+1}(u^\beta) < K^\gamma_\beta(u))$, and hence $P(T_{k+1}(u^\beta) < r_1)$, are of order $O(u^{-(k+1-\epsilon)\alpha})$.

If we set $\tau = \inf\{n \ge 1 : Y_n \le d'\}$, Result 5.10 implies that there exists a positive constant $c_3$ such that $E[\tau^{\alpha + L} \mid Y_0 = x] \le c_3 x^{\lfloor \alpha - \epsilon \rfloor}$ (5.19) for all $x \ge d'$. We assume w.l.o.g. that $d' \ge d$. Note that $Y_n$ satisfies the same set of assumptions as $X_n$, and hence Lemma 2.6 applies to $Y_n$ as well, and $\tau$ is bounded by the regeneration time of $Y_n$. Therefore, we can choose a $t$ so that the required bound holds, where in the second-to-last inequality we used Hölder's inequality, and the finiteness follows from the fact that $P(\tau_d > n)$ decays exponentially in $n$, uniformly in $|X_0| \le d$, as established in Lemma 2.6.

Proof of Lemma 5.8. Recall that $\tau = \inf\{n \ge 1 : Y_n \le d\}$.
$(\cdots)^{1/(\alpha + L)} + O(1)$, as $x \to \infty$, where, by following the arguments as in the proof of Lemma 2.6, $t$ can be chosen such that $\sup_{y \in [0, d]} E[t^r \mid Y_0 = y] < \infty$. For this choice of $t$, we have that