Estimation of the activity of jumps in time-changed Lévy models

Abstract: In this paper we consider a class of time-changed Lévy processes that can be represented in the form $Y_s = X_{T(s)}$, where $X$ is a Lévy process and $T$ is a non-negative and non-decreasing stochastic process independent of $X$. The aim of this work is to infer the Blumenthal-Getoor index of the process $X$ from low-frequency observations of the time-changed Lévy process $Y$. We propose a consistent estimator for this index, derive the minimax rates of convergence and show that these rates cannot be improved in general. The performance of the estimator is illustrated by numerical examples.


Introduction
Nonparametric statistical inference for Lévy-type processes has been attracting the attention of researchers for many years, starting from the works of Rubin and Tucker (1959) and Basawa and Brockwell (2007). The popularity of Lévy models is based on their simplicity on the one hand and their ability to reproduce many specific properties of economic data on the other. In this article, we consider a class of processes known as time-changed Lévy processes. Let $X$ be a Lévy process and $T$ be a non-negative, non-decreasing stochastic process with $T(0) = 0$. Then the time-changed Lévy process is defined as $Y_s = X_{T(s)}$. The change of time can be motivated by the fact that some economic effects (e.g., nervousness of the market, which is indicated by volatility) can be better expressed in terms of "business" time, which may run faster than physical time in some periods (see Veraart and Winkel, 2010).
Theoretically, it is known that even when $X$ is a Brownian motion, the resulting class of time-changed processes is rather large and basically coincides with the class of all semimartingales (Monroe, 1978). Nevertheless, the practical application of this fact to financial modelling meets two major problems: first, the change of time $T$ can be highly intricate, for instance, if $Y$ has discontinuous trajectories (see Barndorff-Nielsen and Shiryaev, 2010); second, the dependence structure between $X$ and $T$ can also be quite sophisticated. In order to avoid these difficulties, we consider the whole class of Lévy processes for $X$ and assume that $X$ is independent of $T$.
Suppose now that a time-changed process $Y$ is observable on the equidistant time grid $0 < \Delta < \ldots < n\Delta$ with some $n \in \mathbb{N}$ and $\Delta > 0$. A natural question is which parameters of the underlying Lévy process $X$ can be identified from the observations $Y_0, Y_\Delta, \ldots, Y_{n\Delta}$ as $n \to \infty$. This question has recently been addressed in the literature, and the answer turns out to depend crucially on the asymptotic behaviour of $\Delta$ and on the degree of our knowledge about $T$. In the case of high-frequency data with increasing time horizon, i.e., $\Delta_n \to 0$ with $n \Delta_n \to \infty$ (also known as rapidly increasing design), one basically can, under some regularity conditions, identify $X$ completely, provided $E[T]$ is known (see Figueroa-López, 2009). If the time horizon remains fixed, only the diffusion part of $X$ and the behaviour of the Lévy measure of $X$ at 0 can be identified (see Aït-Sahalia and Jacod, 2009 and 2012). The latter behaviour can be characterised in terms of the so-called Blumenthal-Getoor index or successive Blumenthal-Getoor indexes. The Blumenthal-Getoor index is a characteristic of the activity of small jumps and, for a one-dimensional Lévy process $Z = (Z_t)_{t \ge 0}$ with a Lévy measure $\nu$, can be defined via
$$\mathrm{BG}(Z) = \inf\Big\{ r > 0 : \int_{|x| \le 1} |x|^r \, \nu(dx) < \infty \Big\}.$$
The Blumenthal-Getoor index, its practical importance and its theoretical properties have recently received much attention in the literature. For instance, Mijatovic and Tankov (2012) studied the impact of this index on the asymptotic behavior of the implied volatility. Rosenbaum and Tankov (2012) showed how the activity of small jumps influences the optimal discretization strategies for option pricing.
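As a quick illustration of the index, here is a worked example that is not part of the original text. It assumes a symmetric stable-like Lévy measure with density $c|x|^{-1-\beta}$, where the symbols $c > 0$ and $\beta \in (0, 2)$ are introduced only for this illustration, and it uses the standard definition of the Blumenthal-Getoor index as the infimum of those $r > 0$ for which $\int_{|x| \le 1} |x|^r \nu(dx)$ is finite:

```latex
\nu(dx) = \frac{c}{|x|^{1+\beta}}\,dx, \qquad
\int_{|x|\le 1} |x|^{r}\,\nu(dx) = 2c\int_0^1 x^{r-1-\beta}\,dx =
\begin{cases}
  \dfrac{2c}{r-\beta}, & r > \beta,\\[6pt]
  +\infty, & r \le \beta,
\end{cases}
\qquad\Longrightarrow\qquad \mathrm{BG}(Z) = \beta .
```

A larger index therefore means more intense small-jump activity. Multiplying the density by a tempering factor such as $e^{-\lambda |x|}$ only affects large jumps (the factor stays between $e^{-\lambda}$ and 1 on $|x| \le 1$), so the index is unchanged.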
An important remark is that the statistical analysis of time-changed Lévy models is much more difficult than that of Lévy models. This is because the increments of $Y$ are no longer independent and neither the process $X$ nor $T$ is directly observable (see Belomestny, 2011).
This paper is devoted to the case of low-frequency data, i.e., the case when $\Delta$ is fixed and $n \to \infty$. For this case, Belomestny (2011) has proved that one cannot in general identify the Lévy measure of $X$ from low-frequency observations of $Y$. However, the question remains open whether the behaviour of $\nu$ at 0, expressed in terms of the BG index, can be recovered. It follows from the results of this work that consistent estimation of the BG index of $X$ is basically possible, provided the Lévy process $X$ is independent of $T$ and has a nonzero diffusion part.
The approach presented in this paper was introduced by Belomestny and Panov (2013) in the context of affine stochastic volatility (ASV) models. Nevertheless, the only similarity between this paper and Belomestny and Panov (2013) is the main idea of using the asymptotic behaviour of the characteristic function of the increments of the price process to infer the Blumenthal-Getoor (BG) index. Indeed, the results are quite different: while the simultaneous estimation of the jump activities for the Lévy process driving the volatility and the state process in ASV models is not possible, for time-changed Lévy processes we are able to consistently estimate the BG indexes of the processes $X$ and $T$, provided $X$ has a non-zero diffusion part. Furthermore, there are substantial methodological differences between Belomestny and Panov (2013) and the current paper. For example, the analysis in Belomestny and Panov (2013) relies heavily on properties of affine processes that cannot be used in the current framework.
The paper is organised as follows. In the next section, we present the main setup and give basic definitions and examples. Next, we introduce the main object of our study, the time-changed Lévy processes, and formulate the main assumptions. Section 3 contains the so-called Abelian theorem describing the asymptotic behaviour of the characteristic function of the increments $Y_{t+\Delta} - Y_t$ for fixed $\Delta > 0$. An estimation algorithm for the Blumenthal-Getoor index of $X$ is presented in Section 4, together with theoretical results showing the consistency of the proposed estimator and the corresponding rates of convergence. Some numerical examples can be found in Section 5. All proofs are collected in Section 6.

Lévy process X
Throughout this paper we assume that the process $(X_t)$ is a one-dimensional Lévy process on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$. In particular, this means that the characteristic function of $X$ has the form
$$E\big[e^{iuX_t}\big] = e^{t\psi(u)},$$
where the function $\psi(u)$ is the so-called characteristic exponent of $X$. The Lévy-Khintchine formula yields where $\mu \in \mathbb{R}$, $\sigma$ is a non-negative number and $\nu$ is a Lévy measure on $\mathbb{R} \setminus \{0\}$, which satisfies
$$\int_{\mathbb{R} \setminus \{0\}} \big(|x|^2 \wedge 1\big) \, \nu(dx) < \infty.$$
A triplet $(\mu, \sigma^2, \nu)$ is called the characteristic triplet of the Lévy process $X_t$. We need the following assumptions.
(AL) $\sigma$ is strictly positive and the function $V(u)$ has the following representation: where $\lambda_1 > 0$, $\gamma \in (0, 2)$, and for $u$ large enough The assumption (AL) is, for example, fulfilled if there exist with some $C > 0$. Here and in the sequel the notation $f(u) \asymp g(u)$ as $u \to \infty$ (with some functions $f$ and $g$ such that $g(u) \ne 0$, $\forall u \in \mathbb{R}$) means that $\lim_{u \to \infty} f(u)/g(u) = 1$.

Time change
Let $T = (T(s))_{s \ge 0}$ be an increasing right-continuous process with left limits such that $T(0) = 0$ and, for each fixed $s$, the random variable $T(s)$ is a stopping time with respect to the filtration $\mathcal{F}$. In this paper, it is also assumed that
(AT1) the process $T$ is independent of $X$;
(AT2) the sequence $T_k = T(\Delta k) - T(\Delta(k-1))$, $k \in \mathbb{N}$, is strictly stationary and $\alpha$-mixing with the mixing coefficients $(\alpha_T(j))_{j \in \mathbb{N}}$ satisfying for some positive constants $\bar\alpha_0$ and $\bar\alpha_1$ (the term "$\alpha$-mixing sequence" means that $\alpha_T(j) \to 0$ as $j \to +\infty$);
(AT3) the Laplace transform of $T(\Delta)$ has the following asymptotic behavior: with $\lambda_2 > 0$, $A > 0$, $\alpha \in (0, 1]$, and $\Psi_2(u)$ such that for $u$ large enough with some $\chi_2, \vartheta_2 \ge 0$.
Note that in the case where $T$ is an increasing Lévy process, i.e., a subordinator, the parameter $\alpha$ coincides with the Blumenthal-Getoor index of $T$.
Remark 2.1. It is easy to see that the most restrictive assumption is (AT1). This assumption is made for identifiability reasons. The corresponding class of processes remains rather large, but its full characterisation is still an open problem (see Barndorff-Nielsen and Shiryaev, 2010).

Examples
Tempered stable subordinator. The tempered stable distribution with parameters $(a, b, \alpha)$ can be defined via its Laplace transform: where $a > 0$, $b \ge 0$, $\alpha \in (0, 1)$, see Schoutens (2003). The tempered stable subordinator is a process $Z_t$ whose increments $Z_{t+s} - Z_t$ follow a tempered stable law with parameters $(sa, b, \alpha)$. The Lévy measure of this process is of the form: where $\lambda = b^{1/\alpha}/2$. Here $\lambda$ controls the decay rate of big jumps, $c = -a 2^\alpha / \Gamma(-\alpha)$ alters the intensity of all jumps simultaneously, and $\alpha$ is the Blumenthal-Getoor index of the process (see Cont and Tankov, 2004). The tempered stable subordinator satisfies (AT3) with $\chi_2 = 1$, $\lambda_2 = 2^\alpha a \Delta$, $\vartheta_2 = b^{1/\alpha} \alpha / 2$ and $A = \exp\{ab\Delta\}$.
Integrated CIR process. Another candidate for the time change process is given by the integrated Cox-Ingersoll-Ross (CIR) process. The CIR process is defined as a solution of the following SDE: where $a$, $b$ and $\zeta$ are positive numbers, and $W_t$ is a Wiener process. If $Z_0$ is sampled from the stationary invariant distribution $\pi$ and $2a \ge \zeta^2$, then $Z_t$ is strictly stationary and ergodic. The time change process $T(s)$ is then defined as $T(s) = \int_0^s Z_t \, dt$. The Laplace transform of $T(\Delta)$ under $\pi$ is given by where $\Lambda(u) = \sqrt{b^2 + 2\zeta^2 u}$, see Chapter 15.1.2 from Cont and Tankov (2004).
Since the stationary distribution of the CIR process is the Gamma distribution with parameters $2a/\zeta^2$ and $2b/\zeta^2$, the Laplace transform of $Z_0$ under $\pi$ has a form and therefore where From the definitions of the hyperbolic functions, it directly follows that where Substituting (2.8) and (2.9) into (2.7), we get the following asymptotics for the function $\mathcal{L}^{iCIR}_\Delta(u)$ as $u \to \infty$: with From (2.10) it follows that the assumption (AT3) is fulfilled with $\alpha = 1/2$ and Note that for $u$ large enough, and therefore $\chi_2 = 1/2 - \epsilon$ for arbitrarily small $\epsilon < 1/2$.
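The integrated CIR time change can be simulated with a simple Euler scheme. The sketch below is only an illustration under stated assumptions: the SDE is not recoverable from the text here, so the code assumes the common parametrisation $dZ_t = (a - bZ_t)\,dt + \zeta\sqrt{Z_t}\,dW_t$, whose stationary law is Gamma with shape $2a/\zeta^2$ and rate $2b/\zeta^2$. The function names, the full-truncation fix for negative values and the step count are hypothetical choices, not the authors' procedure.

```python
import math
import random


def simulate_integrated_cir(a, b, zeta, delta=1.0, n_steps=100, rng=random):
    """Return one draw of T(delta) = int_0^delta Z_t dt, started from stationarity.

    Assumed dynamics (an assumption, not taken from the text):
        dZ_t = (a - b * Z_t) dt + zeta * sqrt(Z_t) dW_t.
    """
    # Stationary law: Gamma with shape 2a/zeta^2 and rate 2b/zeta^2.
    # random.gammavariate takes (shape, scale), hence scale = zeta^2 / (2b).
    z = rng.gammavariate(2.0 * a / zeta**2, zeta**2 / (2.0 * b))
    dt = delta / n_steps
    integral = 0.0
    for _ in range(n_steps):
        z_pos = max(z, 0.0)      # full truncation keeps the square root real
        integral += z_pos * dt   # left-point rule for the time integral
        noise = rng.gauss(0.0, 1.0)
        z += (a - b * z_pos) * dt + zeta * math.sqrt(z_pos * dt) * noise
    return integral


def mean_time_change(a, b, zeta, n_paths=2000, seed=0):
    """Monte Carlo mean of T(1); under stationarity it should be close to a / b."""
    rng = random.Random(seed)
    draws = [simulate_integrated_cir(a, b, zeta, rng=rng) for _ in range(n_paths)]
    return sum(draws) / n_paths
```

Under the assumed dynamics the stationary mean of $Z_t$ is $a/b$, so `mean_time_change(a, b, zeta)` offers a cheap sanity check of the simulation: the average of $T(1)$ should be close to $a/b$.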

The characteristic function of Y ∆
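Denote the characteristic function of $Y_\Delta$ by $\phi_\Delta(u)$, then
$$\phi_\Delta(u) = E\big[e^{iuY_\Delta}\big] = E\Big[ E\big[ e^{iuX_{T(\Delta)}} \,\big|\, T(\Delta) \big] \Big].$$
Since the inside (conditional) expectation is equal to $\exp\{T(\Delta)\psi(u)\}$, we get
$$\phi_\Delta(u) = E\big[e^{T(\Delta)\psi(u)}\big] = \mathcal{L}_\Delta(-\psi(u)),$$
where $\mathcal{L}_\Delta(w) = E[e^{-wT(\Delta)}]$ denotes the Laplace transform of $T(\Delta)$. The first objective of this paper is to describe the asymptotic behavior of $\phi_\Delta(u)$.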
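The conditioning argument above says that the characteristic function of $Y_\Delta$ equals the Laplace transform of $T(\Delta)$ evaluated at $-\psi(u)$. A minimal numerical check of this identity is sketched below for a toy model chosen only because both sides are available in closed form: $X$ is a Brownian motion with volatility `sigma` and $T(\Delta)$ is Gamma distributed. This toy time change and all function names are assumptions made for illustration; they are not among the paper's examples.

```python
import cmath
import math
import random


def psi_brownian(u, sigma):
    """Characteristic exponent of sigma * W_t (illustrative choice of X)."""
    return -0.5 * sigma**2 * u**2


def laplace_gamma(w, shape, scale):
    """Laplace transform E[exp(-w T)] of a Gamma(shape, scale) variable, w >= 0."""
    return (1.0 + scale * w) ** (-shape)


def cf_time_changed(u, sigma, shape, scale):
    """phi(u) = L(-psi(u)) for the Brownian / Gamma toy model."""
    return laplace_gamma(-psi_brownian(u, sigma), shape, scale)


def cf_monte_carlo(u, sigma, shape, scale, n=40000, seed=0):
    """Estimate E[exp(i u X_T)] by sampling T first and then X given T."""
    rng = random.Random(seed)
    total = 0.0 + 0.0j
    for _ in range(n):
        t = rng.gammavariate(shape, scale)               # draw the random time
        y = sigma * math.sqrt(t) * rng.gauss(0.0, 1.0)   # X_t ~ N(0, sigma^2 t)
        total += cmath.exp(1j * u * y)
    return total / n
```

Comparing `cf_monte_carlo(u, ...)` with `cf_time_changed(u, ...)` for a few values of `u` should show agreement up to Monte Carlo error, which illustrates why independence of $X$ and $T$ makes $\phi_\Delta$ tractable.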
Theorem 3.1. Consider the process $Y_s := X_{T(s)}$, where the processes $X$ and $T$ satisfy the conditions (AL), (AT1)-(AT3) with and where $K > 0$ does not depend on the parameters of the processes $X$ and $T$.
Remark 3.2. The examples of Section 2.3 show that the condition $\chi_2 > 1 - \gamma/2$ is not restrictive. For example, if the tempered stable distribution is used as a time change, then this condition holds for any Lévy process $X$ since $\chi_2 = 1$ in this case.
Remark 3.3. In the sequel we will use the notation
Remark 3.4. In this remark, we draw attention to the special case $\alpha = 1$, which holds for instance if $T$ is deterministic. In this case, the statement of Theorem 3.1 can be checked directly. Since $\lambda_2 = T(\Delta)$, we get
Remark 3.5. Theorem 3.1 can be viewed as an Abelian theorem for time-changed Lévy processes. For a detailed discussion of the term "Abelian theorem" and the closely related term "Tauberian theorem", we refer to the book by Korevaar (2004). Perhaps the most famous result of this type is the Karamata Tauberian theorem (see Bingham et al., 1987). For Lévy processes, the corresponding fact was proven by Bismut (1983), and this yields the following corollary of Theorem 3.1.
Corollary 3.6. If the process $T$ is a subordinator with $\alpha < 1$, and the assumptions of Theorem 3.1 are fulfilled, then the Blumenthal-Getoor index of the process $Y$ is equal to $2\alpha$.
Note that $R_1(u) \to 1$ as $u \to +\infty$ due to Remark 3.3. The representation (4.1) tells us that $\mathcal{Y}_1(u)$ is, up to a remainder term $\log(R_1(u))$, a linear function of $\log|u|$ with the slope $2\alpha + \gamma - 2$. If the parameter $\alpha$ is assumed to be known, then one can view the estimation of $\gamma$ as a linear regression problem (at least for large $u$) and apply the (weighted) least-squares approach. Otherwise, if $\alpha$ is unknown, one should first estimate $\alpha$. This can also be done by the method of (weighted) least squares. Indeed, define where $R_2(u) \to 1$ as $u \to +\infty$. So, $\mathcal{Y}_2(u)$ is (at least for large $u$) a linear function of $\log|u|$ with the slope proportional to $\alpha$. If $A \ne 1$, then one can first apply the transformation: and then work with the transformed function $\tilde\phi_\Delta(u)$ instead of $\phi_\Delta(u)$. The above discussion shows that one can consistently estimate the parameters $\alpha$ and $\gamma$, provided a consistent estimate for the c.f. of $Y_\Delta$ is available.
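The regression idea can be illustrated on a toy model in which the log-log relation is exactly linear. The sketch below assumes a Brownian motion $X$ with volatility `sigma` and a stable subordinator with Laplace transform $\exp(-\lambda w^{\alpha})$, so that $|\phi_\Delta(u)| = \exp\{-\lambda (\sigma^2 u^2/2)^{\alpha}\}$ (that is, $A = 1$ and no remainder terms). In this toy case $\log(-\log|\phi_\Delta(u)|)$ has slope $2\alpha$ in $\log u$. The function names and the uniform default weights are illustrative assumptions; the paper's weighting function and normalisation are not reproduced here.

```python
import math


def cf_abs_toy(u, alpha, lam, sigma):
    """|phi(u)| for Brownian X time-changed by a stable subordinator
    with Laplace transform exp(-lam * w**alpha) (toy model, A = 1)."""
    return math.exp(-lam * (0.5 * sigma**2 * u**2) ** alpha)


def ls_slope(xs, ys, weights=None):
    """Weighted least-squares slope of ys on xs (intercept included)."""
    if weights is None:
        weights = [1.0] * len(xs)
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    return sxy / sxx


def estimate_alpha(cf_abs, u_grid):
    """Half the slope of log(-log|phi(u)|) regressed on log(u)."""
    xs = [math.log(u) for u in u_grid]
    ys = [math.log(-math.log(cf_abs(u))) for u in u_grid]
    return 0.5 * ls_slope(xs, ys)
```

In practice `cf_abs` would be replaced by the modulus of an estimated characteristic function, and the grid would be restricted to large frequencies where the remainder terms are small; the toy model only shows the mechanics of the slope estimate.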

Estimation of the characteristic function
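Suppose that the discrete observations $Y_0, Y_\Delta, \ldots, Y_{n\Delta}$ of the state process $Y$ are available for some fixed $\Delta > 0$. We estimate $\phi_\Delta(u)$ by its empirical counterpart $\phi^\Delta_n(u)$ defined as
$$\phi^\Delta_n(u) = \frac{1}{n} \sum_{k=1}^{n} e^{iu(Y_{\Delta k} - Y_{\Delta(k-1)})}.$$
Note that due to the assumption (AT2), the process $Y$ is ergodic and $\alpha$-mixing with $E\big[e^{iu(Y_{s+\Delta} - Y_s)}\big] = \phi_\Delta(u)$ for any $s \ge 0$ (see Appendix D.1). Hence by virtue of the Birkhoff ergodic theorem (see Athreya and Lahiri, 2010), $\phi^\Delta_n(u) \to \phi_\Delta(u)$ as $n \to \infty$ almost surely and in $L^1$.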
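The empirical characteristic function of the increments is straightforward to compute. The following is a minimal sketch, assuming the observations are given as a plain list of numbers sampled on the equidistant grid; the function name `empirical_cf` is hypothetical.

```python
import cmath


def empirical_cf(observations, u):
    """Average of exp(i*u*increment) over consecutive increments
    Y_{k*Delta} - Y_{(k-1)*Delta} of the observed path."""
    increments = [b - a for a, b in zip(observations[:-1], observations[1:])]
    if not increments:
        raise ValueError("need at least two observations")
    return sum(cmath.exp(1j * u * d) for d in increments) / len(increments)
```

The result is complex-valued; the regression steps described in this section only need its modulus, `abs(empirical_cf(obs, u))`, evaluated on a grid of frequencies `u`.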

The case of known α
Introduce a weighting function $w^{V_n}$, where $V_n$ is a sequence of positive numbers tending to infinity, and $w^1$ is an almost everywhere smooth function supported on $[\varepsilon, 1]$ for some $\varepsilon > 0$ that satisfies Some examples of such weighting functions can be found in Panov (2012). If $2 - \gamma < 2\alpha$, we define an estimator of $\gamma$ by In our theoretical study we mainly focus on the first case (a couple of remarks about the second case can be found in Section A). The estimate $\hat\gamma_n(\alpha)$ can be represented as $\hat\gamma_n(\alpha) = 2(1 - \alpha) + m_{n,2}$, where $m_{n,2}$ is a solution of the following optimization problem: where $w^{V_n}(u)$ is an almost everywhere smooth positive function on $\mathbb{R}$ that can be represented in the form $w^{V_n}(u) = w^1(u/V_n)/V_n$ with some function $w^1$ supported on the interval $[\varepsilon, 1]$. The proof of the fact that (4.4) and (4.6) are equivalent follows the same lines as the proof of Lemma A.4 in Belomestny and Panov (2013). In particular, the weighting functions $w^1$ appearing in (4.4) and in (4.6) are related as follows: Next, we introduce the deterministic quantity The next lemma shows that this deterministic quantity is close to $\gamma$.
Lemma 4.1. In the setup of Theorem 3.1, it holds for $n$ large enough, where $\tilde\chi_1$ was introduced in Remark 3.3, and $C$ is some positive constant. More precisely, for $n$ large enough, where $C^{(1)} > 0$ does not depend on the parameters of $Y$.
In the sequel we shall use the notation which means the existence of some $C > 0$ such that (4.8) is fulfilled. The next lemma shows that the estimate $\hat\gamma_n(\alpha)$ converges in probability to the deterministic quantity introduced above.
2 ) and consider a class of time-changed Lévy models A = A (P) such that .
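Then there exists $\Xi > 0$ such that for any where $\tilde\chi_{1\bullet} := \min\{\chi_{1\bullet}, 2\chi_{2\bullet} + \gamma_\bullet - 2\}$, the supremum is taken over the set of all models from $\mathcal{A}(\mathcal{P})$, and the constants $\kappa$ and $\delta$ do not depend on $\mathcal{P}$.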
The next theorem states that the rates obtained in Theorem 4.3 are optimal.
Theorem 4.4 (lower bound). For any $\Xi_2 < \Xi$, it holds where the infimum is taken over all possible estimates of the parameter $\gamma$, and the supremum over the set of all models from $\mathcal{A}$.
where $U_n$ is a sequence of positive numbers tending to infinity, and a weighting function $w^{U_n}_\alpha$ satisfies the same properties as the function $w^{V_n}$, see (4.3). This estimate can be alternatively defined as a solution $\hat\alpha_n = l_{n,1}$ of the following optimization problem: where $w^{U_n}_\alpha(u)$ is an almost everywhere smooth positive function on $\mathbb{R}$ having the representation: with some function $w^1_\alpha$ supported on the interval $[\varepsilon, 1]$. The upper bound for the estimate $\hat\alpha_n$ is given in the next theorem.
Theorem 4.5 (upper bound for $\hat\alpha_n$). Take the sequence Then there exists a positive constant $\Xi_3$ such that where the supremum is taken over the set of all models from $\mathcal{A}$, and the constants $\kappa$ and $\delta$ are defined in Theorem 4.3.
Estimation of $\gamma$ in the case of unknown $\alpha$. After estimating $\alpha$ using the sequence $U_n$, one can define an estimate of $\gamma$ via $\hat\gamma_n(\hat\alpha_n)$. The next theorem shows that the upper bound for the estimate $\hat\gamma_n(\hat\alpha_n)$ is the same as in the case of known $\alpha$ as long as $\tilde\chi_{1\bullet} \le 2(2 - \gamma)$. The latter inequality holds true if, e.g., $\gamma \le 4/3$.
Theorem 4.6 (upper bound for $\hat\gamma_n(\hat\alpha_n)$). Take the sequences where the supremum is taken over the set of all models from $\mathcal{A}$, the constants $\kappa$ and $\delta$ do not depend on the parameters of $Y$, and $\Xi_4$ depends on $\mathcal{P}$ only. In particular, if we additionally assume that $\tilde\chi_{1\bullet} \le 2(2 - \gamma)$, then

Numerical examples
In our numerical study we consider the following time-changed Lévy model. Let $Y_s$ be in the form where
• $W_t$ is a Brownian motion and the parameter $\sigma = 0.25/\sqrt{252} \approx 0.016$, which corresponds to an annual volatility of 0.25,
• $G_t$ is a Lévy process which will be specified later,
• $T(s)$ is the integrated CIR process with parameters $a = 1.763$, $b = 1.763$, $\zeta = 0.563$, see Section 2.3 for notation; note that the assumptions (AT2) and (AT3) hold for this $T(s)$ with $\alpha = 1/2$, if $Z_0$ has the invariant distribution.
Note that the values of the parameters for $T(s)$ are taken from Figueroa-López (2012). As for the process $G_t$, we focus on the following two cases:
1. $G_t$ is a stable process with index $\gamma = 1.2$. In this case, the characteristic exponent of $X_t$ is equal to where we take $\delta_1 = 0.25$ and $\delta_2 = 0.3$. Note that the assumption (AL) is fulfilled with $\lambda_1 = \delta_1$.
2. $G_t$ is a Normal inverse Gaussian (NIG) process (i.e., $\gamma = 1$). This process is defined by where $\tilde W$ is a Brownian motion independent of $W$, $U$ is an inverse Gaussian subordinator independent of $\tilde W$ such that $E[U_t] = 0.21\,t$, and $\sigma_G = 0.5$, $\theta_G = -0.8$. Note that these parameter values are used in Figueroa-López (2012). In this case, the assumption (AL) is guaranteed by the fact that the characteristic exponent of $X$ can be represented in the form with some $\chi_1 \in \mathbb{R}_+$, $\chi_2 \in \mathbb{R}_+$, $\chi_3 \in \mathbb{R}$, $\chi_4 \in \mathbb{R}$, see Belomestny (2011).
For both choices of the process $G_t$, we estimate the parameters $\alpha$ and $\gamma$ by the following procedure. First, we generate a trajectory $Y_0, Y_\Delta, \ldots, Y_{n\Delta}$ with $\Delta = 1$. Next, we estimate the characteristic function $\phi_\Delta(u)$ by $\phi^\Delta_n(u)$, see (4.2), and consider the optimization problem (4.15): where $U_{low}$ and $U_{up}$ are the truncation levels. The solution $l_{n,1}$ of this problem gives an estimate of $\alpha$, which we denote by $\hat\alpha_n$. Figures 1 and 2 show the boxplots of $\hat\alpha_n$ as a function of $n$ based on 100 simulation runs for the first and the second choice of the process $G_t$, respectively.
Next, we proceed to the estimation of $\gamma$ by considering the optimization problem (4.6) with $\alpha = \hat\alpha_n$: where $V_{low}$ and $V_{up}$ are the truncation levels, and $\theta = 2$. The estimate of $\gamma$ is then defined as $\hat\gamma_n(\hat\alpha_n)$. To illustrate Theorem 4.6, we also compute the estimates $\hat\gamma_n(\alpha)$, which are the solutions of the optimization problem (5.5) with the true value $\alpha = 1/2$ instead of its estimate $\hat\alpha_n$. The boxplots given in Figures 3 and 4 indicate that the quality of the estimates $\hat\gamma_n(\alpha)$ and $\hat\gamma_n(\hat\alpha_n)$ is quite similar. Note that the condition $2 - \gamma < 2\alpha$ is not fulfilled for the model with the NIG process, where $2 - \gamma = 2\alpha = 1$ (and therefore the theory presented in this paper formally cannot be applied to this situation), but the use of the estimator (5.5) still makes sense.

Proof of Theorem 3.1
First note that by (3.1), i.e., $|\phi(u)|$ is asymptotically equivalent to the Laplace transform of $T(\Delta)$ computed at the point $-\mathrm{Re}(\psi(u))$. Substituting (2.6) into (6.1), we get Using now Assumption (AT3), we conclude that where $|S(u)| \le \vartheta_2 u^{-\chi_2}$. Recall that by Assumption (AL), Therefore for $u$ large enough, where $K_1 > 0$ does not depend on the parameters of the processes $X$ and $T$. Substituting (6.3) and (6.4) into (6.2), we get where yields the required upper bound for $r(u)$ and completes the proof.
with $C > 0$. The statement of the lemma follows with $C^{(1)} := 2$
The proof of this theorem follows the same lines as the proof of its analogue for the case of affine stochastic volatility models (Belomestny and Panov, 2013).
We begin the proof with the following lemma.
Lemma 6.1. Suppose that Then there exist positive constants $B_1$, $\kappa$ and $\delta$ such that for any $n > 1$ where
Proof. We divide the proof of the lemma into several steps. with
2. Lemma 7.5 from Belomestny and Panov (2013) shows that the event has probability tending to 1 as $n$ tends to infinity. More precisely, it holds for some positive constants $B_2$, $\kappa$ and $\delta$.
3. For any $u \in [\varepsilon V_n, V_n]$, the Taylor expansion for the function $f(x) = \log(-\log(x))$ in the vicinity of the point $x = G(u)$ yields with where $I_n(u)$ stands for the interval between $G(u)$ and $G_n(u)$. Due to Theorem 3.1, where . Thus, $I_n(u) \subset (0, 1)$ on $\mathcal{W}_n$ for $n$ large enough and the maximum on the right-hand side of the inequality in (6.12) is attained at one of the endpoints of the interval Belomestny and Panov (2013) shows that there exists a positive constant $B_3$ such that for any $u \in [\varepsilon V_n, V_n]$ and for $n$ large enough
5. The Taylor expansion (6.11) and the previous discussion yield on the set By (6.9), the expression in the brackets is equal to Taking into account that $\xi_{2,n} < 1$ on the set $\mathcal{W}_n$ by (6.10), we conclude that P can be upper bounded on $\mathcal{W}_n$ as follows (all suprema are taken over $[\varepsilon V_n, V_n]$): This completes the proof of the lemma.
Next, we proceed with the proof of Lemma 4.2.First, we get a lower bound for the infimum of the function |φ Applying Lemma 6.1 and taking into account that by (6.13), | log(G(u))| u 2α+(γ−2) , we arrive at the desired result.
From here it follows that the estimate γn (α) is of logarithmic order, i.e., where Note that the constant C can be uniformly bounded on the set of models from A .This completes the proof.

Lower bounds
Proof of Theorem 4.4. The aim of this proof is to show that where $\Xi_2$ is any positive constant smaller than $\Xi$, the infimum is taken over all possible estimates of the parameter $\gamma$, and the supremum over all models from the class $\mathcal{A}$, which includes the models from $\mathcal{A}$ constructed with subordinators $T$. The main ingredient of the proof is the following lemma, which follows from Tsybakov (2009), Section 2.2 and Theorem 2.2.
Lemma 6.2. Let $\mathcal{P} = \{P_\gamma\}$ be a (nonparametric) family of models in $\mathbb{R}^m$. Assume that there exist two values of the parameter $\gamma$, say $\gamma_1$ and $\gamma_2$, such that $|\gamma_1 - \gamma_2| > 2\psi_n$ and moreover the corresponding measures $P_1 := P_{\gamma_1}$ and $P_2 := P_{\gamma_2}$ satisfy the following properties:
1. there exists a measure $\mu$ such that $P_1 \ll \mu$ and $P_2 \ll \mu$;
2. the $\chi^2$-distance between $P_1^{\otimes n}$ and $P_2^{\otimes n}$ is bounded by some constant $\eta \in \mathbb{R}_+$, where the $\chi^2$-distance is defined for any two measures $P$ and $Q$ as
Then (6.17) holds.
Proof. 1. Presentation of the models. Let us fix some set of parameters $\mathcal{P}$. Our aim is to construct two time-changed models from $\mathcal{A} = \mathcal{A}(\mathcal{P})$. Let $T(s)$ follow the tempered stable process with $b = 0$ in both models. The choice of parameters and any $\vartheta_2$, $\gamma_2$. For the Lévy process $X$ in the first model, we take the sum of two independent processes: the Brownian motion $W$ and a $\gamma$-stable process $\tilde X$ with $\gamma$ between $\gamma_\bullet$ and 2 and , so that the characteristic exponent of $X$ is equal to $\psi(u) = -u^2/2 - \lambda_1 |u|^\gamma$. Note that the condition (AL) holds for any values of $\vartheta_1$ and $\chi_1$. According to (6.1), the characteristic function of the increments in the first model has the following asymptotics: We define the Lévy process $X$ for the second model by its characteristic exponent The characteristic function of the second time-changed Lévy process is given by $\tilde\phi_\Delta$ Note that both time-changed Lévy processes have absolutely continuous distributions. Denoting the corresponding densities at the time moment $\Delta$ by $p_\Delta(x)$ and $\tilde p_\Delta(x)$, we can express the $\chi^2$-divergence between $p_\Delta(x)$ and $\tilde p_\Delta(x)$ in the following way: since p ∆ (x)p ∆ (x) > 0, $\forall x \in \mathbb{R}$.
2. Lower bound for $p_\Delta(x)$. The density function $p_\Delta(x)$ can be written as where $\pi_\Delta(t)$ is the density function of the tempered stable process at the time moment $t$, and $q_t(x)$ is the density function of the sum of $\tilde X_t$ and $W_t$. Since $q_t(x)$ is a convolution of two density functions, and the (strictly) $\gamma$-stable process $\tilde X_t$ possesses the property $\tilde X_t \stackrel{d}{=} t^{1/\gamma} \tilde X_1$ (see Cont and Tankov, 2004), we conclude that where $p_{st}$ is the density of $\tilde X_1$; the last inequality follows from Zolotarev (1986). Fixing some $0 < d_1 < d_2 < 1/2$, we arrive at γ+1) .
Returning now to (6.20) and taking into account that $p_\Delta(x)$ is bounded on any set of the form $\{|x| \le C\}$ by some constant $D$, we get with $C$ large enough,
3. Upper bound for $I_1$. By the Parseval-Plancherel theorem (see, e.g., Ushakov, 1999), where Therefore, The asymptotic bound for the last integral can be obtained using the change of variables and integration by parts. Denote (6.21) This in turn leads to the following upper bound for the integral $I_1$:
4. Upper bound for $I_2$. Note that where $\hat g$ stands for the Fourier transform of a function $g$. Making use of the property x 2 g(x) = ∂ 2 g(x)/∂x 2 , we conclude by the arguments similar to (6.21), for any $n > 0$.
5. Choice of $M$. Finally, we get where The aim now is to choose the parameter $M$ in such a way that the conditions (6.18) and (6.19) are fulfilled simultaneously. Take $M = (A_n/\mu_1)^{1/(2\alpha)}$. Substituting this $M$ into (6.18) results in the following condition on $A_n$: , $\beta > (\gamma - \psi_n + 3)/(2\alpha)$, satisfies (6.18) and (6.19). This completes the proof.

Estimation of α
The empirical counterpart of the estimate αn is given by ᾱn := This completes the proof.
The next lemma is an analogue of Lemma 4.2 for the estimate $\hat\alpha_n$.
(q log n) with κ 1 = 1/2−qτ (1) and some κ 2 . Therefore, choosing q < q • = min A 1/(2τ (1) we get on $\mathcal{W}_n$
2. Upper bounds for $\Upsilon_1$ and $\Upsilon_2$. Note that for any $u > 1$ and any $\beta \in \mathbb{R}$, Moreover, for $\beta$ tending to zero, where $C > 0$. In particular, this implies that for any small $\upsilon > 0$ and $n$ large enough,

Fig 1. Boxplots of the estimate $\hat\alpha_n$ for different values of $n$ based on 100 simulation runs for the model with stable process $G_t$.

Fig 4. Boxplots of the estimates $\hat\gamma_n(\alpha)$ and $\hat\gamma_n(\hat\alpha_n)$ for different values of $n$ based on 100 simulation runs for the model with Normal inverse Gaussian process $G_t$.