Spectral estimation for non-linear long range dependent discrete time trawl processes

Discrete time trawl processes constitute a large class of time series parameterized by a trawl sequence $(a_j)_{j\in\mathbb{N}}$ and defined through a sequence of independent and identically distributed (i.i.d.) copies of a continuous time process $(\gamma(t))_{t\in\mathbb{R}}$ called the seed process. They provide a general framework for modeling linear or non-linear long range dependent time series. We investigate the spectral estimation, either pointwise or broadband, of long range dependent discrete-time trawl processes. The difficulty arising from the variety of seed processes and of trawl sequences is twofold. First, the spectral density may take different forms, often including smooth additive correction terms. Second, trawl processes with similar spectral densities may exhibit very different statistical behaviors. We prove the consistency of our estimators under very general conditions and we show that a wide class of trawl processes satisfy them. This is done in particular by introducing a weighted weak dependence index that can be of independent interest. The broadband spectral estimator includes an estimator of the long memory parameter. We complete this work with numerical experiments to evaluate the finite sample size performance of this estimator for various integer valued discrete time trawl processes.


Introduction
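A discrete time trawl process $X = \{X_k, k \in \mathbb{Z}\}$ is defined in [Doukhan et al., 2019] by $X_k = \sum_{j \ge 0} \gamma_{k-j}(a_j)$, $k \in \mathbb{Z}$, (1.1) where (A-1) The sequence $(\gamma_k)_{k\in\mathbb{Z}}$ is a sequence of i.i.d. copies of a generic process $\gamma = \{\gamma(u), u \in \mathbb{R}\}$ and $a = \{a_j, j \ge 0\}$ is a sequence converging to zero.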
Processes so defined can be interpreted as discrete time versions of the trawl processes introduced in [Barndorff-Nielsen et al., 2014]. The generic process $\gamma$ is called the seed process and the sequence $a$ is called the trawl (height) sequence.
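To make the construction concrete, the following minimal sketch simulates a truncated trawl process. It assumes the representation $X_k = \sum_{j\ge 0} \gamma_{k-j}(a_j)$, a standard Poisson seed process, and the non-increasing trawl sequence $a_j = c\,(j+1)^{-\alpha}$. The function name `simulate_poisson_trawl`, the truncation level `n_lags` and the default values are illustrative choices and are not taken from the paper.

```python
import numpy as np

def simulate_poisson_trawl(n, alpha=1.5, c=1.0, n_lags=200, seed=None):
    """Simulate X_0, ..., X_{n-1} with X_k = sum_{j < n_lags} gamma_{k-j}(a_j).

    Assumptions (illustrative): gamma is a unit-rate Poisson process and
    a_j = c * (j + 1) ** (-alpha), truncated after n_lags terms.
    """
    rng = np.random.default_rng(seed)
    a = c * (np.arange(n_lags) + 1.0) ** (-alpha)   # a_0 > a_1 > ... > 0
    t = a[::-1]                                     # same values, ascending
    dt = np.diff(t, prepend=0.0)                    # time increments >= 0
    total = n + n_lags - 1                          # seed copies needed
    # Each row is one independent copy of the seed, evaluated at the times t
    # through independent Poisson increments.
    increments = rng.poisson(dt, size=(total, n_lags))
    gamma = np.cumsum(increments, axis=1)[:, ::-1]  # gamma[i, j] = gamma_i(a_j)
    k = np.arange(n)[:, None]
    j = np.arange(n_lags)[None, :]
    # Row i stores the copy with time index i - (n_lags - 1).
    return gamma[k - j + n_lags - 1, j].sum(axis=1)

x = simulate_poisson_trawl(500, alpha=1.5, seed=0)
```

The output is integer valued, as expected for a Poisson seed. Truncating the sum at `n_lags` removes the dependence beyond that lag, so `n_lags` should be taken large when long range dependence is the object of study.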
Additional assumptions are required to have a converging sum in (1.1). The convergence holds in $L^2$ under Condition (1.2), see [Doukhan et al., 2019, Proposition 1], where the covariance function is also given by the formula $\mathrm{Cov}(X_0, X_k) = \sum_{j \ge 0} \mathrm{Cov}(\gamma(a_j), \gamma(a_{j+k}))$. (1.3) By [Doukhan et al., 2019, Proposition 3], we moreover know that if, in addition, the two following asymptotic behaviors hold: $\mathrm{Cov}(\gamma(u), \gamma(v)) = (|u| \wedge |v|)\,(1 + o(1))$ as $u, v \to 0$, (1.4) $a_j = c\, j^{-\alpha^*} (1 + o(1))$ as $j \to \infty$, (1.5) with $c \neq 0$ and $\alpha^* > 1$, then the covariance function behaves at large lags as (1.6) In the following we will refer to $\alpha^*$ in (1.5) as the trawl exponent. In particular, if $1 < \alpha^* < 2$, (1.7) this behavior is often referred to as $X$ being long range dependent with long memory parameter $d^* = 1 - \alpha^*/2$. (1.8) Here (1.7) implies $d^* \in (0, 1/2)$ (sometimes referred to as positive long memory). We here use one of the several existing definitions of long range dependence, see for instance Condition II in [Pipiras and Taqqu, 2017, Section 2.1]. In fact, in the cases considered here, the same long memory parameter $d^*$ can also be defined through their Condition IV, based on the spectral density. In the case where $\alpha^* \ge 2$, the two definitions may no longer coincide. The definition of negative long memory ($d^* < 0$) generally relies on the behavior of the spectral density at the origin (in particular imposing this spectral density to vanish there). Adopting this definition, the formula (1.8) may not be valid anymore, as the obtained process could have short memory ($d^* = 0$) even if $\alpha^* > 2$, or have negative long memory ($d^* < 0$). In the following, we will only consider the case where $d^* \ge 0$, avoiding the negative long memory case for convenience.
A very interesting feature of trawl processes is that under the fairly general assumption (1.4) on the seed process, the low frequency behavior of the spectral density is mainly driven by the trawl sequence.However, it is shown in [Doukhan et al., 2019] that, for a given trawl sequence, two different seed processes can yield different large scale behaviors, as can be seen by different types of limits in the invariance principle.In the case of a Lévy seed for instance, a Brownian seed process leads to an invariance principle with fractional Brownian motion limit, with Hurst parameter (3 − α * )/2, and a (centered) Poisson seed process leads to an invariance principle with Lévy α * -stable limit, see [Doukhan et al., 2019, Theorems 1 and 2].
The goal of this paper is to investigate the spectral estimation of a long-range dependent process $X$ from a sample $X_1, \dots, X_n$. Deriving general results applying to a wide class of long range dependent trawl processes raises two major difficulties. First, as already noted about the asymptotic results derived in [Doukhan et al., 2019], the large scale behavior of such processes can be very different from one trawl process to another, even with similar or even equal covariance structure. Second, the spectral density has a closed form only in particular cases for the seed process and the trawl sequence. The computation of the spectral density function depends both on the seed process $\gamma$ and the sequence $(a_j)$. For instance, in [Doukhan et al., 2019, Example 5], it is shown that for a large class of seed processes (that will be referred to as the Lévy seed process below), a specific sequence $(a_j)$ leads to the same spectral density as an ARFIMA(0,$d^*$,0), namely, $f_{d^*}(\lambda) = \frac{1}{2\pi}\,|1 - e^{-i\lambda}|^{-2d^*}$. (1.9) Here the spectral density is normalized in such a way to have the innovation process with unit variance. The general form that we will assume on the spectral density includes of course a multiplicative constant $c^*$ but also an additive smooth function $h^*$ belonging to the space $C$ of continuous and $(2\pi)$-periodic functions endowed with the sup norm. Namely, to encompass as many cases as possible, we assume that $X$ has a spectral density function given by $f(\lambda) = c^*\,(f_{d^*}(\lambda) + h^*(\lambda))$, (1.10) where $d^* \in [0, 1/2)$, $c^* > 0$ and $h^* \in C$.
The form (1.10) is the spectral behavior corresponding to that of the covariance in (1.6). Here $d^*$ is again the long memory parameter, and it characterizes the power law behavior of $f$ at low frequencies while the function $h^*$ encompasses the short-range behavior. As we will see, in many cases of interesting trawl processes, the function $h^*$ is smooth in the Hölder sense, leading naturally to the additive parametric form (1.10) of the spectral density, which differs from the product parametric form usually encountered in linear models such as ARFIMA processes. Note however that such an additive form of the spectral density was already considered in [Hurvich et al., 2005] for completely different (non-linear) models.
We consider either pointwise or broadband estimation of the spectral density. In the first case, we estimate $f(\lambda)$ directly for a given $\lambda$, and, in the second case, we estimate the triplet $(c^*, d^*, h^*)$ by assuming it belongs to a known parameter set. The first approach only makes sense for $\lambda \neq 0$ and will be investigated in Section 2.2 using a smoothed version of the periodogram. The second approach will be investigated in Section 2.3. The estimation of the long memory parameter is a widely studied problem in statistical inference, see the reference book [Doukhan et al., 2002], or, more recently, [Giraitis et al., 2012] and the references therein. Here, we propose to estimate the parameter $(c^*, d^*, h^*)$ using a parametric Whittle approach. Define the periodogram (1.11) where $\bar X_n$ denotes the empirical mean of the sample $X_1, \dots, X_n$, and denote the Whittle contrast by (1.12) where $f_d$ is defined by (1.9) and $L$ is the Lebesgue measure on $[-\pi, \pi]$ divided by $2\pi$. Our estimator $(\hat d_n, \hat h_n, \hat c_n)$ is obtained by finding a near minimizer $(\hat d_n, \hat h_n)$ of $(d, h) \mapsto \Lambda_n(d, h)$ over a well chosen set of parameters for $(d, h)$, and then setting $\hat c_n = \int \frac{I_n}{f_{\hat d_n} + \hat h_n}\, dL$. (1.13) From this we can also define an estimator of the spectral density, namely, $\hat f_n = \hat c_n\,(f_{\hat d_n} + \hat h_n)$.
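The estimation scheme described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the periodogram is taken as $I_n(\lambda) = (2\pi n)^{-1} |\sum_k (X_k - \bar X_n) e^{-i\lambda k}|^2$, the contrast is taken in the usual reduced Whittle form $\ln \int I_n/(f_d + h)\, dL + \int \ln(f_d + h)\, dL$, the integrals are replaced by averages over the nonzero Fourier frequencies, and the minimization is a plain grid search over $d$ for a fixed function $h$. The names `periodogram`, `f_d` and `whittle_fit` are hypothetical.

```python
import numpy as np

def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*j/n, j = 1, ..., n-1
    (assumed normalization: 1 / (2*pi*n))."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dft = np.fft.fft(x - x.mean())
    lam = 2.0 * np.pi * np.arange(1, n) / n
    return lam, np.abs(dft[1:]) ** 2 / (2.0 * np.pi * n)

def f_d(lam, d):
    """ARFIMA(0, d, 0) spectral density with unit innovation variance."""
    return np.abs(1.0 - np.exp(-1j * lam)) ** (-2.0 * d) / (2.0 * np.pi)

def whittle_fit(x, d_grid, h=None):
    """Grid search over d for a fixed additive term h (h = 0 if None).

    Returns (d_hat, c_hat), with c_hat the average of I_n / (f_d + h).
    """
    lam, I = periodogram(x)
    h_vals = 0.0 if h is None else h(lam)
    best = None
    for d in d_grid:
        g = f_d(lam, d) + h_vals
        c = np.mean(I / g)                        # discretized integral of I_n / g dL
        contrast = np.log(c) + np.mean(np.log(g))
        if best is None or contrast < best[0]:
            best = (contrast, float(d), float(c))
    return best[1], best[2]

rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)                 # short memory: d* = 0, c* = 1
d_hat, c_hat = whittle_fit(noise, np.linspace(0.0, 0.49, 50))
```

In the setting of the paper, $h$ is itself estimated within a parameter set; the sketch fixes $h$ only to keep the example short.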
Here we derive results that apply to a wide class of trawl processes, in particular to those of nature quite different from the well studied class of Gaussian or linear processes.For convenience, we focus on proving the consistency of our estimators under very general assumptions, that can be of interest beyond trawl processes: (A-2) The process X = (X k ) k∈Z is stationary, ergodic and L 2 .
(A-3) There exist C 0 > 0 and s 0 ∈ (0, 1) such that, for all integers Assumption (A-2) is basically satisfied by all well defined discrete-time trawl processes.To show that a given trawl process satisfies (A-3) with a well chosen exponent s 0 , we will rely on a weighted weak dependence property that is easy to prove for discrete-time trawl processes.
The paper is organized as follows. In Section 2, we present successively: 1) general conditions on the seed process and the trawl sequence so that the corresponding trawl process satisfies Conditions (A-2) and (A-3) above, 2) general results on second order estimation under Assumption (A-3) and 3) a general consistency result on the parametric Whittle estimation of the parameters (d * , h * ) of the unknown spectral density in (1.9). For this estimation result to hold, we only require the observed process to satisfy (A-2). The assumption on the parameter set over which the Whittle contrast is minimized will be detailed in (A-4). We provide in Section 3 various examples of trawl processes. Although the usual causal linear models for long range dependence (such as ARFIMA processes) constitute specific examples of trawl processes, we here focus on the non-linear models introduced in [Doukhan et al., 2019], and specify simple sufficient conditions implying the assumptions used in the general results.
The proofs of the results presented in Sections 2 and 3 are detailed in Section 5. Before that, we introduce in Section 4 some weighted weak dependence coefficients that can be of independent interest but which will mainly serve us here to check (A-3) for trawl processes.
Finally in Section 6, we present numerical experiments focusing on the estimation of the long memory parameter d * comparing our approach to the more classical local Whittle estimator, which is known to perform well for standard linear models.Concluding remarks including directions for future work are proposed in Section 7.
2 Main results

Results on trawl processes
As explained in the introduction, the L 2 convergence of (1.1) follows from (1.2).We provide hereafter a more precise statement, and a slight extension to a convergence in L 2p with p ≥ 1.
All the proofs of this section are postponed to Section 5.1.
Lemma 1. Assume (A-1).Then (1.2) implies that the convergence (1.1) holds in L 2 and the resulting process X is ergodic.A centered version of X can be obtained by setting (2.18) If moreover, we have, for some p > 1, then the convergence (1.1) also holds in L 2p .
Having a condition for X to satisfy (A-2) and to be in $L^{2p}$, we now provide conditions for obtaining (A-3), which requires p ≥ 2 for the considered covariances to be well defined.

Second order estimation
In this section, we suppose that X is a weakly stationary process with auto-covariance r or spectral density f. All the proofs of this section are postponed to Section 5.2 for convenience.
The main assumption that we will require on X is (A-3).It is interesting to note that, if (1.14) holds then assuming (1.15) and (1.16) is equivalent to assuming The precise statement is the following.
Lemma 2. Let X be a weakly stationary process with zero mean such that (1.
We denote the empirical covariance function by (2.23) where Xn denotes the empirical mean of the sample X 1 , . . ., X n . The centering in the definition of r n can be treated separately. Define the non-centered covariance estimator (2.24) The empirical covariance function defined by (2.23) can then be written as where r n is the non-centered empirical covariance function defined in (2.24) and R n n (k) is the remainder term defined by This term is "small" only if X is a centered process. Nevertheless, X can be assumed centered here, since the empirical covariance r n is unchanged when X is replaced by its centered version.
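As an illustration of the centered empirical covariance discussed above, here is a minimal sketch. It assumes the usual biased normalization by $1/n$, that is, the sum of $(X_j - \bar X_n)(X_{j+k} - \bar X_n)$ over $1 \le j \le n-k$ divided by $n$; this normalization and the function name `empirical_autocovariance` are assumptions, since the corresponding display is not reproduced here.

```python
import numpy as np

def empirical_autocovariance(x, max_lag):
    """Centered empirical autocovariances at lags 0, ..., max_lag
    (assumed normalization: division by n, not by n - k)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()  # centering by the empirical mean
    return np.array([np.dot(xc[: n - k], xc[k:]) / n
                     for k in range(max_lag + 1)])

r = empirical_autocovariance([1.0, 2.0, 3.0, 4.0], max_lag=1)
# r[0] = 1.25 and r[1] = 0.3125 for this toy sample
```

Adding a constant to every observation leaves the output unchanged, which is the invariance by centering used in the text.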
In the case where X has mean zero, we have the following result.
Proposition 1.Let X be an L 4 process with zero mean and satisfying (A-3).Then there exists a constant C ′ only depending on C 0 , C 1 and s 0 such that, for all 0 ≤ k ≤ ℓ < n, (2.27) The following result follows.
Corollary 1. Let X be a weakly stationary L 4 process satisfying (A-3) with covariance function r. Then there exists a constant C ′ only depending on C 0 , C 1 and s 0 such that, for all (2.28) Another possible application of Proposition 1 is the pointwise kernel estimation of the spectral density f wherever it is well defined and smooth. Let $J$ denote a two times continuously differentiable function with support $[-1/2, 1/2]$ and such that $\int J = 1$. For any $\beta > 0$ and $\lambda_0 \in [-\pi, \pi]$, let $J_{\beta,\lambda_0}$ denote its $\lambda_0$-shifted, $\beta$-scaled and $(2\pi)$-periodic version: Define the kernel estimator of $f(\lambda_0)$ by $\hat f_{n,\beta}(\lambda_0) = \int_{\mathbb{T}} I_n\, J_{\beta,\lambda_0}$.
Let µ denote the spectral measure of X and suppose that it admits a density f in the neighborhood of λ 0 , and that this density is continuous at λ 0 .Then, it is easy to show that lim and the rate of convergence as β → 0 can be obtained from the smoothness index of f at λ 0 .
This deterministic limit can be interpreted as a control on the bias of the estimator fn,βn (λ 0 ) of f (λ 0 ).The deviation is bounded by the following result.
Corollary 2. Let µ be the spectral density associated to the covariance function r.Define I n and rn by (1.11) and (2.23).Let J be a kernel function as above and define the kernel estimator fn,β accordingly.Then (2.28) implies that there exists a constant C ′′ only depending on C ′ , r(0) and J such that, for any (2.30) As usual, the deviation bound (2.30) (where β = β n should not converge to 0 at a rate faster than n s 0 /2 ) has to be balanced with the convergence (2.29) (where β = β n should converge to 0, the faster the better).

Parametric Whittle estimation
Although h * is an unknown element in the infinite dimensional space C, our approach is parametric in nature in the sense that we now assume that (d * , h * ) belongs to a known compact subset K of [0, 1/2] × C. In practice, to get a good approximation of an element of C, only a finite number of its Fourier coefficients need to be estimated. More generally we denote by (K n ) a sequence of subsets of K in which we can always find (d * , h * n ) such that h * n approximates h * well for n large. More precisely we consider the following assumption.
on R, and let (K n ) be a sequence of subsets of K such that for a well chosen sequence Remark 1.If h ∈ K is parameterized by finitely many parameters, one can take K n = K for all n ≥ 1, in which case the last assertion of (A-4) is immediately satisfied for all (d * , h * ) ∈ K by taking h * n = h * for all n.
An infinite dimensional setting can be set up as follows.For any s, C > 0, let H(s, C) denote the ball of even, real and locally integrable (2π)-periodic functions h : R → R such that the Fourier coefficients For any non-negative integer m, let moreover P m denote the set of even real trigonometric polynomials of degree at most m.For any locally integrable (2π)-periodic function h, denote by p m [h] the projection of h onto P m , that is, For any s, C > 0, it is easy to show that sup as m → ∞.The following result can be used to build a parameter space K and a sequence for any diverging sequence (m n ) of integers larger than or equal to m 0 , Assumption (A-4) holds by setting The proof of this lemma is postponed to Section 5.3.We can now state the consistency of our estimator which, in the same flavor as in [Giraitis et al., 2012, Theorem 8.2.1],only requires the observed process to be ergodic.Its proof is also postponed to Section 5.3.
Theorem 2. Suppose that the process X satisfies (A-2) and admits a spectral density of the form (1.10) with parameter (c * , d * , h * ) satisfying (A-4) for some subsets K and where Λ n is defined by (1.12), and define ĉn by (1.13).Then, a.s., dn and ĉn converge to d * and c * , and ĥn converges to h * uniformly.
Assumption (A-4) provides a new framework of parametric models, different from the ones classically used in Whittle parameter estimation, and which seems to be well adapted for many examples of trawl processes, see Section 3. However it also includes many known cases. Let us examine the celebrated ARFIMA model, in which the spectral density takes the form where, for some positive integers p and q, the MA and AR coefficients $\theta^* = (\theta^*_1, \dots, \theta^*_p)$ and $\phi^* = (\phi^*_1, \dots, \phi^*_q)$ are assumed to make the corresponding ARMA process canonical. In the following this will be denoted by $(\phi, \theta) \in \Theta_{p,q}$, defined by $\Theta_{p,q} = \{(\phi, \theta) \in \mathbb{R}^{p+q} : \Phi \text{ and } \Theta \text{ have no common roots and, for all } z \in \mathbb{C} \text{ such that } |z| \le 1,\ \Phi(z) \neq 0 \text{ and } \Theta(z) \neq 0\}$, (2.33) where Φ and Θ are the AR and MA polynomials defined by The corresponding reduced Whittle contrast reads where f d is defined by (1.9) and L is the Lebesgue measure on [−π, π] divided by 2π. The form (2.32) is in fact a special case of (1.10) by setting (2.36) Note that h * is indeed continuous. The ARFIMA linear processes have been extensively studied. However, the usual proof of consistency relies on the approach of [Hannan, 1973], which does not hold if d = 0 is included in the set of parameters. Here, as a consequence of Theorem 2, we get the following, which provides an alternative proof.
Corollary 3. Let p, q be two positive integers and K be a compact subset of [0, 1/2) × Θ p,q .
3 Examples of discrete time trawl processes

Random line seed
As explained in [Doukhan et al., 2019, Example 1], any causal linear process is a trawl process by setting the seed process to be the random line seed $\gamma(t) = t\epsilon$, where $\epsilon$ is a random variable with zero mean and finite variance.
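As a small illustration of this special case, assume the trawl process is given by $X_k = \sum_{j\ge 0} \gamma_{k-j}(a_j)$. With the random line seed, $\gamma_{k-j}(a_j) = a_j \epsilon_{k-j}$, so that $X_k = \sum_{j \ge 0} a_j \epsilon_{k-j}$ is a causal moving average of the i.i.d. sequence $(\epsilon_k)$. The sketch below computes it for a finitely supported sequence $(a_j)$; the function name `random_line_trawl` is hypothetical.

```python
import numpy as np

def random_line_trawl(eps, a):
    """Return X_k = sum_j a[j] * eps[k - j] for every k with a full history,
    i.e. k = len(a) - 1, ..., len(eps) - 1."""
    eps = np.asarray(eps, dtype=float)
    a = np.asarray(a, dtype=float)
    # 'valid' keeps only the outputs that use all len(a) past innovations.
    return np.convolve(eps, a, mode="valid")

x = random_line_trawl([0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 0.5])
# x = [1.0, 2.5, 4.0, 5.5]
```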
The parametric estimation in the linear case is a well known topic, usually treated using the ARFIMA parametrization, see e.g. [Giraitis et al., 2012, Section 8.3.2] for a complete statistical analysis of this model.

Lévy seed and non-increasing sequence
Consider the two following assumptions (A-5) The process γ is a Lévy process with finite variance normalized so that Var γ(1) = 1.
They imply (1.2) since then we have, for all t ≥ 0, E γ(t) = δ t for some drift δ and Var γ(t) = t.
By Lemma 1 and Eq.(1.3), the trawl process X defined by (1.1) satisfies (A-2) and its autocovariance function r is given by If (A-5) and (A-6) hold and γ(1) admits a finite q-th moment, we easily have that, for all where κ q is the q-th order cumulant of γ(1).We then obtain So, by Lemma 2, if q = 4, X satisfies (A-3) with s 0 = α * − 1. Theorem 1 shows that Condition (A-3) continues to hold for more general trawl processes, provided some adequate moment conditions, but with s 0 possibly higher than α * − 1 (see Section 3.3 for examples).
For such a process, we can specify (a k ) so that the spectral density is of the form (1.10) with (d * , h * ) lying within a parameter space K satisfying Condition (A-4).A very special case, detailed in [Doukhan et al., 2019, Example 5], consists in setting where, for all d < 1/2, r (d) is defined as the auto-covariance function of ARFIMA(0, d, 0) with unit variance innovation, that is, It is shown in [Doukhan et al., 2019] that, for any d * ∈ (0, 1/2) such a sequence (a j ) satis- , so that, under (A-5), following (3.39) and (3.41), the corresponding trawl process has a spectral density of the form (1.10) with h * = 0.
We check in the following section that more general seed processes and trawl sequences can be used.

More general seeds and sequences
In this section, in contrast to (A-6), we consider trawl sequences $(a_j)$ that may not be nonincreasing but we specify (1.5) by assuming that there exist c > 0 and If moreover Eζ 6 < ∞, then (2.19) holds with p = 3, and, by Theorem 1 (ii), we get (A-3) with s 0 = (α * − 1)/2. If we only assume that Eζ 4 < ∞, then (2.21) holds with p = 2 and s 0 = (α * − 1)/4, so that Theorem 1 (i) gives that (A-3) holds this time only with The binomial seed process of [Doukhan et al., 2019, Example 4] is defined for some given $n \in \mathbb{N}^*$ by setting $\gamma(t) = \sum_{i=1}^{n} \mathbb{1}_{\{U_i \le t\}}$ with the $U_i$'s i.i.d. and uniform on [0, 1]. In this case, we have that, for all u ≥ 1, γ(u) = n and, for all Thus, for any sequence $(a_j)$ satisfying (3.42), similarly to the previous case, (A-2) holds and the trawl process X has auto-covariance function r given by where, for all $j \in \mathbb{N}$, $\tilde a_j = a_j \mathbb{1}_{\{a_j < 1\}}$. Also, for the binomial seed and $(a_j)$ satisfying (3.42), (2.19) holds for any integer p, and (A-3) holds with s 0 = (α * − 1)/2 by Theorem 1 (ii).
Having checked that the trawl process satisfies (A-2) and (A-3) for these seeds, we now turn to the form of its spectral density and show that it is indeed of the form (1.10) and can be used with Lemma 3 to form a parameter space K that satisfies (A-4).
Proposition 2. Assume (A-1). Suppose that γ is a Lévy seed process, a mixed Poisson seed process or a binomial seed process. Suppose moreover that γ(1) has finite positive variance and (a j ) satisfies (3.42) with α * ∈ (1, 2). Then the trawl process defined by (1.1) has a spectral density of the form (1.10) with Proof. See Section 5.1.

Weighted weak dependence indices
Here we introduce a somewhat general setting that will be used later to derive some important properties on the memory of trawl processes. The corresponding notions can, however, be of independent interest.
We use the classical weak-dependence concept.
Definition 1 ( [Dedecker et al., 2007]).A random process (X t ) t∈Z is said to be θ−weakly Definition 2. A time series (X k ) is said to be a causal Bernoulli shift process (CBS) if there exists an iid sequence (γ j ) j∈Z valued in (E, E) and a measurable function Φ : r ) r≥1 of (X k ) are then defined by where (γ ′ j ) j∈Z is an independent copy of (γ j ) j∈Z .
Provided that a CBS process is well defined in L 2 , it is weakly dependent.
Lemma 4. Let X be an L 2 centered CBS process.Then it is π (2) −weakly dependent.
Proof. We write On the other hand we have that And we conclude with the Cauchy-Schwarz inequality.
Using the same proof we can include polynomial terms in the functions f and g. ) , and all functions g : R v → R satisfying Remark 2. Note that in Definition 3, the conditions on f and g are weaker as p (±) increases.
Using this new definition, we get the following result.
Lemma 5. Let (X k ) be an L 2p centered CBS process.Then, for any p (±) ≥ 1 such that Proof.Let us now prove the bound of the p-weighted θ−weak dependence coefficient θ (p) r .We use the same notation as in the proof of Lemma 4 but this time with f and g as in Definition 3.
We immediately have that, by the assumption on f that .
Finally, we note that C ≤ 1 ∨ X 0 . The result follows from (4.47) and the above bounds of A, B and C.
We also obtain the following lemma, with an improved weighted weak dependence coefficient, at the cost of a slightly stronger moment condition.
Lemma 6.Let (X k ) be an L 2p centered CBS process.Then, for any p (±) ≥ 1 it is (p (−) , p (+) )-weighted θ-weakly dependent with where C p is a positive constant only depending on p and S 0 (2p) is defined in (5.51).
We immediately have that, by the assumption on f that .
Finally, we note that, similarly, C ′ ≤ 1 ∨ X 0 . The result follows from (4.49) and the above bounds of A ′ , B ′ and C ′ .

On trawl processes
Proof of Lemma 1.We prove the result under (1.2) and (2.19).The case where (2.19) is not assumed corresponds to setting p = 1 in the following.By the Rosenthal Inequality for sums of independent random variables, see [Petrov, 1995, Theorem 2.9], we have, for any 1 ≤ i ≤ k, for some constant C p only depending on p, (5.50) The convergence of (1.1) in L 2p follows.
It follows that we can write X k as with R R endowed with the σ-field B(R) ⊗R (the smallest one that makes the R R → R mapping x → x(t) measurable for all t ∈ R). Since (γ j ) j∈Z is i.i.d., it is ergodic, and so is (X k ) k∈Z .
All the other assertions of the lemma are obvious.
Using the same idea and the results of Section 4, we now prove Theorem 1.
It remains to show (1.15) and (1.16). We use that ( Xk ) defined in (2.18) can be written as the causal Bernoulli shift process where Φ is a measurable mapping on E N , with E = R R endowed with B(R) ⊗R . Then the L q coefficients defined in (4.44) with (γ ′ j ) j≥0 denoting an independent copy of (γ j ) j≥0 , satisfy, for all r ∈ N, and q ≥ 2, where the second inequality follows from (5.50) by setting (5.51) We now separate the two cases.
The following lemma is useful for proving Proposition 2.
Proof. First observe that, for all k ∈ N, (5.52) Now, there exists C > 0 such that for all j ∈ N, (5.53) It follows from the first inequality that, for all j ∈ N and k In particular, the latter term is larger than or equal to 1 for j large enough and it follows that there exists j 0 only depending on α and C such that, for all j ≥ j 0 and k ≥ where we used the second inequality of (5.53). This now implies that, for all k Since j 0 is fixed, the last two terms in the previous display are O(k −α ) as k → ∞ and we conclude from (5.52).
We can now provide the proof of Proposition 2.

Proof of Proposition 2.
From what precedes, we know that under these assumptions, the trawl process has an auto-covariance function of the form where A > 0, B ∈ R and $\tilde a_j = a_j \mathbb{1}_{\{a_j < \bar a\}}$ with $\bar a$ some positive constant. We treat the two terms in the right-hand side of (5.54) separately.
Term S: Since ãk = a k for k large enough, (ã k ) also satisfies Condition (3.42) and Lemma 7 gives that (5.55) Recall the definition of r (d) in (3.41).Define, for all k ≥ 0, where the second equality is derived in [Doukhan et al., 2019, Example 5].By [Giraitis et al., 2012, Theorem 72.1] and its proof, we have for any (5.56) Hence the previous equation and the definition of d * give that (5.57) And Condition (3.42) is equivalent to have with c * > 0 only depending on c and α * .Inserting this in (5.55) and using the definition of a * , we obtain This, with the definition (3.41) implies where h * S ∈ H(α * − 1, C S ) for some C S > 0. Term P : It only remains to prove that P defined in (5.54) satisfies P (k) = O(k −α * ) as k → ∞ (so that the associated Fourier series belongs to H(α * − 1, C P ) for some C P > 0).This follows immediately by observing that (3.42) with α * > 1 implies, for some constant C > 0 and all k ∈ N, This concludes the proof.

Convergence of the empirical covariance function
We start with the proof of Lemma 2.
Proof of Lemma 2. Let r denote the autocovariance function of (5.59) This, with the bound (1.14), allows to go back and forth from (1.15) or (1.16) to (2.22).
We can now prove Proposition 1.
Proof of Proposition 1. We have, using again the identity displayed in (2.22), (5.62) Using (1.14), we get that (5.61) and (5.62) are both less than or equal to where C > 0 only depends on C 0 and s 0 . To get (2.27), it thus only remains to show that a similar bound holds for the term appearing in (5.60). To this end we use the bound (2.22) that we have shown to hold under (A-3) in Lemma 2. More precisely we use the bound on the left-hand side of the ∧ sign in (2.22) in the first following case and the bound on the right-hand side of the ∧ sign for all the other cases: so is included in the first case.) Hence we get that the term in (5.60) is bounded from above by where C only depends on C 2 and s 0 .
Proof of Corollary 1.Since rn (k), I n , r and f are invariant by centering, we can assume in the following that X is centered without loss of generality.
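Let $\mathbb{T}$ denote $\mathbb{R}/2\pi\mathbb{Z}$. Recall that the spectral measure $\mu$ of a weakly stationary process X is a finite measure on $\mathbb{T}$ such that the covariance function of X satisfies $r(k) = \int_{\mathbb{T}} e^{i\lambda k}\, \mu(d\lambda)$, $k \in \mathbb{Z}$.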
We derive the following useful lemma.
Lemma 8. Let X be a weakly stationary process with spectral measure µ and define the periodogram and the empirical covariance by (1.11) and (2.23). Let h : R → R be a (2π)-periodic bounded function. Then, for all 0 ≤ m < n and Replacing h m by its definition, we get Then, by definition of $\epsilon_m$, we have
We can now prove Corollary 2.
Proof of Corollary 2. For β small enough, since J is compactly supported, we have, for all where $J^*(\xi) = \int J(x)\, e^{ix\xi}\, dx$ is the Fourier transform of J. Since J is two times continuously differentiable and has compact support, we have has absolutely summable Fourier coefficients and the following identity holds Applying Lemma 8 with m = n − 1 and c k = c k (J β,λ 0 ) we get that Applying (2.28) and $|J^*(\xi)| = O(|\xi|^{-2})$, we get where C 1 only depends on J. The result then follows from the fact that lim n→∞,β→0

Consistency of parametric Whittle estimation
We first introduce some notation valid throughout this section and derive useful lemmas.For any d ∈ R and ǫ > 0, we define (5.64) Finally we denote (5.65) We now introduce the useful lemmas.
Lemma 9. Let a > 0 and f : R → R be (2π)-periodic.Let g : R → R + be (2π)-periodic and such that g dL > 0. Then h → ln Proof.We apply successively that, for all 0 < x ≤ y, We obtain, for all h, h ∈ C, hence is bounded from above by the min of the two last right-hand sides.Taking the difference of the log's then yields Hence we get the first assertion.
Similarly, we get that, for all h, h ∈ C, And we get the second assertion.
Lemma 11.Let d * < 1/2, h * ∈ C and a > 0. For all d ∈ R and h ∈ C, we have lim Proof.By Lemma 9, we have for all d ′ ∈ R and h, h ∈ C, Then using (5.64) and Lemma 10, we get the first assertion.The second assertion is proved similarly.
Lemma 12. Let K be a compact subset of R × C such that, for all (d, h) ∈ K, f d + h > 0 on R. Suppose moreover that where for any Proof.Let which is finite since K is compact and (h, λ) → h(λ) continuous.
Case 1) Suppose that λ = 0. Since the mapping (d Case 2) Suppose that d < 0. This case is similar to Case 1): it is sufficient to show that ) is continuous on (−∞, 0)×C ×R, which follows from the continuity of (x, u) → u x on (0, ∞) × R + , which is easy to establish.
Proof of Lemma 3. We first recall why H(s, C) is a compact subset of C. For all u, v ∈ R, we have We get that, for all h ∈ H(s, C) and u, v ∈ R, where the O does not depend on h.where K 0 is defined by We start with the proof of (5.68).By definition of a K in (5.65), we have, for all (d, h) ∈ K, (5.71) And by Lemma 12, a K > 0. Note that I n dL = rn (0)/(2π) > 0 for n large enough, a.s.
Applying Lemma 9 with a = a K and since h * n converges to h * uniformly by (A-4), we get Since 1/((f d * +h * )∨a K ) is continuous and X is ergodic with spectral density f given by (1.10), we have (see e.g. [Giraitis et al., 2012, Theorem 8.2.1]). By definition of a K , this limit is c * and, with the three previous displayed equations, we get (5.68).
We conclude with the proof of (5.69), given some ǫ 0 > 0. Equations (5.71) and (5.64) and Lemma 9 with a = a K yield, for all (d, h) and (d The last two displays give that, for all (d, h) ∈ K and ǫ > 0, lim inf Since K 0 is compact, by Lemma 11, we can find η > 0 such that inf Applying Lemma 10 with a = a K , we get that, for all (d, h) ∈ K 0 , there exists ǫ > 0 such Since K 0 is compact, we can thus cover K 0 with a finite collection (B i ) i=1,...,N , for which, for any i = 1, . . ., N , there exists Letting n tend to ∞ and then ǫ to zero (using Lemma 10), we get that By definition of a K , the latter integral is c * and the proof is concluded.and (2.38), if we can apply Theorem 2, then we also get that σ2 n is a consistent estimator of σ 2 * .Finally, it only remains to explain how to get that θn converges to ϑ * a.s.This follows from the assertion that ( dn , ĥn ) = Ψ( dn , θn ) converges a.s. to (d * , h * ) = Ψ(d * , ϑ * ), provided that Ψ can be continuously inversed on K. To summarize, to conclude the proof, we only need to prove the following assertion.
(b) The mapping Ψ is bijective and bi-continuous from K to K (its range).

Define the mapping
where, for any C > 0, Hence Assertion (b) follows from Assertion (iii) along with the following facts.
(iv) For every compact subset K 0 ⊂ Θ p,q there exists C > 0 such that the range R(K 0 ) ⊂ A(C).
Proof of Assertion (i): This follows directly from the definitions of Λ n and Λn in (1.12) and (2.34) and the well-known fact that, for all d ∈ (−1/2, 1/2) and ϑ ∈ Θ p,q , ln(

Proof of Assertion (ii): This is simple algebra using the above definitions.

Estimation of the trawl exponent
To test our new estimator, we compare it with the local Whittle estimator for long range dependent sequences ([Robinson, 1995]). First, let us recall the definition of the local Whittle estimator.
Here the Hurst exponent is $H = (3 - \alpha)/2$ and the spectral density satisfies $f(\lambda) \approx c\,\lambda^{\alpha-2}$ near the origin. Let $\lambda_j = 2j\pi/n$ denote the canonical frequencies for $1 \le j \le n/2$, where $n$ is the sample size.
The local Whittle contrast $R(\alpha)$ is defined, for a given bandwidth parameter $m \le n/2$, from the periodogram ordinates $I_n(\lambda_1), \dots, I_n(\lambda_m)$, where $I_n$ is the usual periodogram, see (1.11). The local Whittle estimator $\hat\alpha_{\mathrm{LW}}$ is then computed through numerical minimization of $R(\alpha)$ over $\alpha \in [1, 2]$. In the non-linear case, such as trawl processes with Poisson or binomial seed, the use of such an estimator is theoretically justified in [Dalla et al., 2005] under the assumption $\lim_{n\to\infty} \left( m/n + 1/m \right) = 0$. The parametric Whittle estimator that we use is based on the parameterization (1.10).
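The displayed formula of the contrast is not reproduced in this passage. As a minimal sketch, one may assume the standard form of [Robinson, 1995], rewritten with $2d = 2 - \alpha$, namely $R(\alpha) = \ln\big(m^{-1}\sum_{j=1}^m \lambda_j^{2-\alpha} I_n(\lambda_j)\big) + (\alpha - 2)\, m^{-1}\sum_{j=1}^m \ln \lambda_j$. The function names below are hypothetical, the periodogram normalization $1/(2\pi n)$ is an assumption (the minimizer does not depend on it), and golden-section search is only one possible choice of numerical minimizer; it is not necessarily the one used for the experiments reported here.

```python
import cmath
import math
import random


def periodogram(x, m):
    """Periodogram at lambda_j = 2*pi*j/n, j = 1..m (direct DFT, assumed 1/(2*pi*n) scaling)."""
    n = len(x)
    values = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        dft = sum(x[t] * cmath.exp(-1j * lam * t) for t in range(n))
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return values


def local_whittle_contrast(alpha, freqs, pgram):
    """Assumed contrast: Robinson's R(d) with 2d replaced by 2 - alpha."""
    m = len(freqs)
    scale = sum(lam ** (2.0 - alpha) * i for lam, i in zip(freqs, pgram)) / m
    mean_log = sum(math.log(lam) for lam in freqs) / m
    return math.log(scale) + (alpha - 2.0) * mean_log


def minimize_contrast(freqs, pgram, lo=1.0, hi=2.0, tol=1e-6):
    """Golden-section search on [lo, hi]; valid because the contrast is convex in alpha."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if local_whittle_contrast(c, freqs, pgram) < local_whittle_contrast(d, freqs, pgram):
            b = d  # the minimizer lies in [a, d]
        else:
            a = c  # the minimizer lies in [c, b]
    return 0.5 * (a + b)


def local_whittle(x, m):
    """Local Whittle estimate of alpha from the sample x with bandwidth m <= n/2."""
    n = len(x)
    freqs = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    return minimize_contrast(freqs, periodogram(x, m))


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(256)]  # toy i.i.d. input, not a trawl process
    print(local_whittle(sample, m=32))
```

The search is restricted to $[1, 2]$ as in the text, so the returned value sits at an endpoint whenever the unconstrained minimizer falls outside this interval.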
In our setting, both the local Whittle estimator $\hat\alpha_{\mathrm{LW}}$ and the parametric Whittle estimator $\hat\alpha_{\mathrm{PW}}$ rely on tuning parameters, respectively denoted by $m$ and $N$. Observe that $N$ and $m$ have very different interpretations. As the bandwidth parameter $m$ increases, a larger range of frequencies is used in the estimation, thus reducing the variance, and the estimator relies on the approximation $f(\lambda) \approx c\,\lambda^{\alpha-2}$ also over a larger range of frequencies, thus worsening the bias. In contrast, as $N$ increases, we expect the variance to increase, since the number of parameters to estimate for $h$ is larger, and the bias to decrease, since the approximation of $h$ by a trigonometric polynomial is more accurate.

Results
We now compare the two estimators. Each one depends on a hyperparameter that has to be chosen: the bandwidth m for the local Whittle estimator and the number N of Fejér kernels for the parametric estimator. We report our results as a function of the choice of these hyperparameters.
For each experiment, we write in bold the choice of hyperparameters minimizing the mean squared error, that is, the sum of the squared bias and the variance. In all cases, but especially for the Binomial seed, we can see in the following tables that our estimator outperforms the local Whittle estimator. A suitable number of kernels seems to be between 3 and 5. Better results are sometimes obtained with a larger number of kernels, but this may be due to local minima reached by the numerical optimization.
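The selection rule used for the bold entries can be sketched as follows. This is only an illustration of the criterion (squared bias plus variance over Monte Carlo replications); the function names and the toy numbers are hypothetical and are not results from the experiments.

```python
from statistics import fmean, pvariance


def mse_summary(estimates, true_value):
    """Return (bias, variance, MSE) of Monte Carlo estimates of a known parameter."""
    bias = fmean(estimates) - true_value
    # Population variance, so that bias**2 + variance equals the mean of squared errors.
    variance = pvariance(estimates)
    return bias, variance, bias ** 2 + variance


def best_hyperparameter(results, true_value):
    """results maps a hyperparameter value (m or N) to its list of estimates."""
    return min(results, key=lambda h: mse_summary(results[h], true_value)[2])


if __name__ == "__main__":
    # Hypothetical replications for two choices of N, with true exponent 1.5.
    toy = {3: [1.45, 1.55, 1.50], 10: [1.20, 1.90, 1.60]}
    print(best_hyperparameter(toy, true_value=1.5))
```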

Conclusion
In this paper the consistency of pointwise and broadband spectral estimators has been proved under general conditions. We show in particular that a wide class of trawl processes satisfy these conditions. However, in view of the sample mean behaviors exhibited in [Doukhan et al., 2019], finer results on the asymptotic behavior of these estimators should be treated under more specific assumptions. To the best of our knowledge, very few results are available for non-linear long-range dependent trawl processes. The rate of a wavelet based semiparametric estimator of the long-range dependence parameter is studied in [Fay et al., 2007] for so-called infinite source Poisson processes, which can be seen as specific trawl processes with Poisson seed. A first step for future work could be to study the asymptotic behavior of such an estimator.
We also consider the non-Lévy seed processes introduced in [Doukhan et al., 2019], for which the covariance structure can still be derived precisely. Let us examine here the mixed Poisson seed and the Binomial seed processes of their Examples 3 and 4. The first case extends the (thus Lévy) Poisson seed by setting γ(t) = N(ζt), where N is a homogeneous Poisson counting process with unit intensity and ζ is a positive random variable independent of N and with finite variance. Then we have, for all u, v ≥ 0, E γ(u) = u Eζ and Cov(γ(u), γ(v)) = (u ∧ v) Eζ + uv Var(ζ). Thus, for any sequence (a j ) satisfying (3.42), Condition (1.2) holds, (A-2) follows from Lemma 1, and Eq. (1.3) yields the following auto-covariance function for X:
$\sum_{j\ge0} (b_j \wedge b_{j+k}) \le \sum_{j\ge0} b_{j+k}$.
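The covariance formula for the mixed Poisson seed can be checked by conditioning on ζ. The following short derivation is a sketch. Its last line assumes that (1.3) reads $r(k) = \sum_{j\ge0} \mathrm{Cov}(\gamma(a_j), \gamma(a_{j+k}))$ and that the $a_j$ are nonnegative, neither of which is restated in this passage.

```latex
% Given \zeta, t \mapsto N(\zeta t) is a Poisson process with intensity \zeta, so
% E[\gamma(u) | \zeta] = \zeta u  and  Cov(\gamma(u), \gamma(v) | \zeta) = \zeta (u \wedge v).
\begin{align*}
\mathrm{Cov}(\gamma(u),\gamma(v))
  &= \mathbb{E}\big[\mathrm{Cov}(\gamma(u),\gamma(v)\mid\zeta)\big]
   + \mathrm{Cov}\big(\mathbb{E}[\gamma(u)\mid\zeta],\,\mathbb{E}[\gamma(v)\mid\zeta]\big) \\
  &= (u\wedge v)\,\mathbb{E}\zeta + uv\,\mathrm{Var}(\zeta), \\
r(k) &= \mathbb{E}\zeta \sum_{j\ge0} (a_j \wedge a_{j+k})
      + \mathrm{Var}(\zeta) \sum_{j\ge0} a_j\,a_{j+k}
  \qquad \text{(assumed form of (1.3))}.
\end{align*}
```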

Remark 3.
Condition (5.66) means that any parameter (0, h) in K is isolated from parameters (d, h) with d < 0. This assumption cannot be avoided in Lemma 12. (As a counterexample, take K = {(d, −d) : d ∈ [−1/2, 0]}, where −d is seen as the function in C that is constant equal to −d.) It is of course trivially satisfied if K ⊂ R + × C.
i) We have (2.19) with p = 2 and , as r → ∞,

By the Arzelà-Ascoli theorem, we get that H(s, C) is a compact subset of C. It follows that A, as a closed subset of [0, 1/2] × C, is also compact. Thus by Lemma 12, there exists a K > 0 such that f d + h ≥ a K for all (d, h) ∈ A. Since sup |p m [h] − h| tends to 0 uniformly in h ∈ H(s, C) as m → ∞, we get that there exists a positive integer m 0 such that f d + p m [h] ≥ a K /2 > 0 on R for all (d, h) ∈ A and m ≥ m 0 . Let K n and K be defined as in the lemma for some diverging sequence (m n ) of integers larger than or equal to m 0 . It is straightforward to show that K is compact (because for any increasing or constant sequence (α k ) k∈N of integers and any sequence ((d k , h k )) k∈N valued and converging in A, we have that (d k , p mα k [h k ]) converges in K). Assumption (A-4) easily follows by setting h * n = p mn [h * ].
s. (5.72) For all (d, h) ∈ K 0 , since f d + h and f d * + h * do not coincide almost everywhere, the Jensen inequality and the definition of Λ * give that ln c *