The Trace Problem for Toeplitz Matrices and Operators and its Impact in Probability

The trace approximation problem for Toeplitz matrices and its applications to stationary processes date back to the classic book by Grenander and Szegö, "Toeplitz Forms and Their Applications". It has since been studied extensively in the literature. In this paper we provide a survey and unified treatment of the trace approximation problem, both for Toeplitz matrices and for Toeplitz operators, and describe applications to discrete- and continuous-time stationary processes. The trace approximation problem serves as a tool for studying many probabilistic and statistical topics for stationary models. These include central and non-central limit theorems and large deviations of Toeplitz type random quadratic functionals, parametric and nonparametric estimation, prediction of the future value based on the observed past of the process, etc. We review and summarize the known results concerning the trace approximation problem, prove some new results, and provide a number of applications to discrete- and continuous-time stationary time series models with various types of memory structures, such as long memory, anti-persistent and short memory.


Introduction
Toeplitz matrices and operators, which are of great independent interest and have a wide range of applications in different fields of science (economics, engineering, finance, hydrology, physics, signal processing, etc.), arise naturally in the context of stationary processes. This is because the covariance matrix (resp. operator) of a discrete-time (resp. continuous-time) stationary process is a truncated Toeplitz matrix (resp. operator) generated by the spectral density of that process. Conversely, any non-negative summable function generates a Toeplitz matrix (resp. operator) and can be regarded as the spectral density of some discrete-time (resp. continuous-time) stationary process; the corresponding truncated Toeplitz matrix (resp. operator) is then the covariance matrix (resp. operator) of that process.
Truncated Toeplitz matrices and operators are of particular importance and serve as tools for studying many topics in the spectral and statistical analysis of discrete- and continuous-time stationary processes, such as central and non-central limit theorems and large deviations of Toeplitz type random quadratic forms and functionals, estimation of the spectral parameters and functionals, asymptotic expansions of the estimators, hypothesis testing about the spectrum, prediction of the future value based on the observed past of the process, etc. (see, e.g., [1] - [7], [9,11,12,14,15], [17] - [66], and references therein).
imsart-ps ver. 2011/11/15 file: GST-ps.tex date: May 11, 2014
The present paper is devoted to the problem of approximation of the traces of products of truncated Toeplitz matrices and operators generated by integrable real symmetric functions defined on the unit circle (resp. on the real line). We discuss estimation of the corresponding errors, and describe applications to discrete- and continuous-time stationary time series models with various types of memory structures (long-memory, anti-persistent and short-memory).
The paper contains a number of new theorems both for Toeplitz matrices and operators, which are stated in Sections 2 and 3, respectively. Those in Section 3 involving Toeplitz operators are proved in Section 5. The corresponding theorems of Section 2 involving Toeplitz matrices can be proved in a similar way, and hence their proofs are omitted. Section 4 concerns applications and also contains some new results. These are proved within the section.
The paper is organized as follows. In the remainder of this section we state the trace approximation problem and describe the statistical model. In Section 2 we discuss the trace problem for Toeplitz matrices. Section 3 considers the same problem for Toeplitz operators. Section 4 is devoted to some applications of the trace problem to discrete- and continuous-time stationary processes. In Section 5 we outline the proofs of Theorems 3.1-3.4. An appendix contains the proofs of technical lemmas.

The Trace Approximation Problem
The problem of approximating traces of products of truncated Toeplitz matrices and operators can be stated as follows.
In this paper we review and summarize the known results concerning Problems (A) and (B), prove some new results, and provide a number of applications to discrete- and continuous-time stationary time series models with various types of memory structures (short, intermediate, and long memory).
We focus on the following special case, which is important from the application viewpoint and is commonly discussed in the literature: m = 2ν, $\tau_k = 1$, k = 1, . . ., m (or $\tau_k = (-1)^k$, k = 1, . . ., m), with the generating functions specified by (1.5).
Throughout the paper the letters C and c, with or without an index, denote positive constants whose values can vary from line to line. Also, all functions defined on 𝕋 = (−π, π] are assumed to be 2π-periodic and periodically extended to R.

The Model: Short, Intermediate and Long Memory Processes.
Let {X(u), u ∈ U} be a centered, real-valued, continuous-time or discrete-time second-order stationary process with covariance function r(u), possessing a spectral density function f(λ), λ ∈ Λ, that is, E[X(u)] = 0, r(u) = E[X(t + u)X(t)], u, t ∈ U, and r(u) and f(λ) are connected by the Fourier integral:
$$r(u) = \int_{\Lambda} e^{i\lambda u} f(\lambda)\, d\lambda, \qquad u \in U. \tag{1.6}$$
Thus, the covariance function r(u) and the spectral density function f(λ) are equivalent specifications of the second-order properties of a stationary process X(u).
The time domain U is the real line R in the continuous-time case, and the set of integers Z in the discrete-time case. The frequency domain Λ is R in the continuous-time case, and Λ = 𝕋 = (−π, π] in the discrete-time case. In the continuous-time case the process X(u) is also assumed to be measurable and mean-square continuous: $E[X(t) - X(s)]^2 \to 0$ as t → s.
The statistical and spectral analysis of stationary processes requires two types of conditions on the spectral density f(λ). The first type controls the singularities of f(λ) and involves the dependence (or memory) structure of the process, while the second type controls the smoothness of f(λ).
Dependence (memory) structure of the model. We will distinguish the following types of stationary models: (a) short memory (or short-range dependent), (b) long memory (or long-range dependent), (c) intermediate memory (or anti-persistent). The memory structure of a stationary process is essentially a measure of the dependence between all the variables in the process, considering the effect of all correlations simultaneously. Traditionally, memory structure has been defined in the time domain in terms of decay rates of long-lag autocorrelations, or in the frequency domain in terms of rates of explosion of low-frequency spectra (see, e.g., Beran [7], Guegan [38], Robinson [52], and references therein).
It is convenient to characterize the memory structure in terms of the spectral density function.
Short-memory models. Much of statistical inference is concerned with short-memory stationary models, where the spectral density f(λ) of the model is bounded away from zero and infinity, that is, there are constants $C_1$ and $C_2$ such that $0 < C_1 \le f(\lambda) \le C_2 < \infty$. A typical example of a short-memory model is the stationary Autoregressive Moving Average (ARMA(p, q)) process, whose covariance r(u) is exponentially bounded, that is, $|r(u)| \le C\rho^{|u|}$ for some constants $C > 0$ and $0 < \rho < 1$, and whose spectral density is a rational function.
A long-memory model is defined to be a stationary process with unbounded spectral density, and an anti-persistent model is a stationary process with vanishing spectral density.
In the discrete context, a basic long-memory model is the Autoregressive Fractionally Integrated Moving Average (ARFIMA(0, d, 0)) process X(t) defined by $(1 - B)^d X(t) = \varepsilon(t)$ with 0 < d < 1/2, where B is the backshift operator, BX(t) = X(t − 1), and ε(t) is a discrete-time white noise. The spectral density of X(t) is given by (1.7). A typical example of an anti-persistent model is the ARFIMA(0, d, 0) process with d < 0.
Data can also occur in the form of a realization of a "mixed" short-long-intermediate-memory stationary process X(t) whose spectral density combines $f_I(\lambda)$, $f_L(\lambda)$ and $f_S(\lambda)$, the intermediate-, long- and short-memory components, respectively. A well-known example of such a process, which appears in many applied problems, is an ARFIMA(p, d, q) process with spectral density (1.8), where h(λ) is the spectral density of an ARMA(p, q) process. Observe that for 0 < d < 1/2 the model X(t) specified by (1.8) displays long memory, for d < 0 it displays intermediate memory, and for d = 0 short memory. For d ≥ 1/2 the function f(λ) is not integrable, and thus it cannot represent the spectral density of a stationary process. Also, if d ≤ −1, then the series X(t) is not invertible, in the sense that it cannot be used to recover the white noise ε(t) by passing X(t) through a linear filter (see [9,11]).
The ARFIMA(p, d, q) processes, first introduced by Granger and Joyeux [36] and Hosking [41], became very popular because they characterize well the long-run properties of many economic and financial time series. They are also very useful for modeling multivariate time series, since they are able to capture a larger number of long-term equilibrium relations among economic variables than the traditional multivariate ARIMA models (see, e.g., Henry and Zaffaroni [40] for a survey on this topic).
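The classification by d described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the spectral density is normalized as $\frac{\sigma^2}{2\pi}|1 - e^{i\lambda}|^{-2d}$ (so that its integral over (−π, π] equals r(0); the normalization in (1.7) may differ by a constant), and the autocovariances use the standard closed form $r(0) = \sigma^2\,\Gamma(1-2d)/\Gamma(1-d)^2$ with the recursion $r(k) = r(k-1)(k-1+d)/(k-d)$.

```python
import math


def memory_type(d):
    """Classify the memory of an ARFIMA(p, d, q) model by d, following the text."""
    if d >= 0.5:
        raise ValueError("d >= 1/2: f is not integrable, no stationary process")
    if d <= -1.0:
        raise ValueError("d <= -1: the series is not invertible")
    if d > 0:
        return "long memory"
    if d < 0:
        return "intermediate memory"
    return "short memory"


def arfima_spectral_density(lam, d, sigma2=1.0):
    """Assumed normalization: sigma2 / (2 pi) * |1 - exp(i lam)|^(-2d)."""
    # |1 - exp(i lam)| = 2 |sin(lam / 2)|
    return sigma2 / (2.0 * math.pi) * (2.0 * abs(math.sin(lam / 2.0))) ** (-2.0 * d)


def arfima_autocov(d, n_lags, sigma2=1.0):
    """Autocovariances r(0), ..., r(n_lags) of ARFIMA(0, d, 0), -1 < d < 1/2."""
    memory_type(d)  # validates the admissible range of d
    r = [sigma2 * math.gamma(1.0 - 2.0 * d) / math.gamma(1.0 - d) ** 2]
    for k in range(1, n_lags + 1):
        r.append(r[-1] * (k - 1.0 + d) / (k - d))
    return r


if __name__ == "__main__":
    for d in (-0.3, 0.0, 0.3):
        r = arfima_autocov(d, 3)
        print(d, memory_type(d), [round(x, 4) for x in r])
```

For d > 0 the autocovariances are positive and decay slowly, for d < 0 they are negative at all non-zero lags, and for d = 0 the process reduces to white noise.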
Another important long-memory model is the fractional Gaussian noise (fGn). To define fGn, first consider the fractional Brownian motion (fBm) $\{B(t) := B_H(t), t \in R\}$ with Hurst index H, 0 < H < 1, defined to be a centered Gaussian H-self-similar process having stationary increments, that is, $B_H(t)$ satisfies the corresponding self-similarity and stationary-increment conditions, where $\sigma_0^2 = \mathrm{Var}\,B(1)$ and the symbol $\stackrel{d}{=}$ stands for equality of the finite-dimensional distributions. Then the increment process, called fractional Gaussian noise (fGn), is a discrete-time centered Gaussian stationary process with an explicit covariance function and with spectral density function given by (1.10), where c is a positive constant.
It follows from (1.10) that $f(\lambda) \sim c\,|\lambda|^{1-2H}$ as λ → 0, that is, f(λ) blows up if H > 1/2 and tends to zero if H < 1/2. Also, comparing (1.7) and (1.10), we observe that, up to a constant, the spectral density of fGn has the same behavior at the origin as that of ARFIMA(0, d, 0) with d = H − 1/2. Thus, the fGn displays long memory if H > 1/2 and is anti-persistent if H < 1/2. For more details we refer to Samorodnitsky and Taqqu [57] and Taqqu [61].
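As a small numerical illustration of the fGn model, the sketch below evaluates the standard fGn covariance $r(k) = \frac{\sigma_0^2}{2}\big(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\big)$ and the correspondence d = H − 1/2 with ARFIMA(0, d, 0). The function names are hypothetical, and the covariance formula is the well-known one for increments of fBm, quoted here as an assumption because the displayed formula is not reproduced in the text above.

```python
def fgn_autocov(k, hurst, sigma2=1.0):
    """Covariance r(k) of fractional Gaussian noise with Hurst index 0 < H < 1."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * sigma2 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def arfima_d_from_hurst(hurst):
    """ARFIMA(0, d, 0) parameter with the same low-frequency behavior as fGn."""
    return hurst - 0.5


if __name__ == "__main__":
    for H in (0.3, 0.5, 0.8):
        lags = [round(fgn_autocov(k, H), 4) for k in range(4)]
        print("H =", H, "d =", arfima_d_from_hurst(H), "r(0..3) =", lags)
```

For H = 1/2 the noise is uncorrelated, for H > 1/2 the covariances are positive and decay like $H(2H-1)k^{2H-2}$, and for H < 1/2 they are negative at non-zero lags.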
Continuous-time long-memory and anti-persistent models. In the continuous context, a basic process which has commonly been used to model long-range dependence is the fractional Brownian motion (fBm) $B_H$ with Hurst index H, defined above. It can be regarded as a Gaussian process having the spectral density (1.11), where (1.11) is understood in a generalized sense, since the fBm $B_H$ is a nonstationary process (see, e.g., Anh et al. [3] and Gao et al. [20]).
Comparing (1.11) and (1.12), we observe that the spectral density of fBm is the limiting case, as β → 0, of that of the fractional Riesz-Bessel motion (fRBm) with Hurst index H = α − 1/2.

A Link Between Stationary Processes and the Trace Problem.
As mentioned above, Toeplitz matrices and operators arise naturally in the theory of stationary processes and serve as tools for studying many topics in the spectral and statistical analysis of discrete- and continuous-time stationary processes.
To understand the relevance of the trace approximation problem to stationary processes, consider a question concerning the asymptotic distribution (as T → ∞) of the following Toeplitz type quadratic functionals of a Gaussian stationary process {X(u), u ∈ U} with spectral density f(λ), λ ∈ Λ, and covariance function $r(t) := \hat f(t)$, t ∈ U (here U and Λ are as in Section 1.2):
$$Q_T := \begin{cases} \displaystyle\int_0^T\!\!\int_0^T \hat g(t-s)\,X(t)X(s)\,dt\,ds & \text{in the continuous-time case,}\\[4pt] \displaystyle\sum_{k=1}^T\sum_{j=1}^T \hat g(k-j)\,X(k)X(j) & \text{in the discrete-time case,} \end{cases} \tag{1.13}$$
where $\hat g(t) = \int_{\Lambda} e^{i\lambda t} g(\lambda)\, d\lambda$, t ∈ U, is the Fourier transform of some real, even, integrable function g(λ), λ ∈ Λ. We will refer to g(λ) as a generating function for the functional $Q_T$.
The limit distributions of the functionals (1.13) are completely determined by the spectral density f(λ) and the generating function g(λ), and depending on their properties the limit distributions can be either Gaussian (that is, $Q_T$ with an appropriate normalization obeys a central limit theorem) or non-Gaussian. The following two questions arise naturally: a) Under what conditions on f(λ) and g(λ) will the limits be Gaussian? b) Describe the limit distributions, if they are non-Gaussian. These questions will be discussed in detail in Section 4.2.
Let $A_T(f)$ be the covariance matrix (or operator) of the process {X(u), u ∈ U}, that is, $A_T(f)$ denotes either the T × T Toeplitz matrix or the T-truncated Toeplitz operator generated by the spectral density f, and let $A_T(g)$ denote either the T × T Toeplitz matrix or the T-truncated Toeplitz operator generated by the function g.
Our study of the asymptotic distribution of the quadratic functionals (1.13) is based on the following well-known results (see, e.g., [37,42]):
1. The quadratic functional $Q_T$ in (1.13) has the same distribution as the sum $\sum_{k \ge 1} \lambda_k \xi_k^2$, where $\{\xi_k, k \ge 1\}$ are independent N(0, 1) Gaussian random variables and $\{\lambda_k, k \ge 1\}$ are the eigenvalues of the operator $A_T(f)A_T(g)$. (Observe that the sets of non-zero eigenvalues of the operators $A_T(f)A_T(g)$ and $A_T(g)A_T(f)$ coincide.)
3. The k-th order cumulant $\chi_k(\cdot)$ of $Q_T$ is given by $\chi_k(Q_T) = 2^{k-1}(k-1)!\,\mathrm{tr}\,[A_T(f)A_T(g)]^k$.
Thus, to describe the asymptotic distributions of the quadratic functionals (1.13), we have to control the corresponding traces of the products of Toeplitz matrices (or operators), yielding the trace approximation problem with generating functions specified by (1.5).
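The link between cumulants and traces can be checked numerically in the discrete-time case. The sketch below is an illustration under assumptions, not part of the original argument: it uses the standard formula for cumulants of a Gaussian quadratic form, $\chi_k(Q_T) = 2^{k-1}(k-1)!\,\mathrm{tr}[A_T(f)A_T(g)]^k$, and verifies the case k = 2 against a brute-force evaluation based on the Gaussian moment identity $\mathrm{Cov}(X_jX_k, X_lX_m) = r_{jl}r_{km} + r_{jm}r_{kl}$. The covariance $r(n) = 0.5^{|n|}$, the weights $\hat g(n) = 0.4^{|n|}$ and all function names are hypothetical.

```python
import math


def toeplitz(coeff, T):
    """T x T Toeplitz matrix with entries coeff(j - k)."""
    return [[coeff(j - k) for k in range(T)] for j in range(T)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def cumulant(k, A_f, A_g):
    """k-th cumulant of Q_T: 2^(k-1) (k-1)! tr[(A_T(f) A_T(g))^k]."""
    P = matmul(A_f, A_g)
    power = P
    for _ in range(k - 1):
        power = matmul(power, P)
    return 2 ** (k - 1) * math.factorial(k - 1) * trace(power)


def variance_by_moments(A_f, A_g):
    """Var(Q_T) computed term by term from the Gaussian fourth-moment identity."""
    idx = range(len(A_f))
    return sum(
        A_g[j][k] * A_g[l][m] * (A_f[j][l] * A_f[k][m] + A_f[j][m] * A_f[k][l])
        for j in idx for k in idx for l in idx for m in idx
    )


T = 5
A_f = toeplitz(lambda n: 0.5 ** abs(n), T)  # covariance matrix (hypothetical r)
A_g = toeplitz(lambda n: 0.4 ** abs(n), T)  # weight matrix (hypothetical g-hat)
mean_Q = cumulant(1, A_f, A_g)              # E[Q_T] = tr[A_T(f) A_T(g)]
var_Q = cumulant(2, A_f, A_g)               # Var(Q_T) = 2 tr[(A_T(f) A_T(g))^2]

if __name__ == "__main__":
    print(mean_Q, var_Q, variance_by_moments(A_f, A_g))
```

The two variance computations agree up to rounding, which is the k = 2 instance of the trace representation; higher cumulants reduce to traces of higher powers in the same way.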

The Trace Problem for Toeplitz Matrices
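Let f(λ) be an integrable real symmetric function defined on 𝕋 = (−π, π]. For T = 1, 2, . . . denote by $B_T(f)$ the (T × T) Toeplitz matrix generated by the function f, that is,
$$B_T(f) := \big\|\hat f(t-s)\big\|_{t,s=1,\ldots,T}, \tag{2.1}$$
where
$$\hat f(t) = \int_{-\pi}^{\pi} e^{i\lambda t} f(\lambda)\, d\lambda, \qquad t \in Z, \tag{2.2}$$
are the Fourier coefficients of f. Observe that
$$\frac{1}{T}\,\mathrm{tr}\,[B_T(f)] = \hat f(0) = \int_{-\pi}^{\pi} f(\lambda)\, d\lambda. \tag{2.3}$$
What happens when the matrix $B_T(f)$ is replaced by a product of Toeplitz matrices? Observe that the product of Toeplitz matrices is, in general, not a Toeplitz matrix. The idea is to approximate the trace of the product of Toeplitz matrices by the trace of a Toeplitz matrix generated by the product of the generating functions. More precisely, let $H = \{h_1, h_2, \ldots, h_m\}$ be a collection of integrable real symmetric functions defined on 𝕋. Define
$$S_{B,H}(T) := \frac{1}{T}\,\mathrm{tr}\Big[\prod_{i=1}^m B_T(h_i)\Big], \tag{2.4}$$
$$M_{\mathbb{T},H} := (2\pi)^{m-1}\int_{-\pi}^{\pi}\Big[\prod_{i=1}^m h_i(\lambda)\Big] d\lambda,$$
and let
$$\Delta_{B,\mathbb{T},H}(T) := \big|S_{B,H}(T) - M_{\mathbb{T},H}\big|.$$
Observe that by (2.3), $M_{\mathbb{T},H} = (2\pi)^{m-1}\,\frac{1}{T}\,\mathrm{tr}\big[B_T\big(\prod_{i=1}^m h_i\big)\big]$. How well is $S_{B,H}(T)$ approximated by $M_{\mathbb{T},H}$? What is the rate of convergence to zero of the approximation error $\Delta_{B,\mathbb{T},H}(T)$ as T → ∞? These are Problems (A) and (B).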
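The trace approximation for m = 2 can be explored numerically with a minimal sketch. Everything named here is hypothetical: the two generating functions are chosen through their Fourier coefficients $\hat f(n) = 0.5^{|n|}$ and $\hat g(n) = 0.4^{|n|}$ (AR(1)-type spectral densities), and the convention $\hat f(n) = \int_{-\pi}^{\pi} e^{i\lambda n} f(\lambda)\,d\lambda$ is assumed, under which Parseval's identity gives $2\pi\int_{-\pi}^{\pi} f g\, d\lambda = \sum_{n \in Z} \hat f(n)\hat g(n)$. With $x = 0.5 \cdot 0.4$ this limit equals $(1 + x)/(1 - x)$.

```python
PHI, PSI = 0.5, 0.4  # hypothetical decay rates of the Fourier coefficients


def f_hat(n):
    return PHI ** abs(n)


def g_hat(n):
    return PSI ** abs(n)


def toeplitz(coeff, T):
    """T x T Toeplitz matrix B_T with entries coeff(j - k)."""
    return [[coeff(j - k) for k in range(T)] for j in range(T)]


def normalized_trace_of_product(A, B):
    """(1/T) tr[A B], computed without forming the full product."""
    T = len(A)
    return sum(A[j][k] * B[k][j] for j in range(T) for k in range(T)) / T


def limit_value():
    """2 pi * integral of f g, evaluated via Parseval as the sum of x^|n|."""
    x = PHI * PSI
    return (1.0 + x) / (1.0 - x)


def approximation_error(T):
    """|S_T - M| for the product of the two Toeplitz matrices."""
    S = normalized_trace_of_product(toeplitz(f_hat, T), toeplitz(g_hat, T))
    return abs(S - limit_value())


if __name__ == "__main__":
    for T in (10, 20, 40, 80):
        err = approximation_error(T)
        print(T, err, T * err)
```

For these smooth generating functions the product T times the error stabilizes, so the error is of order 1/T; this is the kind of rate statement that Problem (B) asks for, here in a case without singularities.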

Problem (A) for Toeplitz Matrices
Recall that Problem (A) involves finding conditions on the functions $h_1(\lambda), \ldots, h_m(\lambda)$ under which the approximation error is o(1) as T → ∞. In Theorem 2.1 and Remark 2.2 we summarize the results concerning Problem (A) for Toeplitz matrices in the case where the exponents $\tau_k$, k = 1, . . ., m (see (1.2)) are all equal to 1, as in (2.4).
Remark 2.4. It would be of interest to extend the results of (A3) and (A4) to arbitrary m > 4.
We now consider the case where the product in (1.2) also involves inverse matrices, that is, $\tau_k = (-1)^k$, k = 1, . . ., m. We assume that m = 2ν. The functions from the collection $H = \{h_1, h_2, \ldots, h_m\}$ that generate the Toeplitz matrices entering the product directly are denoted by $g_i$, i = 1, . . ., ν, while those generating the matrices that enter through their inverses are denoted by $f_i$, i = 1, . . ., ν. The following theorem was proved by Dahlhaus (see [15], Theorem 5.1).
It is easy to see that the conditions of (B5) are satisfied, and hence we have (2.14) with γ as in (2.15).
The next results (cf. Ginovyan [25]) show that for the special case m = 2 the rates in Theorem 2.3 (B4) and (B5) can be substantially improved.

The Trace Problem for Toeplitz Operators
In this section we consider Problems (A) and (B) for Toeplitz operators, that is, in the case where the generating functions are defined on the real line. Again, Problem (A) involves o(1) approximation and Problem (B) involves $O(T^{-\gamma})$ approximation with γ > 0. The theorems in this section are proved in Section 5.
Let f(λ) be an integrable real symmetric function defined on R. The analogue of the Fourier coefficients $\hat f(k)$ in (2.2) is the Fourier transform $\hat f(t)$ of f(λ):
$$\hat f(t) = \int_{R} e^{i\lambda t} f(\lambda)\, d\lambda, \qquad t \in R. \tag{3.1}$$
The $\hat f$ in (3.1) will play the role of the kernel in an integral operator. Given T > 0 and an integrable real symmetric function f(λ) defined on R, the T-truncated Toeplitz operator generated by f(λ), denoted by $W_T(f)$, is defined by the following equation (see, e.g., [30,37,42]):
$$[W_T(f)u](t) = \int_0^T \hat f(t-s)\,u(s)\, ds, \qquad u(s) \in L^2[0, T], \tag{3.2}$$
where $\hat f$ is as in (3.1). It follows from (3.1), (3.2) and the formula for traces of integral operators (see, e.g., [35]) that
$$\frac{1}{T}\,\mathrm{tr}\,[W_T(f)] = \hat f(0) = \int_{R} f(\lambda)\, d\lambda.$$
We pose the same question as in the case of Toeplitz matrices: what happens when the single operator $W_T(f)$ is replaced by a product of such operators? Observe that the product of Toeplitz operators is again, in general, not a Toeplitz operator. The approach is similar to that for Toeplitz matrices: we approximate the trace of the product of Toeplitz operators by the trace of a Toeplitz operator generated by the product of the generating functions. More precisely, let $H = \{h_1, h_2, \ldots, h_m\}$ be a collection of integrable real symmetric functions defined on R. Define
$$S_{W,H}(T) := \frac{1}{T}\,\mathrm{tr}\Big[\prod_{i=1}^m W_T(h_i)\Big], \qquad M_{R,H} := (2\pi)^{m-1}\int_{R}\Big[\prod_{i=1}^m h_i(\lambda)\Big] d\lambda,$$
and let
$$\Delta_{W,R,H}(T) := \big|S_{W,H}(T) - M_{R,H}\big|.$$
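A continuous-time analogue of the matrix computation can be sketched for m = 2. The example rests on assumptions that are not part of the text: the kernels are taken as $\hat f(t) = e^{-a|t|}$ and $\hat g(t) = e^{-b|t|}$ (Fourier transforms of Cauchy-type spectral densities, as for Ornstein-Uhlenbeck processes), and the trace of the product of the two integral operators is written as $\int_0^T\!\int_0^T \hat f(t-s)\hat g(s-t)\,ds\,dt$, which reduces to $\int_{-T}^{T} (T - |u|)\,\hat f(u)\hat g(u)\,du$. The limit is $\int_R \hat f\hat g\,du = 2/(a+b)$, which by Parseval's identity equals $2\pi\int_R f g\,d\lambda$. Function names are hypothetical.

```python
import math


def normalized_trace_of_product(T, a, b, n_steps=100_000):
    """(1/T) tr[W_T(f) W_T(g)] for kernels exp(-a|t|) and exp(-b|t|).

    Uses the reduction to 2 * integral over (0, T) of (1 - u/T) exp(-(a+b) u) du,
    evaluated with the midpoint rule.
    """
    h = T / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h
        total += (1.0 - u / T) * math.exp(-(a + b) * u)
    return 2.0 * h * total


def limit_value(a, b):
    """Integral over R of exp(-(a+b)|u|) du."""
    return 2.0 / (a + b)


def approximation_error(T, a, b):
    return abs(normalized_trace_of_product(T, a, b) - limit_value(a, b))


if __name__ == "__main__":
    a, b = 1.0, 0.5
    for T in (5.0, 10.0, 20.0, 40.0):
        err = approximation_error(T, a, b)
        print(T, err, T * err)
```

In this example the error can also be computed in closed form, $2(1 - e^{-(a+b)T})/(T(a+b)^2)$, so T times the error converges to $2/(a+b)^2$, an $O(T^{-1})$ rate of the type considered in Problem (B).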

Problem (A) for Toeplitz Operators
In Theorem 3.1 and Remark 3.1 we summarize the results concerning Problem (A) for Toeplitz operators in the case where $\tau_k = 1$, k = 1, . . ., m.

Applications to Stationary Processes
In this section we provide some applications of the trace problem to discrete- and continuous-time stationary processes: ARFIMA and fractional Riesz-Bessel motions; central and non-central limit theorems, Berry-Esséen bounds, and large deviations for Toeplitz quadratic forms and functionals.

Applications to ARFIMA time series and fractional Riesz-Bessel motions
In this subsection we apply the results of Sections 2 and 3 to the important special cases where the generating functions are spectral densities of discrete-time ARFIMA(0, d, 0) stationary processes or of continuous-time stationary fractional Riesz-Bessel motions. We use the following notation: m = 2ν.

Applications to ARFIMA time series
The next theorem gives an error bound for $\Delta_{2,B}(T)$ in the case where the corresponding Toeplitz matrices are generated by spectral densities of two discrete-time ARFIMA(0, d, 0) stationary processes.
Theorem 4.1. Let $f_i(\lambda)$, i = 1, 2, be the spectral density functions of two ARFIMA(0, d, 0) stationary processes defined as in (4.3).
It is clear that the conditions of Theorem 2.3 (B5) are satisfied with $\alpha_i = 2d_i$ and $C_{1i} = C_{2i} = \sigma_i^2$, i = 1, 2, and the result follows.
The next theorem, which was proved in Lieberman and Phillips [48], gives an explicit second-order asymptotic expansion for $S_{1,B}(T)$ in the case where the Toeplitz matrices are generated by the spectral densities given by (4.3), and shows that in this special case a second-order asymptotic expansion successfully removes the singularity and delivers a substantially improved approximation.
Theorem 4.2. Let $f_i(\lambda)$, i = 1, 2, be the spectral density functions of two ARFIMA(0, d, 0) stationary processes defined by (4.3) with $0 < \sigma_i^2 < \infty$ and 0 as T → ∞, where
Remark 4.1. The asymptotic relation (4.7) in Lieberman and Phillips [48] was established by direct calculations using the explicit forms of the functions $f_i$ given by (4.3). On the other hand, as follows from (4.6), the functions $f_i$ (i = 1, 2) satisfy the conditions of Theorem 2.5 with $\alpha_i = 2d_i$ (i = 1, 2), and hence (4.7) is a special case of Theorem 2.5.

Applications to fractional Riesz-Bessel motions
Now we assume that the underlying model is a continuous-time stationary process specified by a fractional Riesz-Bessel motion.The following result is an immediate consequence of Theorem 3.1 (A1).
Proof. The result follows from Theorem 3.2 (B3) and the following lemma, which is proved in the Appendix.
The next theorem, which is a continuous version of Theorem 4.2, contains an explicit second-order asymptotic expansion for S 1,W (T ) in the case where the Toeplitz operators are generated by the spectral densities given by (4.12), and shows that in this special case a second-order asymptotic expansion successfully removes the singularity and delivers a substantially improved approximation.
Theorem 4.6. Let $f_i(\lambda)$, i = 1, 2, be as in (4.12). Then under as T → ∞, where
The proof is based on the following lemma, which contains an asymptotic formula for the covariance function of an fRBm process. It is proved in the Appendix.

Limit Theorems for Toeplitz Quadratic Functionals
In this section we examine the limit behavior of quadratic forms and functionals of discrete- and continuous-time stationary Gaussian processes with possibly long-range dependence. The matrix and the operator that characterize the quadratic form and functional are Toeplitz.
Let {X(u), u ∈ U} be a centered real-valued Gaussian stationary process with spectral density f(λ), λ ∈ Λ, and covariance function $r(t) := \hat f(t)$, t ∈ U, where U and Λ are as in Section 1.2. We are interested in the asymptotic distribution (as T → ∞) of the following Toeplitz type quadratic functionals of the process X(u):
$$Q_T := \begin{cases} \displaystyle\int_0^T\!\!\int_0^T \hat g(t-s)\,X(t)X(s)\,dt\,ds & \text{in the continuous-time case,}\\[4pt] \displaystyle\sum_{k=1}^T\sum_{j=1}^T \hat g(k-j)\,X(k)X(j) & \text{in the discrete-time case,} \end{cases} \tag{4.29}$$
where $\hat g(t) = \int_{\Lambda} e^{i\lambda t} g(\lambda)\, d\lambda$, t ∈ U, is the Fourier transform of some real, even, integrable function g(λ), λ ∈ Λ. We will refer to g(λ) as a generating function for the functional $Q_T$. In the discrete-time case the functions f(λ) and g(λ) are assumed to be 2π-periodic and periodically extended to R.
The limit distributions of the functionals (4.29) are completely determined by the spectral density f(λ) and the generating function g(λ), and depending on their properties the limit distributions can be either Gaussian (i.e., $Q_T$ with an appropriate normalization obeys a central limit theorem) or non-Gaussian. The following two questions arise naturally: a) Under what conditions on f(λ) and g(λ) will the limits be Gaussian? b) Describe the limit distributions, if they are non-Gaussian.

Central limit theorems for Toeplitz quadratic functionals
We first discuss question a), that is, finding conditions on the spectral density f(λ) and the generating function g(λ) under which the functional $Q_T$, defined by (4.29), obeys a central limit theorem. This question goes back to the classical monograph by Grenander and Szegö [37], where the problem was considered for discrete-time processes, as an application of the authors' theory of the asymptotic behavior of the trace of products of truncated Toeplitz matrices.
Let $Q_T$ be as in (4.29). By $\widetilde Q_T$ we denote the standard normalized quadratic functional (4.31). Convergence of $\widetilde Q_T$ to $N(0, \sigma^2)$ will mean that the distribution of the random variable $\widetilde Q_T$ tends (as T → ∞) to the centered normal distribution with variance $\sigma^2$.
Our study of the asymptotic distribution of the quadratic functionals (4.29) is based on a representation of the k-th order cumulant $\chi_k(\cdot)$ of $\widetilde Q_T$ in terms of traces, which follows from (1.15) (see also [37,42]). In this representation, $A_T(f)$ and $A_T(g)$ denote either the T-truncated Toeplitz operators (in the continuous-time case) or the T × T Toeplitz matrices (in the discrete-time case) generated by the functions f and g, respectively, and tr[A] stands for the trace of an operator A.
The next result contains sufficient conditions in terms of f(λ) and g(λ) ensuring central limit theorems for the standard normalized quadratic functionals $\widetilde Q_T$, both for discrete- and continuous-time processes.
Below we assume that $f, g \in L^1(\Lambda)$, and with no loss of generality, that g ≥ 0. Also, we set with $\sigma_0^2$ given by (4.34).
Proposition 4.1. There exist a spectral density f(λ) and a generating function g(λ) such that and lim that is, the condition (4.39) does not guarantee convergence in (4.36).
To construct functions f(λ) and g(λ) satisfying (4.39) and (4.40), for a fixed p ≥ 2 we choose a number q > 1 to satisfy 1/p + 1/q > 1, and for such p and q we consider the functions $f_0(\lambda)$ and $g_0(\lambda)$ defined by where m is a positive integer. For an arbitrary finite positive constant C we set $g_\pm(\lambda) = g_0(\lambda) \pm C$. Then the functions $f = f_0$ and $g = g_+$ or $g = g_-$ satisfy (4.39) and (4.40) (for details we refer to Ginovyan and Sahakyan [29]). Consequently, for these functions the standard normalized quadratic form $\widetilde Q_T$ does not obey the CLT, and it is of interest to describe the limiting non-Gaussian distribution of $\widetilde Q_T$ in this special case.

Non-central Limit Theorems
Problem b) for discrete-time processes, that is, the description of the limit distribution of the quadratic form when it is non-Gaussian, goes back to the papers by Rosenblatt [53]-[55].
Later this problem was studied in a series of papers by Taqqu, and by Terrin and Taqqu (see, e.g., [61], [64], [65], [66], and references therein). Specifically, suppose that the spectral density f(λ) and the generating function g(λ) are regularly varying functions at the origin, as specified in (4.44), where $L_1(\lambda)$ and $L_2(\lambda)$ are slowly varying functions at zero, which are bounded on bounded intervals. The conditions α < 1 and β < 1 ensure that the Fourier coefficients of f and g are well defined. When α > 0 the model {X(t), t ∈ Z} exhibits long memory. It is the sum α + β that determines the asymptotic behavior of the quadratic form $Q_T$. If α + β ≤ 1/2, then by Theorem 4.7(E) the standard normalized quadratic form converges in distribution to a Gaussian random variable. If α + β > 1/2, convergence to a Gaussian limit fails.
Consider the embedding of the discrete sequence $\{Q_T, T \in N\}$ into a continuous-time process $\{Q_T(t), T \in N, t \in R\}$ defined by where [·] stands for the greatest integer. Denote by Z(·) the complex-valued Gaussian random measure defined on the Borel σ-algebra B(R), and satisfying
The next result, proved in Terrin and Taqqu [65], describes the non-Gaussian limit distribution of the suitably normalized process $Q_T(t)$.
Remark 4.9. The limiting process in (4.47) is real-valued, non-Gaussian, and satisfies EQ(t) = 0 and is self-similar with parameter H, that is, the processes {Q(at), t ≥ 0} and $\{a^H Q(t), t \ge 0\}$ have the same finite-dimensional distributions for all a > 0.
Remark 4.10. In [53] (see also [55]) Rosenblatt showed that if a discrete-time centered Gaussian process X(t) has covariance function $r(t) = (1 + t^2)^{\alpha/2 - 1/2}$ with 1/2 < α < 1, then the corresponding random variable has a non-Gaussian limiting distribution, and described this distribution in terms of characteristic functions. This is a special case of Theorem 4.8 with t = 1, 1/2 < α < 1 and β = 0. In [61] (see also [64]) Taqqu extended Rosenblatt's result by showing that the corresponding stochastic process converges weakly to a process (called the Rosenblatt process) which has the double Wiener-Itô integral representation (4.49). The distribution of the random variable Q(t) in (4.49) for t = 1 is described in Veillette and Taqqu [67].
Remark 4.11. The slowly varying functions $L_1$ and $L_2$ in (4.44) are of importance because they provide great flexibility in the choice of the spectral density f and the generating function g. Observe that in Theorem 4.8 the functions $L_1$ and $L_2$ influence only the normalization (see (4.46)), but not the limit Q(t). Theorem 4.7(E) shows that in the critical case α + β = 1/2 the limit distribution of the standard normalized quadratic form $\widetilde Q_T$ is Gaussian and depends essentially on the slowly varying factors $L_1$ and $L_2$. Note also that the critical case α + β = 1/2 was partially investigated by Terrin and Taqqu in [66]. Starting from the limiting random variable Q(1) = Q(1; α, β), which exists only when α + β > 1/2, they showed that the suitably normalized random variable converges in distribution to a Gaussian random variable as α + β approaches 1/2.
Remark 4.12. For continuous-time processes problem b) has not been investigated, and it would be of interest to describe the limiting non-Gaussian distribution of the quadratic functional $Q_T$.

Berry-Esséen Bounds and Large Deviations for Toeplitz Quadratic Functionals
In this section, we briefly discuss Berry-Esséen bounds in the CLT and the large deviations principle for quadratic functionals, both for continuous- and discrete-time Gaussian stationary processes (for more about these topics we refer to [6,12,14,19,44,49,59], and references therein).
Berry-Esséen Bounds. Let $Q_T$ and $\widetilde Q_T$ be as in (4.29) and (4.31), respectively.
Denote $\widehat Q_T := \widetilde Q_T / \sqrt{\mathrm{Var}(\widetilde Q_T)}$, and let Z be the standard normal random variable: Z ∼ N(0, 1). The CLT for $\widetilde Q_T$ (Theorem 4.7) tells us that $\widehat Q_T \to Z$ in distribution as T → ∞. The natural next step concerns the closeness between the distribution of $\widehat Q_T$ and the standard normal distribution, which means asking for the rate of convergence in the CLT. Results of this type are known as Berry-Esséen bounds (or asymptotics).
In the discrete-time case, for special quadratic functionals, Berry-Esséen bounds were established in Taniguchi [59], while in the continuous-time case, Berry-Esséen-type bounds were obtained in Nourdin and Peccati [49]. The next theorem captures both cases.
Theorem 4.9. Let $\widetilde Q_T$ be as in (4.31), The following assertions hold.
1. If 1/p + 1/q ≤ 1/4, then there exists a constant C = C(f, g) > 0 such that for all T > 0 we have
2. If 1/p + 1/q ≤ 1/8 and $\int_{\Lambda} f^3(\lambda) g^3(\lambda)\, d\lambda \ne 0$, then there exist a constant c = c(f, g) > 0 and a number $T_0 = T_0(f, g) > 0$ such that $T > T_0$ implies More precisely, for any z ∈ R, we have as
Remark 4.13. In the continuous-time case, Theorem 4.9 was proved in Nourdin and Peccati [49] by appealing to a general CLT of Section 4.2 (Theorem 4.7) and Stein's method. The proof in the discrete-time case is similar to that of the continuous-time case.
Large Deviations. We now present sufficient conditions that ensure a large deviations principle (LDP) for Toeplitz type quadratic functionals of stationary Gaussian processes. For more about LDP we refer to Bryc and Dembo [12], Bercu et al. [6], Sato et al. [56], Taniguchi and Kakizawa [60], and references therein.
First observe that large deviation theory can be viewed as an extension of the law of large numbers (LLN). The LLN states that certain probabilities converge to zero, while large deviation theory focuses on the rate of convergence. Specifically, consider a sequence of random variables $\{\xi_n, n \ge 1\}$ converging in probability to a real constant m. Note that $\xi_n$ could represent, for instance, the n-th partial sum of another sequence of random variables: $\xi_n = \frac{1}{n}\sum_{k=1}^n \eta_k$, where the sequence $\{\eta_k\}$ may be independent identically distributed, or dependent as in an observed stretch of a stochastic process. By the LLN, we have for ε > 0
$$P\{|\xi_n - m| > \varepsilon\} \to 0 \quad \text{as } n \to \infty. \tag{4.53}$$
It is often the case that the convergence in (4.53) is exponentially fast, that is,
$$P\{|\xi_n - m| > \varepsilon\} = R(\varepsilon, m, n)\,\exp\{-n\,I(\varepsilon, m)\}, \tag{4.54}$$
where R(·) = R(ε, m, n) is a slowly varying (relative to an exponential) function of n and I(ε, m) is a positive quantity. Loosely, if (4.54) holds, we say that the sequence $\{\xi_n\}$ satisfies a large deviations principle. One of the basic problems of large deviation theory is to determine I(ε, m) and R(ε, m, n). To be more precise, we recall the definition of the Large Deviation Principle (LDP) (see, for instance, [12,60]).
Now let $Q_T$ be the Toeplitz type quadratic functional of a process X(u) defined by (4.29) with spectral density f(λ) and generating function g(λ).
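The exponential decay in (4.54) can be made concrete in the simplest case, which is used here only as an illustration and is not taken from the text: $\xi_n$ is the mean of n independent N(0, 1) variables, so m = 0 and $\xi_n \sim N(0, 1/n)$. The tail probability is then available exactly through the complementary error function, and $-\frac{1}{n}\log P\{|\xi_n| > \varepsilon\}$ approaches the rate $I(\varepsilon) = \varepsilon^2/2$. Function names are hypothetical.

```python
import math


def tail_probability(n, eps):
    """P{|xi_n| > eps} for xi_n ~ N(0, 1/n), i.e. P{|Z| > eps * sqrt(n)}."""
    return math.erfc(eps * math.sqrt(n / 2.0))


def empirical_rate(n, eps):
    """-(1/n) log P{|xi_n| > eps}, which should approach eps^2 / 2."""
    return -math.log(tail_probability(n, eps)) / n


if __name__ == "__main__":
    eps = 0.5
    for n in (100, 400, 1600):
        print(n, tail_probability(n, eps), empirical_rate(n, eps), eps ** 2 / 2)
```

The gap between the empirical rate and $\varepsilon^2/2$ shrinks like (log n)/n, which reflects the sub-exponential factor R(ε, m, n) in (4.54).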
The next result states sufficient conditions in terms of f(λ) and g(λ) to ensure that the LDP for the normalized quadratic functionals $\{\frac{1}{T} Q_T\}$ holds both for discrete- and continuous-time processes.

Proof of Theorems 3.1-3.4
We only prove the results concerning Toeplitz operators (Theorems 3.1-3.4). The proofs of the corresponding results for Toeplitz matrices are similar. First we state a number of technical lemmas, which are proved in the Appendix.
For m = 3, 4, . . . and δ > 0 we denote
d) for any δ > 0 there exists a constant $C_\delta > 0$ such that
The proof of the next lemma can be found in [33], p. 161.
To prove the continuity of ϕ(u) at the point 0 we consider three cases.
Case 1. $p_i < \infty$, i = 1, . . ., m. For an arbitrary ε > 0 we can find δ > 0 satisfying (see (5.13)) (5.14). We fix $u = (u_1, \ldots, u_{m-1})$ with |u| < δ and denote Then in view of (5.12) we have It follows from (5.14) that $\|h_i\|_{p_i} \le \varepsilon$, i = 2, . . ., m. Observe that each of the integrals comprising W contains at least one function $h_i$ and can be estimated as follows: Hence according to (5.13) we have $h_i \in L^{p_i'}$, i = 1, . . ., m, and ϕ is continuous at 0 as in Case 1.
Case 3. $p_i \le \infty$, i = 1, . . ., m, $\sum_{i=1}^m \frac{1}{p_i} = 1$. First observe that at least one of the numbers $p_i$ is finite. Suppose, without loss of generality, that $p_1 < \infty$. For any ε > 0 we can find functions h where the functions ϕ′ and ϕ′′ are defined as ϕ in (5.12) with $h_1$ replaced by h′
Proof of Theorem 3.2. We start with (B1). First observe that the condition for some constant A > 0. By Lemma 5.2 we have Making the change of variables and observing that t we get (below we use the notation: It follows from (3.4), (5.17) and (5.19) that Hence by (5.16) and (5.20) we have
Proof of (B2). By Lemma 5.2, 3) and Lemma 5.3, b), and (3.5) we have It follows from (3.11) and (5.5) that for u Let ε ∈ (0, γ). Then, applying Lemma 5.1 with $\delta = \frac{1+\varepsilon-\gamma}{m}$, and using (5.23) and (5.24), we can write we can apply Lemmas 5.
Proof of (B3). According to (B2) it is enough to prove that the function where , and with some positive constant C provided that To prove (5.27) we fix $u = (u_1, \ldots, u_{m-1}) \in R^{m-1}$ and denote Since $h_i \in \mathrm{Lip}(R; p_i, \gamma)$ we have (5.30). By (5.26) and (5.29), Each of the $(2^{m-1} - 1)$ integrals comprising W contains at least one function $f_i$, and in view of (5.30), can be estimated as follows: This completes the proof of (B3).

Appendix. Proof of Technical Lemmas
In this section we give proofs of technical lemmas stated and used in Sections 4 and 5.
Proof of Lemma 5.3. The proof of properties a)-c) can be found in [5], Lemma 3.2 (see also [30], Lemma 2). To prove d) first observe that for T > 0 It is enough to estimate $I_1$ ($I_2, \ldots, I_n$ can be estimated in the same way). We have From (6.9)-(6.13) we obtain (5.6). Lemma 5.3 is proved.
Proof of Lemma 5.5. Using Lemma 5.4 and the notation $du = du_n\, du_{n-1} \cdots du_1$, we can write
Proof of Lemma 5.6. We have (6.14). Using Lemma 5.4 we get The quantities $I_2, \ldots, I_n$ can be estimated in the same way, and by (6.14) the result follows.

Definition 4.1.
Let $\{\xi_n, n \in Z\}$ be a sequence of real-valued random variables defined on the probability space (Ω, F, P). We say that $\{\xi_n\}$ satisfies a Large Deviation Principle (LDP) with speed $a_n \to 0$ and rate function I : R → [0, ∞] if I(x) is lower semicontinuous, that is, if $x_n \to x$ then $\liminf_{n\to\infty} I(x_n) \ge I(x)$, and
$$\liminf_{n\to\infty} a_n \log P\{\xi_n \in A\} \ge -\inf_{x \in A} I(x)$$
for all open subsets A ⊂ R, while
$$\limsup_{n\to\infty} a_n \log P\{\xi_n \in B\} \le -\inf_{x \in B} I(x)$$
for all closed subsets B ⊂ R. The function I(x) is called a good rate function if its level sets are compact, that is, the set {x ∈ R : I(x) ≤ b} is compact for each b ∈ R.