Asymptotic analysis of ML-covariance parameter estimators based on covariance approximations

Given a zero-mean Gaussian random field with a covariance function that belongs to a parametric family of covariance functions, we introduce a new notion of likelihood approximations, termed truncated-likelihood functions. Truncated-likelihood functions are based on direct functional approximations of the presumed family of covariance functions. For compactly supported covariance functions, within an increasing-domain asymptotic framework, we provide sufficient conditions under which consistency and asymptotic normality of estimators based on truncated-likelihood functions are preserved. We apply our result to the family of generalized Wendland covariance functions and discuss several examples of Wendland approximations. For families of covariance functions that are not compactly supported, we combine our results with the covariance tapering approach and show that ML estimators, based on truncated-tapered likelihood functions, asymptotically minimize the Kullback-Leibler divergence, when the taper range is fixed.


On infill- and increasing-domain asymptotics
Maximum likelihood (ML) estimators for covariance parameters are highly popular in inference for random fields. Aiming towards asymptotic properties of such estimators, one needs to specify how the observation points and the associated sampling domain behave as the number of observation points increases. Two well-studied asymptotic frameworks are referred to as infill-domain asymptotics (also termed fixed-domain asymptotics) and increasing-domain asymptotics (see [13], p. 100, for an introduction of terms). In infill-domain asymptotics, observation points are sampled within a bounded sampling domain, whereas in increasing-domain asymptotics, the sampling domain grows as the number of observation points increases. When referring to infill- and increasing-domain asymptotics, one often places additional assumptions on the minimum distance between any two distinct observation points. In increasing-domain asymptotics, the latter distance is often assumed to be bounded away from zero, while in infill-domain asymptotics, one frequently assumes that distinct observation points can be sampled arbitrarily close to each other (see for example [37]). There is a fair amount of literature demonstrating that asymptotic properties of ML estimators for covariance parameters can be quite different under the two mentioned asymptotic frameworks (see [37] or, more recently, [6]). For example, it is known that some covariance parameters cannot be estimated consistently under an infill-domain asymptotic framework ([34], [36]), whereas they can be estimated consistently, under given regularity conditions, within an increasing-domain asymptotic framework ([25], [4]). It is worth noting that in infill-domain asymptotics, these results can depend on the dimension d of the Euclidean space R^d in which the random field is assumed to be observed. For example, when the true covariance function belongs to the Matérn family ([26]) and smoothness parameters are given, it is shown in [36] that for d = 1, 2, 3, the scale and variance parameters cannot be estimated consistently via an ML approach in an infill-domain asymptotic framework. The case d = 4 is still open, but for d ≥ 5, it is shown in [2] that under infill-domain asymptotics, all covariance parameters of the Matérn family can be estimated consistently using an ML approach.

Compactly supported covariance functions
In recent years, dataset sizes have steadily increased, such that statistical analyses of random fields can become quite expensive in terms of computational resources (see for example [15] for a recent discussion). One prominent issue with large datasets is the large size of covariance matrices, constructed upon applying an underlying covariance function to given data. However, in certain fields of application, observed correlations are assumed to vanish beyond a certain cut-off distance (see [18], pp. 750–751, and references therein, for an example in meteorology, or also [10] and [19]). On the other hand, in the context of real valued random fields, it is common practice to multiply a presumed covariance function with a known positive-definite and compactly supported covariance function, called the covariance taper. The resulting compactly supported covariance function is referred to as the tapered covariance function. For an introduction to covariance tapering we refer to [17]. The use of compactly supported covariance functions can thus be of great importance for some fields of application. Not only do they potentially reflect the nature of the underlying covariance structure, but their application can also lead to sparse covariance matrices. The latter are helpful in view of the high computational costs in the context of large datasets. An excellent introduction to the construction of compactly supported covariance functions, associated to stationary and isotropic Gaussian random fields, is given in [21]. Additional results are available in [35], [28] and [11].
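To illustrate the sparsity gained from compact support, the following sketch (not taken from the article; the exponential covariance model, the taper range and the point configuration are illustrative choices) tapers a covariance matrix with a classical Wendland-type taper:

```python
# Sketch: tapering an exponential covariance with a compactly supported
# Wendland-type taper to obtain a sparse covariance matrix (illustrative).
import numpy as np
from scipy.spatial.distance import cdist
from scipy import sparse

def exponential_cov(h, sigma2=1.0, beta=1.0):
    return sigma2 * np.exp(-h / beta)

def wendland_taper(h, r=1.0):
    # psi(h) = (1 - h/r)_+^4 (4 h/r + 1): a classical taper, valid for d <= 3
    u = np.clip(1.0 - h / r, 0.0, None)
    return u**4 * (4.0 * h / r + 1.0)

rng = np.random.default_rng(0)
s = rng.uniform(0.0, 10.0, size=(500, 2))      # observation points in R^2
H = cdist(s, s)                                 # pairwise distances
Sigma = exponential_cov(H)                      # dense covariance matrix
Sigma_tap = Sigma * wendland_taper(H, r=1.5)    # tapered covariance (Schur product)

density = sparse.csr_matrix(Sigma_tap).nnz / Sigma_tap.size
print(f"nonzero fraction after tapering: {density:.3f}")
```

Since the tapered matrix is a Schur product of two positive semi-definite matrices, it remains positive semi-definite while most of its entries vanish.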

Motivation
The parametric family of generalized Wendland covariance functions represents one example of a family of compactly supported covariance functions which allows, similar to the Matérn family, for a continuous parametrization of smoothness (in the mean square sense) of the underlying random field. Its origin is due to Wendland ([32]), and an early adaptation for statistical applications was given by Gneiting ([20]). In its general form (see [21] and [28] for special cases), the generalized Wendland covariance function with smoothness parameters ν and κ, variance parameter σ² and range parameter β is given by

ϕ(t) = (σ² / B(2κ, ν + 1)) ∫_{t/β}^{1} u (u² − (t/β)²)^{κ−1} (1 − u)^ν du    (1)

if t ∈ [0, β), and is zero otherwise. In the above display, B is the beta function. For technical details about valid parameter values, we refer to [9] or Section 6 of the present article. Clearly, in comparison with closed-form covariance functions, computing (1) is cumbersome, as it involves numerical integration. Depending on the support β and a set of locations s_1, . . ., s_n ∈ R^d, the n × n covariance matrix Σ_{i,j} = ϕ(∥s_i − s_j∥) requires at most n(n − 1)/2 evaluations of (1). One strategy which facilitates computing Σ is to reduce the number of times (1) must be calculated. As an illustration, we give three examples which involve approximations φ̃_i, i = 1, 2, 3, of ϕ (respectively approximations Σ̃_i of Σ):

(φ̃_1) truncation of the support,
(φ̃_2) linear interpolation,
(φ̃_3) addition of a nugget effect.

For φ̃_1, we truncate ϕ to obtain φ̃_1, which has a smaller support than ϕ. This becomes especially interesting when the original function ϕ tails off slowly (high degree of differentiability at the origin). As a result, Σ̃_1 will be sparser than Σ. Example φ̃_2 predefines the numbers at which (1) is calculated. This is achieved by introducing a partition 0 < t_1 < . . . < t_N = β of the support of ϕ. Then, φ̃_2 results in N evaluations of ϕ and defines a closed-form approximation of ϕ. Notice that t_1, . . ., t_N do not need to be equispaced. Finally, φ̃_3 can be interpreted as a tuning option for a given approximation φ̃_* of ϕ:

φ̃_3(t) := φ̃_*(t) + δ if t = 0, and φ̃_3(t) := φ̃_*(t) if t ≠ 0, with δ ≥ 0.
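As a concrete illustration of (1) and of the three approximations above, here is a sketch in Python; the parameter values (ν = 5, κ = 1.5, σ² = β = 1), the truncation point and the partition size are illustrative assumptions, not choices made in the article:

```python
# Sketch of the generalized Wendland function (1) via numerical integration,
# together with the approximations phi_1 (truncation), phi_2 (linear
# interpolation) and phi_3 (added nugget). Parameter values are illustrative.
import numpy as np
from scipy.integrate import quad
from scipy.special import beta as beta_fn

def gen_wendland(t, nu=5.0, kappa=1.5, sigma2=1.0, beta=1.0):
    """Generalized Wendland covariance as in (1); zero for t >= beta."""
    if t >= beta:
        return 0.0
    r = t / beta
    integrand = lambda u: u * (u**2 - r**2)**(kappa - 1.0) * (1.0 - u)**nu
    val, _ = quad(integrand, r, 1.0)
    return sigma2 * val / beta_fn(2.0 * kappa, nu + 1.0)

beta = 1.0
# (phi_1) truncation: shrink the support to 0.8 * beta (illustrative cut-off)
phi1 = lambda t: gen_wendland(t, beta=beta) if t < 0.8 * beta else 0.0

# (phi_2) linear interpolation on a partition 0 < t_1 < ... < t_N = beta
knots = np.linspace(0.0, beta, 25)
values = np.array([gen_wendland(t, beta=beta) for t in knots])
phi2 = lambda t: np.interp(t, knots, values, right=0.0)

# (phi_3) added nugget delta >= 0 at the origin
delta = 1e-3
phi3 = lambda t: phi2(t) + (delta if t == 0.0 else 0.0)
```

Note that the normalization in (1) gives ϕ(0) = σ², which provides a quick numerical sanity check for the integral.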
With regard to practical usage, this form of approximation increases numerical stability. Further, it allows for more flexibility in practice, where the number of observations n is given and Σ̃_* based on φ̃_* might not be positive-definite. Following up on the above examples, we picture an approximation φ̃ of ϕ (respectively an approximation Σ̃ of Σ). Several questions arise:
• What are conditions on φ̃ that ensure that Σ̃ is asymptotically (as n → ∞) equivalent to Σ and eventually (for n large enough) remains positive-definite?
• In terms of ML estimators for covariance parameters, how shall a log-likelihood approximation based on φ̃ be defined?
• Under which conditions on φ̃ are ML estimators based on φ̃ consistent and asymptotically normal?
In the more general setting of a given parametric family of covariance functions, the present study provides a concrete context in which the latter questions are answered by introducing the notion of truncated-ML estimators.

Framework and contribution
Truncated-ML estimators for covariance parameters are based on truncated-likelihood functions. The latter are defined upon parametric families of sequences of functions which approximate a presumed family of covariance functions on a common domain. Colloquially, we will call these parametric sequences of functions covariance approximations. The respective matrices, constructed upon applying covariance approximations to a given collection of observation points, will be termed covariance matrix approximations. We will allow for covariance matrix approximations that are not necessarily positive semi-definite. Therefore, truncated-likelihood functions are more general than existing likelihood approximation methods such as low-rank, Vecchia, or covariance tapering approaches (see [22] for a summary of commonly used methods).
We work in an increasing-domain asymptotic framework, where collections of observation points are realizations of finite collections of a randomly perturbed regular grid (see also [4]). We consider a stationary Gaussian random field with a zero-mean function and a true unknown covariance function that belongs to a given parametric family of covariance functions. If the presumed family of covariance functions is compactly supported, we provide sufficient conditions under which truncated-ML estimators and (regular) ML estimators for covariance parameters are consistent and asymptotically normal. Some conditions imposed on families of covariance functions are identical to conditions already considered in [4]. The main difference is that we work with compactly supported covariance functions; therefore, it is possible to simplify some of the conditions set up in [4]. As for statistical applications, we apply these results to the family of generalized Wendland covariance functions. In contrast to the infill-domain asymptotic framework considered in [9], we show that under the studied increasing-domain asymptotic framework, and under some conditions on the parameter space, (regular) ML estimators for variance and range parameters are consistent and asymptotically normal. Further, we show that the same asymptotic results are recovered for truncated-ML estimators based on various generalized Wendland approximations, such as truncations, linear interpolations and added nugget effects.
Additionally, we provide an extension to families of covariance functions which are not compactly supported. We combine our results with the covariance tapering approach. That is, we study covariance taper approximations and their asymptotic influence on the conditional Kullback-Leibler divergence of the misspecified distribution from the true distribution (see also [5]). We show that the latter divergence is minimized by truncated-tapered ML estimators.
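For zero-mean Gaussian vectors, the Kullback-Leibler divergence appearing in this discussion has a well-known closed form; the following sketch (with illustrative, hypothetical covariance matrices) computes it:

```python
# Sketch: Kullback-Leibler divergence KL(N(0, Sigma0) || N(0, Sigma1)) between
# a true and a misspecified (e.g. tapered) zero-mean Gaussian distribution.
import numpy as np

def kl_zero_mean_gaussians(Sigma0, Sigma1):
    n = Sigma0.shape[0]
    inv1_0 = np.linalg.solve(Sigma1, Sigma0)      # Sigma1^{-1} Sigma0
    _, logdet0 = np.linalg.slogdet(Sigma0)
    _, logdet1 = np.linalg.slogdet(Sigma1)
    return 0.5 * (np.trace(inv1_0) - n + logdet1 - logdet0)

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
Sigma0 = A @ A.T + 4.0 * np.eye(4)                # illustrative covariance matrix
print(kl_zero_mean_gaussians(Sigma0, Sigma0))     # KL(P || P) vanishes
```

The divergence is nonnegative and vanishes only when the two covariance matrices coincide, which is why minimizing it is a natural optimality criterion for estimators under misspecification.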

Structure of the article
The rest of the article is organized as follows. Section 2 establishes the context: we introduce some primary notation and define the sampling domain and the random field itself. In Section 3 we introduce regularity conditions on covariance functions and approximations. In Section 4 we present intermediate asymptotic results on covariance matrices and approximations. Section 5 contains our main results: we introduce truncated-ML estimators and present results on consistency and asymptotic normality. In Section 6, we apply our results to the family of generalized Wendland covariance functions and discuss several examples of generalized Wendland approximations. Then, in the context of non-compactly supported covariance functions, Section 7 contains results on the asymptotic influence of taper approximations on the Kullback-Leibler divergence. Section 8 gives an outlook and some final comments. The Appendix is split into three parts: covariance approximations for isotropic random fields are discussed in Appendix A, Appendix B contains additional supporting results, and all the proofs are left for Appendix C.

Primary notation
The sets N_+ and R_+ shall represent the set of positive integers and the set of non-negative real numbers, respectively. For d ∈ N_+, we use the notation B(x; r) (respectively B[x; r]) for the open (closed) ball of radius r > 0 with center x ∈ R^d. Given n ∈ N_+, for a set A ⊂ R^n, we write B(A) for the Borel σ-algebra on A. For a real n × n matrix A, ∥A∥_2 = max_{z : zᵗz = 1} ⟨z, AᵗAz⟩^{1/2} denotes the spectral norm of A. We write S^{n×n}(R) for the space of real symmetric n × n matrices.
We use the notation ∇f(x) = (∂f/∂x_1(x), . . ., ∂f/∂x_p(x)) for the gradient of f at x, where x → f(x) is any differentiable, real valued function defined on some E ⊂ R^p. Further, for a vector valued, differentiable function g(x) = (g_1(x), . . ., g_m(x)), with values in R^m, defined on some U ⊂ R^p, we use an analogous notation for the Jacobian matrix of g at x. A mapping Y from a probability space (Ω, F, P) to a measure space (E, A) will be called a random element if it is F/A measurable. If we write that Y : (Ω, F) → (E, A) is measurable, we mean that it is F/A measurable. If (Y_n)_{n∈N_+} denotes a sequence of random elements, where for any n ∈ N_+, Y_n is a mapping from a probability space (Ω, F, P) to a measure space (E, A), we use the standard arrow notation to indicate convergence of (Y_n)_{n∈N_+} to a random element Y in probability and in distribution, respectively. Note that for convergence in distribution, the introduced notation indicates that the limit Y has law L on (E, A). A sequence of estimators (θ̂_n)_{n∈N_+} for θ_0 ∈ R^p will be referred to as consistent if it converges in probability to θ_0. Finally, N(µ, Σ) indicates a multivariate normal distribution with mean vector µ and covariance matrix Σ.
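As a small numerical sanity check of the spectral-norm definition above (purely illustrative), ∥A∥_2 coincides with the largest singular value of A, i.e. the square root of the largest eigenvalue of AᵗA:

```python
# Check: the spectral norm ||A||_2 = max_{z : z'z = 1} <z, A'Az>^{1/2}
# equals the largest singular value of A.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6))  # illustrative matrix
spec_norm = np.linalg.norm(A, 2)                        # largest singular value
via_eigs = np.sqrt(np.linalg.eigvalsh(A.T @ A).max())   # sqrt of max eigenvalue of A'A
assert np.isclose(spec_norm, via_eigs)
```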

Random sampling scheme
On a probability space (Ω, F, P), we consider a real valued Gaussian random function Z which has sample functions on R^d. We assume that Z is stationary (homogeneous) with zero-mean function and covariance function c_{θ_0}(s), s ∈ R^d, where θ_0 ∈ Θ, with Θ ⊂ R^p compact and convex. Thus, we consider a real valued random field {Z_s : s ∈ R^d} which has true and unknown covariance function c_{θ_0} that belongs to a family of covariance functions {c_θ : θ ∈ Θ}.
Let Q := [−1, 1]^d and let X : Ω → Q^{N_+} be a stochastic process, defined on the same probability space (Ω, F, P) but independent of Z. We assume that the sequence (X_i)_{i∈N_+} is a sequence of independent random vectors with common law on Q, which has a strictly positive probability density function on Q (see also Remark 2.1). Given τ ∈ [0, 1/2) and a sequence of deterministic points (v_i)_{i∈N_+}, with v_i ∈ N_+^d, we define a randomly perturbed regular grid S as the process

S_i := v_i + τ X_i, i ∈ N_+,    (2)

where we assume that for any I ∈ N_+, the first I^d components of S take values in {1, . . ., I}^d + τQ (see also Figure 1). At this point we remark that, if nothing else is mentioned, the parameter τ ∈ [0, 1/2) and the sequence (v_i)_{i∈N_+} shall be fixed. Let X^{(n)} := (X_1, . . ., X_n) and S^{(n)} := (S_1, . . ., S_n) denote finite collections of X and S, respectively. We use the notation x^{(n)} := (x_1, . . ., x_n) for a vector that contains the first n entries of a given sequence. On (Ω, F, P), we define the random vector

Z^{(n)} := (Z_{S_1}, . . ., Z_{S_n}),    (3)

which denotes Z observed at a finite collection of S. The situation where a Gaussian random field is assumed to be observed at a randomly perturbed regular grid, with parameter τ and deterministic points (v_i)_{i∈N_+} as introduced above, is also considered in [4]. Given θ ∈ Θ, we let Σ_θ(s^{(n)}) := [c_θ(s_i − s_j)]_{1≤i,j≤n} denote the non-random n × n covariance matrix based on an arbitrary s^{(n)} ∈ G^n. On (Ω, F, P), we write Σ_{n,θ} := Σ_θ(S^{(n)}) for the n × n random covariance matrix based on a finite collection S^{(n)} of S.
Figure 1: For τ = 0.4 and I ∈ N_+, a random field Z is observed at two realizations s_i and s_j of S_i = v_i + τX_i and S_j = v_j + τX_j, i ≠ j, 1 ≤ i, j ≤ I². Dotted and dashed lines mark the borders of the ranges of S_i and S_j, respectively.
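The construction S_i = v_i + τX_i pictured in Figure 1 can be simulated as follows (a sketch; the uniform distribution on Q is one admissible choice of the strictly positive density, and the enumeration of the deterministic points v_i is an illustrative assumption):

```python
# Illustrative simulation of the randomly perturbed regular grid of Section 2:
# S_i = v_i + tau * X_i with v_i on the integer grid and X_i uniform on Q = [-1,1]^d.
import numpy as np
from scipy.spatial.distance import pdist

def perturbed_grid(I, d=2, tau=0.4, seed=0):
    rng = np.random.default_rng(seed)
    # deterministic points v_i: the first I^d points of the positive integer grid
    axes = [np.arange(1, I + 1)] * d
    v = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, d)
    X = rng.uniform(-1.0, 1.0, size=v.shape)  # X_i with values in Q
    return v + tau * X

S = perturbed_grid(I=10, d=2, tau=0.4)
# minimum distance between distinct points is at least Delta_tau = 1 - 2*tau = 0.2
print(pdist(S).min())
```

This reproduces the key feature used later: the minimum distance between any two distinct observation points stays bounded away from zero by ∆_τ = 1 − 2τ.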
Remark 2.1. Some technical remarks are worth pointing out. We assume that the random function Z(s, ω) is jointly measurable in (s, ω). This condition makes sure that the components ω → Z_{S_i(ω)}(ω) = Z(S_i(ω), ω), i = 1, . . ., n, of (3) are F/B(R) measurable, as compositions of the measurable functions ω → (S_i(ω), ω) and (s, ω) → Z(s, ω). Thus, the random vector Z^{(n)} is well defined. Since Z and S are independent, it is readily seen that the conditional distribution of Z^{(n)} given S^{(n)} = s^{(n)} is Gaussian, with characteristic function exp(−(1/2)aᵗΣ_{n,θ_0}(ω)a), a ∈ R^n. In addition, we note that for fixed ω ∈ Ω, S[N_+](ω) is not bounded, and if we define ∆_τ := 1 − 2τ, we are given some fixed ∆_τ > 0, independent of n ∈ N_+ and θ ∈ Θ, such that the distance between any two distinct components of S(ω) is bounded below by ∆_τ. Hence, we are in an increasing-domain asymptotic framework where the minimum distance between any two distinct observation points is bounded away from zero. The assumption that for any given i ∈ N_+, X_i has a strictly positive probability density function on Q is purely technical (see also the proof of Theorem 5.2). As can be seen from the mentioned proof, if τ = 0, the assumption becomes redundant.
Regularity conditions on covariance functions and covariance approximations

Regularity conditions on the family of covariance functions
Assumption 3.1 (Regularity conditions on c θ ).
Remark 3.1. Note that (1) and (2) of Assumption 3.1 differ from the conditions assumed in [4] (compare also to Condition 3.2 imposed in [5], or Condition 4 stated in [7]). In [4] it is assumed that a given covariance function k_θ is not only bounded on R^d, but also decays sufficiently fast in the Euclidean norm on R^d. Explicitly, it is assumed in Condition 2.1 of [4] that there exists a finite constant A, independent of θ ∈ Θ, such that for any s ∈ R^d, |k_θ(s)| ≤ A/(1 + ∥s∥^{d+1}). This polynomial decay condition on k_θ can be interpreted as a summability condition on the entries of the respective covariance matrices K_θ(s^{(n)})_{i,j} := k_θ(s_i − s_j), which guarantees that the maximal eigenvalues of K_θ(s^{(n)}) are uniformly bounded in n ∈ N_+, s^{(n)} ∈ G^n and θ ∈ Θ (see Lemmas D.1 and D.5 in [4]). Note that the exponent d + 1 can be replaced by d + α, with α > 0 some fixed constant (see also (6) in [6]). In the present study we show that under the assumption of a minimal spacing between any two distinct observation points, if c_θ has compact support on R^d, the number of possible observation points covered by the support of c_θ must be bounded uniformly in n ∈ N_+, s^{(n)} ∈ G^n and θ ∈ Θ (see Lemma B.1). This, together with the condition that c_θ is also uniformly bounded on Θ and R^d, will be sufficient to conclude that the maximal eigenvalues of Σ_θ(s^{(n)}) are uniformly bounded in n ∈ N_+, s^{(n)} ∈ G^n and θ ∈ Θ (see Lemmas 4.1 and B.3). Similar remarks can be made with regard to the conditions imposed on the partial derivatives of c_θ with respect to θ (see Lemma B.5). In addition, (3) of Assumption 3.1 is also imposed in [4] (compare also to [8] and [7]). It guarantees that the minimal eigenvalues of Σ_θ(s^{(n)}) are bounded from below, uniformly in n ∈ N_+, s^{(n)} ∈ G^n and θ ∈ Θ (see Lemmas 4.1 and B.3). Finally, we remark that within the framework of compactly supported covariance functions, the given conditions are very minimal and can be considered classical in the context of ML estimation, especially if one is not interested in the asymptotic distribution and rather seeks conditions under which ML estimators are consistent (with regard to a concrete example, we refer to Remark 6.2).

Regularity conditions on the family of covariance approximations
Given θ ∈ Θ, we let (c_{m,θ})_{m∈N_+} denote a sequence of real valued functions defined on R^d. The families {(c_{m,θ})_{m∈N_+} : θ ∈ Θ} can be put under the following assumption.
(2) For any m ∈ N_+, c_{m,θ} satisfies (1) of Assumption 3.1, where the respective constants C and L can further be chosen independently of m ∈ N_+.

(4) For any m ∈ N_+, c_{m,θ} satisfies (2) of Assumption 3.1, where the respective constants C′ and L′ can further be chosen independently of m ∈ N_+.
(5) For any q = 1, 2, 3 and i_1, . . ., i_q ∈ {1, . . ., p}, the corresponding partial derivatives of c_{m,θ} with respect to θ converge uniformly to those of c_θ, in the sense of (3).

To make the notation easier, we write (c_{m,θ}) := (c_{m,θ})_{m∈N_+}. In the following, we formally introduce covariance matrix approximations (random and non-random versions). We let Σ̃_θ(s^{(n)}) := [c_{m,θ}(s_i − s_j)]_{1≤i,j≤n} denote the non-random n × n matrix based on a given family {(c_{m,θ}) : θ ∈ Θ}. Then, on (Ω, F, P), if {(c_{m,θ}) : θ ∈ Θ} is a family of Borel measurable sequences of functions, we write Σ̃_{n,θ} for the n × n random matrix based on a finite collection S^{(n)} of S. Colloquially, we will use the term covariance approximation when we refer to a given family {(c_{m,θ}) : θ ∈ Θ} which can approximate a family of covariance functions {c_θ : θ ∈ Θ} in the sense of Assumption 3.2. In these terms, {c_θ : θ ∈ Θ} itself is a covariance approximation. The expression covariance matrix approximation will be used for both Σ̃_θ(s^{(n)}) and its random version Σ̃_{n,θ}. Similarly, we use the expression covariance matrix for both Σ_θ(s^{(n)}) and Σ_{n,θ}.

Remark 3.2. (1), (2) and (4) of Assumption 3.2 are natural extensions of (1) and (2) of Assumption 3.1. Notice that the measurability condition imposed in (1) of Assumption 3.2 makes sure that Σ̃_{n,θ} is F/B(R^{n²}) measurable. Condition (3) of Assumption 3.2 specifies in which sense a family {(c_{m,θ}) : θ ∈ Θ} approximates the family {c_θ : θ ∈ Θ}. We require that (c_{m,θ}) converges uniformly on R^d to c_θ, where the convergence is also uniform on the parameter space Θ. In fact, we will show (see Lemmas B.3 and 4.1) that the uniform convergence of (c_{m,θ}) to c_θ, together with the condition that the families {c_θ : θ ∈ Θ} and {(c_{m,θ}) : θ ∈ Θ} have uniformly bounded compact support, are, among others, sufficient criteria to prove that the matrices Σ_θ(s^{(n)}) and Σ̃_θ(s^{(n)}) are asymptotically (as n → ∞) equivalent, uniformly on Θ and G.
Condition (5) of Assumption 3.2 will allow us to conclude that a similar result holds true for the first, second and third order partial derivatives (with respect to θ) of Σ_θ(s^{(n)}) and Σ̃_θ(s^{(n)}). For concrete examples of covariance approximations where the conditions of Assumption 3.2 are verified, we refer to Section 6.
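The uniform-convergence requirement of (3) of Assumption 3.2 can be checked numerically for a given approximation. The sketch below does so for a linear-interpolation approximation of an illustrative compactly supported stand-in for c_θ (a Wendland-type polynomial; the function and partition sizes are assumptions, not the article's choices):

```python
# Sketch: estimating the sup-norm error of a linear-interpolation approximation,
# which should tend to zero as the partition is refined (m -> infinity).
import numpy as np

def c_stand_in(t, beta=1.0):
    # illustrative compactly supported function (Wendland-type), not from the paper
    return np.clip(1.0 - t / beta, 0.0, None) ** 4 * (4.0 * t / beta + 1.0)

def sup_error(m, beta=1.0):
    knots = np.linspace(0.0, beta, m + 1)          # partition with m subintervals
    grid = np.linspace(0.0, beta, 10_001)          # fine evaluation grid
    approx = np.interp(grid, knots, c_stand_in(knots, beta))
    return np.abs(approx - c_stand_in(grid, beta)).max()

errs = [sup_error(m) for m in (4, 8, 16, 32)]
print(errs)  # decreasing as the partition is refined
```

For a twice continuously differentiable target, the interpolation error decays at rate O(m^{-2}), so halving the mesh width roughly quarters the sup-norm error.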

Uniform asymptotic equivalence of covariance matrices and covariance matrix approximations
This section presents intermediate results on covariance matrices and their approximations. In particular, Lemma 4.1 gives precise conditions under which Σ̃_{n,θ} eventually (for n large enough) remains positive-definite with P-probability one.

Lemma 4.1. Assume that the family {c_θ : θ ∈ Θ} satisfies (1) and (3) of Assumption 3.1. Consider {(c_{m,θ}) : θ ∈ Θ} that satisfies (1), (2) and (3) of Assumption 3.2. Then, we have that P a.s.
In particular we can conclude that P a.s.
Further, it is true that P a.s.
and there exists N ∈ N + such that P a.s.

Truncated-ML estimators
Given a square matrix A, we define det_+(A) to be the product of the strictly positive eigenvalues of A; if all of the eigenvalues are less than or equal to zero, det_+(A) = 1. Further, we use the notation A^+ for the pseudoinverse of A (sometimes called the Moore-Penrose inverse). For the given collection {c_θ : θ ∈ Θ}, we define, on (Ω, F, P), for any n ∈ N_+ and θ ∈ Θ, the random variable l_n(θ). Given ω ∈ Ω, θ → l_n(θ)(ω) shall be called the truncated-modified log-likelihood function based on {c_θ : θ ∈ Θ}. A sequence of estimators (θ̂_n(c))_{n∈N_+}, defined on (Ω, F, P), will be called a sequence of truncated-ML estimators for θ_0 based on {c_θ : θ ∈ Θ}. Similarly, on (Ω, F, P), for a given collection of sequences of real valued functions {(c_{m,θ}) : θ ∈ Θ}, we introduce, for any n ∈ N_+ and θ ∈ Θ, the random variable l̃_n(θ). Then, for ω ∈ Ω, the function θ → l̃_n(θ)(ω) denotes the truncated-modified log-likelihood function based on {(c_{m,θ}) : θ ∈ Θ}. A sequence of estimators (θ̂_n(c̃))_{n∈N_+}, defined on (Ω, F, P), will be called a sequence of truncated-ML estimators for θ_0 based on {(c_{m,θ}) : θ ∈ Θ}. At this point it is important to note that for a given ω ∈ Ω, it is in general not true that l_n(θ)(ω) and l̃_n(θ)(ω) are continuous in θ for any n ∈ N_+. Nevertheless, a consequence of Lemma 4.1 is the following proposition:

Proposition 5.1. Assume that the family {c_θ : θ ∈ Θ} satisfies (1) and (3) of Assumption 3.1. Consider {(c_{m,θ}) : θ ∈ Θ} that satisfies (1), (2) and (3) of Assumption 3.2. Then, we have that for any n ∈ N_+, P a.s. Further, there exists N ∈ N_+ such that for any n ≥ N, P a.s., and we have that

Using Proposition 5.1, we notice that if, for any s ∈ R^d and m ∈ N_+, both θ → c_θ(s) and θ → c_{m,θ}(s) are k times differentiable, then θ → l_n(θ)(ω) and θ → l̃_n(θ)(ω) are k times differentiable for n large enough, respectively.
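A sketch of the truncated-modified log-likelihood along the lines of the definitions above: the determinant is replaced by the pseudo-determinant det_+ and the inverse by the Moore-Penrose pseudoinverse, so the function stays defined for covariance matrix approximations that are not positive-definite. The scaling follows the −2/n convention of Remark 5.1 up to an additive constant; this is an illustration under stated assumptions, not the article's exact display:

```python
# Sketch of a truncated-modified log-likelihood: pseudo-determinant det_+
# (product of strictly positive eigenvalues; det_+ = 1 if there are none)
# and Moore-Penrose pseudoinverse in place of det and inverse.
import numpy as np

def det_plus_log(A, tol=1e-10):
    """log det_+(A): sum of logs of the strictly positive eigenvalues of A."""
    eig = np.linalg.eigvalsh(A)
    pos = eig[eig > tol]
    return np.sum(np.log(pos)) if pos.size else 0.0  # log 1 = 0 if none positive

def truncated_mod_loglik(Sigma, z):
    n = z.size
    quad_form = z @ np.linalg.pinv(Sigma) @ z        # pseudoinverse quadratic form
    return (det_plus_log(Sigma) + quad_form) / n     # -2/n scaling, constant dropped

# For a positive-definite Sigma this coincides with the usual modified version:
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
Sigma = B @ B.T + 5.0 * np.eye(5)                    # illustrative covariance matrix
z = rng.standard_normal(5)
usual = (np.linalg.slogdet(Sigma)[1] + z @ np.linalg.solve(Sigma, z)) / 5
assert np.isclose(truncated_mod_loglik(Sigma, z), usual)
```

The point of the truncation is visible for an indefinite input: the function remains finite, whereas log det would be undefined.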
For the rest of the article, when we refer to truncated-ML estimators (without mentioning whether the estimators are based on families of covariance functions or approximations), we refer to both truncated-ML estimators based on families of covariance functions and those based on approximations. The same applies to the notion of truncated-modified log-likelihood functions based on either covariance functions or approximations. However, if {c_θ : θ ∈ Θ} satisfies the assumptions of Proposition 5.1, a sequence of truncated-ML estimators (θ̂_n(c))_{n∈N_+} shall simply be called a sequence of ML estimators for θ_0. Similarly, we will simply refer to a modified log-likelihood function when the given family {c_θ : θ ∈ Θ} is under the assumptions of Proposition 5.1.
Remark 5.1. The introduction of truncated-modified log-likelihood functions is not standard. Modified refers to the fact that the log-likelihood for the Gaussian density function of a random vector (Z_{s_1}, . . ., Z_{s_n}) is scaled by −2/n. This is common practice in the literature on ML estimators for covariance parameters under an increasing-domain asymptotic framework (see for instance [4], [5] and also [7]). The matrices Σ_{n,θ}(ω) and Σ̃_{n,θ}(ω) are not necessarily positive-definite. In particular, Σ̃_{n,θ}(ω) can be negative-definite. If the matrices Σ_{n,θ}(ω) and Σ̃_{n,θ}(ω) are not positive-definite, we truncate the log-likelihood by a pseudo-determinant and pseudoinverse to obtain the functions θ → l_n(θ)(ω) and θ → l̃_n(θ)(ω); hence the use of the expression "truncated".

Remark 5.2. As was mentioned in Remark 2.2 of [5], for given ω ∈ Ω, we allow the functions θ → l_n(θ)(ω) and θ → l̃_n(θ)(ω) to have more than one minimizer, in which case the asymptotic results given in Section 5.1 hold true for any given sequence of truncated-ML estimators. With regard to the existence of a minimizer, we refer to Remark 2.1 in [4].

Consistency and asymptotic normality of truncated-ML estimators
The main results of this section are that under suitable conditions on the families of covariance functions and approximations, truncated-ML estimators for covariance parameters are not only consistent (Theorem 5.2 and Corollary 5.3) but also asymptotically normal (Theorem 5.4 and Corollary 5.5). In particular, we will make use of the conditions presented in Assumptions 3.1 and 3.2. However, in the context of random fields that are observed at randomly perturbed regular grid locations as defined in (2), we will further make use of the following two technical conditions that were also imposed in [4], associated to the common range Q of the process X and to the set of differences between two points in Q.
Before we present the results about asymptotic normality, it is helpful to introduce some additional notation. Let K ∈ N_+ be such that for any ω ∈ Ω, the relevant sequences of functions are differentiable with respect to θ. Note that if {c_θ : θ ∈ Θ} satisfies Assumption 3.1 and the collection {(c_{m,θ}) : θ ∈ Θ} satisfies Assumption 3.2, then Proposition 5.1 guarantees the existence of such a K. For the given K ∈ N_+, on (Ω, F, P), we introduce a sequence of random functions, where for n ∈ N_+ and θ ∈ Θ, the random vector G_{n,K}(θ) has components G_{j,n,K}(θ), j = 1, . . ., p. Similarly, on (Ω, F, P), we introduce a second sequence of random functions, where for any n ∈ N_+ and θ ∈ Θ, the components of G̃_{n,K}(θ) are defined analogously. If the collection {c_θ : θ ∈ Θ} satisfies Assumption 3.1, we simply write, for any n ∈ N_+, a corresponding notation for the random Jacobian matrix of θ → G_{n,1}(θ) evaluated at θ_0.
Corollary 5.5. Suppose that {c_θ : θ ∈ Θ} satisfies Assumptions 3.1, 5.1 and 5.2. Then, we can conclude that a sequence of ML estimators (θ̂_n(c))_{n∈N_+} for θ_0 is asymptotically normal.

We assume that the covariance function of the random field Z is given by ϕ_{θ_0}(∥s∥), s ∈ R^d, θ_0 ∈ Θ, where ϕ_{θ_0} belongs to the family {ϕ_θ : θ ∈ Θ} of generalized Wendland functions (compare to (1) of Section 1.3). We treat κ and ν as given, but such that κ > 0 and ν ≥ (d + 1)/2 + κ. Notice that the latter restriction on κ and ν makes sure that for any θ ∈ Θ, ϕ_θ belongs to the class Φ_d, the class of real valued and continuous functions, defined on R_+, which are strictly positive at the origin and such that for any finite collection of points in R^d, evaluation at the Euclidean norm of pairwise differences between points of the collection results in a nonnegative definite matrix (see for example [21]). Actually, in the latter reference it is argued that for κ > 0, ϕ_{ν,κ} ∈ Φ_d if and only if ν ≥ (d + 1)/2 + κ. For the respective family defined on R^d, we use the notation w_θ(s) := ϕ_θ(∥s∥).
Remark 6.1. The restriction β_min > 1 − 2τ is imposed to prove that the family {w_θ : θ ∈ Θ} satisfies Assumptions 5.1 and 5.2 (see the proof of Proposition 6.2). This is not surprising, as 1 − 2τ defines the minimal spacing between pairs of distinct observation points of the randomly perturbed regular grid defined in (2) of Section 2.2. Further, as we have noted that ϕ_{ν,κ} ∈ Φ_d if and only if ν ≥ (d + 1)/2 + κ, the two smoothness parameters ν and κ cannot be estimated without further constraints.
Using Propositions 6.1 and 6.2, under application of Theorems A.1 and A.2 (recall also Corollaries 5.3 and 5.5), we obtain the following result:

Proposition 6.3. Let κ > 4. A sequence (θ̂_n(ϕ))_{n∈N_+} of ML estimators for θ_0 based on {ϕ_θ : θ ∈ Θ} is consistent. Further, there exists a non-random symmetric p × p matrix Λ ≻ 0 governing the asymptotic distribution.

It is worth noting that the restriction κ > 4 is only needed for the asymptotic distribution of ML estimators, respectively truncated-ML estimators. In particular, in Proposition 6.1, if one only demands conditions involving first order partial derivatives of ϕ_θ with respect to θ, κ > 2 is sufficient. With regard to consistency of the estimator (θ̂_n(ϕ))_{n∈N_+} in Proposition 6.3, κ > 2 is sufficient as well. The same applies for the truncated-ML estimators considered in Examples 6.1, 6.2, 6.3 and 6.4. Keeping in mind the differentiability conditions imposed in Assumption A.1, the given restrictions on κ are not surprising (compare also to [9], within the infill-domain asymptotic framework).
We discuss four examples of generalized Wendland approximations.
In the following, we let M < ∞ denote a real constant, independent of β ∈ [β_min, β_max], such that β_max ≤ M.

Example 6.2 (Trimmed Bernstein polynomials). Let {ϕ_θ : θ ∈ Θ} be as in Proposition 6.1. We consider a family {(P_{m,θ}) : θ ∈ Θ} defined as follows: for θ ∈ Θ and m ∈ N_+, we set P_{m,θ}(t), t ∈ R_+, as a Bernstein-type polynomial approximation of ϕ_θ, built on a partition whose distance between adjacent points converges to zero as m approaches infinity. See also [12] for an introduction of Bernstein polynomials on unbounded intervals.
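Since the exact display defining the trimmed construction appears above only in part, the following sketch shows the standard Bernstein polynomial of a function on [0, M], set to zero beyond M; reading "trimmed" in this way is an assumption, and the target function is an illustrative stand-in for ϕ_θ:

```python
# Hedged sketch: degree-m Bernstein polynomial of a function f on [0, M],
# set ("trimmed") to zero beyond M. Illustrative, not the article's exact display.
import numpy as np
from math import comb

def bernstein(f, M, m):
    """Return the degree-m Bernstein polynomial of f on [0, M], zero beyond M."""
    coeffs = np.array([f(k * M / m) for k in range(m + 1)])
    def p(t):
        x = np.clip(t / M, 0.0, 1.0)
        basis = np.array([comb(m, k) * x**k * (1 - x)**(m - k) for k in range(m + 1)])
        val = coeffs @ basis
        return np.where(t <= M, val, 0.0)  # trim to zero beyond the support
    return p

f = lambda t: np.clip(1.0 - t, 0.0, None) ** 2    # illustrative target function
p8, p64 = bernstein(f, 1.0, 8), bernstein(f, 1.0, 64)
grid = np.linspace(0.0, 1.0, 501)
print(np.abs(p8(grid) - f(grid)).max(), np.abs(p64(grid) - f(grid)).max())
```

The sup-norm error shrinks as the degree m grows, matching the uniform convergence required by (3) of Assumption 3.2.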
Using Propositions 6.1, 6.2 and 6.5, under application of Theorems A.1 and A.2, we have proven the following result:

Proposition 6.6. A sequence (θ̂_n(P))_{n∈N_+} of truncated-ML estimators for θ_0 based on {(P_{m,θ}) : θ ∈ Θ} is consistent and satisfies the same asymptotic normality result, where Λ is defined as in Proposition 6.3.

Thus, for a given m ∈ N_+, L_{m,θ} represents a linear interpolation of the function ϕ_θ on the interval [0, M].
Using Propositions 6.1, 6.2 and 6.7, under application of Theorems A.1 and A.2, we have further proven the following result:

Proposition 6.8. A sequence (θ̂_n(L))_{n∈N_+} of truncated-ML estimators for θ_0 based on {(L_{m,θ}) : θ ∈ Θ} is consistent and satisfies the same asymptotic normality result, where Λ is defined as in Proposition 6.3.
Remark 6.3. As already mentioned in the introduction, computing (13) is costly. However, if κ is a positive integer, closed-form solutions of (13) exist. More specifically, if κ = k ∈ N+, the generalized Wendland function is the product of a polynomial Pk of order k and the Askey function Aν+k. In addition, if κ ∈ (N+ − 1/2), a positive half-integer, it is shown in [28] that further closed-form solutions of (13), involving polynomial, logarithmic and square-root terms, exist. Thus, in the specific example of generalized Wendland covariance functions, covariance approximations will facilitate computing (13) when κ ∉ N+ ∪ (N+ − 1/2).
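When κ ∉ N+ ∪ (N+ − 1/2), no closed form is available and (13) must be evaluated numerically. The sketch below computes a generalized Wendland correlation with unit range by direct quadrature of its integral representation; the Beta-function normalization and the trapezoidal rule are our assumptions for illustration, and the grid size trades cost for accuracy.

```python
import numpy as np
from math import gamma

def gen_wendland(t, kappa, nu, n_grid=200_000):
    """Generalized Wendland correlation at t, unit range, via quadrature
    of the integral representation (a numerical sketch of (13))."""
    if t >= 1.0:
        return 0.0  # compactly supported: zero beyond the (unit) range
    # integrand of: int_t^1 u (u^2 - t^2)^(kappa-1) (1-u)^nu du
    u = np.linspace(t, 1.0, n_grid)
    f = u * (u**2 - t**2) ** (kappa - 1.0) * (1.0 - u) ** nu
    val = np.sum((f[1:] + f[:-1]) * np.diff(u)) / 2.0  # trapezoidal rule
    # Beta(2*kappa, nu+1) normalization so that the value at t = 0 is one
    B = gamma(2.0 * kappa) * gamma(nu + 1.0) / gamma(2.0 * kappa + nu + 1.0)
    return val / B
```

For κ = 2.5 and ν = 5, for instance, the function evaluates to one at t = 0 and decreases to zero at the support boundary, as a correlation function should.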

Covariance taper approximations: Beyond compactly supported covariance functions
Asymptotic properties of (regular) tapered-ML estimators were addressed in both the infill- and increasing-domain asymptotic frameworks (see [23], [14], [30] and [16]). The direct functional approximation approach studied here can be combined with covariance tapering. Given observations of S, it is known that under weak assumptions on the presumed covariance function, ML estimators based on tapered covariance functions (tapered-ML estimators) preserve consistency (see [16], in particular Corollary 2, in the increasing-domain framework). However, this is the case for covariance tapers whose compact support is not fixed, but rather grows to the entire R d as the number of observations from S increases. Within an increasing-domain asymptotic framework, given a fixed compact support of the covariance taper, one cannot in general expect tapered-ML estimators to be consistent. Still, under suitable conditions, tapered-ML estimators asymptotically minimize the Kullback-Leibler divergence (see for instance Theorem 3.3 in [5]). Given the theory developed here, we can readily recover the same result for truncated-tapered ML estimators: ML estimators based on tapered covariance functions, where the covariance taper is replaced with a functional approximation of it. To be more formal, let us remain in the setting of Section 2, but assume that Z has true and unknown covariance function kθ0, θ0 ∈ Θ, which belongs to a family {kθ : θ ∈ Θ} which satisfies:

• For any s ∈ R d , the map θ → kθ(s) is continuously differentiable
• There exist constants A < ∞ and α > 0 such that the stated bound holds for all i = 1, . . ., p, all s ∈ R d and all θ ∈ Θ. The given assumptions are very weak and are satisfied, for instance, by the Matérn family (see also Condition 2.1 in [4] or Remark 3.1). Then, we consider a fixed covariance taper s → tθ′0(s), θ′0 ∈ Θ′, with Θ′ ⊂ R l compact and convex. We assume that tθ′0 belongs to a family of tapers {tθ′ : θ′ ∈ Θ′} that satisfies Assumption 3.1 (regarding (2), q = 1 and the continuity of first-order partial derivatives are sufficient). As we have seen in Section 6 (Proposition 6.1), we may choose, with θ′0 = (β0, 1), κ > 2 and ν ≥ (d + 1)/2 + κ, a generalized Wendland taper (see also Remark 6.2). In the given context it is more convenient to write tβ0 := tθ′0, where β0 is the taper range, that is, tβ0(s) = 0 for ∥s∥ ≥ β0. Based on a finite collection S (n) of S, on (Ω, F, P), we then define the tapered n × n covariance matrix with entries (Rn,θ)i,j := kθ(Si − Sj) tβ0(Si − Sj), 1 ≤ i, j ≤ n. Additionally, we consider a covariance matrix approximation, where (t̂m,θ′0) is a sequence of functions that belongs to a family of taper approximations {(t̂m,θ′) : θ′ ∈ Θ′}, for which Assumption 3.2 applies. Again, we write t̂m,β0 := t̂m,θ′0, m ∈ N+, to highlight the fixed range parameter. We note that the results of Lemma 4.1 and Proposition 5.1 remain true with Σn,θ and Σ̂n,θ replaced by Rn,θ and R̂n,θ, respectively. We know (see Remark 2.1) that the conditional distribution of Z (n) given S (n) is given by the random variable ω → N(0, Kn,θ0(ω)). On the other hand, we can assume a misspecified distribution ω → N(0, Rn,θ(ω)), where the true covariance matrix is replaced with the tapered covariance matrix Rn,θ(ω), θ ∈ Θ. Then, we define the scaled (see [5]) conditional Kullback-Leibler divergence of N(0, Rn,θ) from N(0, Kn,θ0). The distribution N(0, Rn,θ) shall be called a regular taper-misspecified distribution. If we choose n ≥ N (N as in Proposition 5.1), we can misspecify the distribution even further by replacing the taper with its approximation. This gives rise to a second scaled conditional Kullback-Leibler divergence. We use the notation (θ̂n(kt))n∈N+ and (θ̂n(kt̂))n∈N+ for ML and truncated-ML estimators for θ0 with respect to {kθ tβ0 : θ ∈ Θ} and {(kθ t̂m,β0) : θ ∈ Θ}, respectively. In accordance with the literature on tapered-ML estimators, the estimators (θ̂n(kt))n∈N+ and (θ̂n(kt̂))n∈N+ are further referred to as tapered-ML estimators and truncated-tapered ML estimators, respectively. We can now state the following theorem: Theorem 7.1. We have that, P a.s.,
where the remainder δn vanishes asymptotically. Therefore, in the given scenario, truncated-tapered ML estimators asymptotically minimize the conditional Kullback-Leibler divergence of taper-misspecified distributions from the true distribution (compare also to Theorem 3.3 in [5]). Thus, in terms of Kullback-Leibler divergence, truncated-tapered ML estimators and tapered-ML estimators perform asymptotically equally well.
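To make the taper-misspecification setup concrete, the following sketch builds a tapered covariance matrix as the entrywise (Schur) product of a covariance matrix and a compactly supported taper, and evaluates the scaled Kullback-Leibler divergence of the taper-misspecified Gaussian from the true one. The exponential covariance, the Wendland-type taper, the taper range beta0 and all sizes are illustrative assumptions, not the paper's specific choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0 = 150, 1.5                       # beta0: fixed taper range (assumption)
S = rng.uniform(0.0, 20.0, size=(n, 2))   # observation points, growing domain
D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)

K = np.exp(-D)                            # hypothetical k_theta0: exponential covariance
r = np.minimum(D / beta0, 1.0)
T = (1.0 - r) ** 4 * (1.0 + 4.0 * r)      # Wendland-type taper, zero beyond beta0
R = K * T                                 # tapered covariance matrix (Schur product)

def scaled_kl(K0, R1):
    """KL( N(0, K0) || N(0, R1) ) divided by the dimension."""
    m = K0.shape[0]
    _, ld0 = np.linalg.slogdet(K0)
    _, ld1 = np.linalg.slogdet(R1)
    return 0.5 * (np.trace(np.linalg.solve(R1, K0)) - m + ld1 - ld0) / m

kl = scaled_kl(K, R)          # divergence of the taper-misspecified model
sparsity = np.mean(R == 0.0)  # entries beyond the taper range vanish exactly
```

Because both factors are positive definite, the Schur product theorem guarantees that R is a valid covariance matrix, and its sparsity is what makes the tapered likelihood cheap to evaluate.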

Discussion and outlook
With the introduction of truncated-likelihood functions, we allow for more far-reaching forms of covariance approximations, such as linear interpolations or polynomial approximations. Our approximation approach relates directly to the presumed covariance function. Thus, combinations with existing approximation methods, such as low-rank or covariance tapering approaches, are readily possible. We studied the quality of truncated-ML estimators from an asymptotic point of view. For compactly supported covariance functions, the conditions imposed in Sections 3 and 5 permit us to obtain truncated-ML estimators that are asymptotically well-behaved. That is, we obtain estimators that are consistent and asymptotically normal. Our proof strategies were strongly influenced by [4]. We have provided a comprehensive analysis for the family of generalized Wendland covariance functions. That is, we give precise conditions on smoothness, variance and range parameters under which ML estimators for variance and range parameters are consistent and asymptotically normal. To our knowledge, such a result does not yet exist in the literature (compare also to [9], within the infill-domain asymptotic context). Further, we gave four examples of generalized Wendland approximations for which truncated-ML estimators preserve consistency and asymptotic normality.
We now discuss some open questions. Our results on consistency and asymptotic normality depend on the condition that correlations vanish beyond a certain distance. It would be of interest to recover the consistency and asymptotic normality results for truncated-ML estimators when the assumption of a compact support is dropped. To this end, we recall that the imposed conditions on covariance functions and approximations resulted in the uniform asymptotic equivalence of covariance matrices and their approximations. Using this, we established the existence of a positive integer N, after which covariance matrix approximations remain positive definite. Extending to non-compactly supported covariance functions, this result remains unchanged as long as covariance matrices and approximations are uniformly asymptotically equivalent (uniformly on the parameter and sample space). Thus, in this case, consistency and asymptotic normality can be recovered even when presumed covariance functions are no longer compactly supported. However, as a mere condition, the asymptotic equivalence of covariance matrices and approximations is of little practical use. Thus, the case of non-compactly supported covariance functions deserves further attention.
From a more applied point of view, our results provide a strong theoretical basis for further research. It remains to test and extend the given examples of covariance approximations. The four examples of generalized Wendland approximations and their effect on parameter estimation were discussed from a theoretical point of view. An important next step is to provide numerical implementations and practical comparisons.
In conclusion, for large datasets built upon correlated data, the present work provides an essential missing piece in the area of covariance approximations.
Then, we have that: Lemma B.8. Suppose that {cθ : θ ∈ Θ} satisfies Assumptions 3.1 and 5.2 (regularity conditions for partial derivatives up to order q = 2 are sufficient). Suppose further that {(ĉm,θ) : θ ∈ Θ} satisfies Assumption 3.2 (regularity conditions for partial derivatives up to order q = 2 are sufficient). Let N be as in Proposition 5.1 and define {(Gn,N(θ))n∈N+ : θ ∈ Θ} and {(Ĝn,N(θ))n∈N+ : θ ∈ Θ} as in (10) and (11), respectively. We then have the stated convergence. Further, we conclude that the random p × p matrix J Ĝn,N(θ0) converges in probability P to a non-random matrix 2Λ, where S p×p ∋ Λ ≻ 0.

C.1 Proof of results in Appendix B
Proof of Lemma B.1.
and thus ∥si − sj∥ ≥ C as well (since |w|∞ ≤ ∥w∥ for any w ∈ R d). Therefore, (18) follows, since we have assumed that g has compact support S ⊂ B[0; C]. The proof of (17) relies on the fact that there exists a minimal spacing ∆τ > 0 between any two distinct observation points (see (4)). This allows us to show that, for arbitrary i ∈ N+, if Nsi,C denotes the cardinality of the set {j ∈ N+ : ∥si − sj∥ ≤ C}, then Nsi,C is bounded by a constant depending only on d, C and τ. For a complete argument one may consider the proof of Lemma 4 in [16]. Using this, we can estimate the relevant sums, and thus (17) is proven as well.
Proof of Lemma B.2. The proof is similar to that of Lemma B.1 and is therefore omitted.
Proof of Lemma B.3. Let C, L and Ĉ, L̂ be defined as in (1) of Assumption 3.1 and (2) of Assumption 3.2, respectively. We use Lemma B.1 to show that there exists a real constant M > 0, which does not depend on n ∈ N+, s (n) ∈ Gn and θ ∈ Θ, such that ∥Σθ(s (n))∥2 ≤ M. Similarly, by (2) of Assumption 3.2, we then use Lemma B.2, together with Gershgorin's circle theorem, to show that for any n ∈ N+, s (n) ∈ Gn and θ ∈ Θ, ∥Σ̂θ(s (n))∥2 ≤ M as well. This shows (27). Thus, we have established that (21) of Lemma B.3 is verified. It is shown in [4] (Proposition D.4) that, because of the increasing-domain setting, where there exists a minimal distance between any two observation points (see (4)), and since (3) of Assumption 3.1 is satisfied, (23) of Lemma B.3 holds. Using this result, we can fix some δ > 0 (small enough, independent of n ∈ N+, s (n) ∈ Gn and θ ∈ Θ). For the above δ > 0, we can then find N ∈ N+ such that the required bound holds. This is valid since, for the given ε > 0, by the uniform convergence of (ĉr(n),θ) to cθ (see (3) of Assumption 3.2), we find N ∈ N+ such that the bound holds for any n ≥ N, for any s (n) ∈ Gn and 1 ≤ i, j ≤ n. Then, with the appropriate definitions in place, for n ≥ N, s (n) ∈ Gn and θ ∈ Θ, we can conclude that (28) must be satisfied. Using (28), we have, for n ≥ N, s (n) ∈ Gn and θ ∈ Θ, and for vectors a such that ∥a∥ = 1, the desired estimate, under application of the Cauchy-Schwarz inequality. In conclusion, we have the corresponding bound for vectors a such that ∥a∥ = 1 and n ≥ N. But we know that infn≥N infs(n)∈Gn infθ∈Θ min∥a∥=1 ⟨a, Σθ(s (n))a⟩ > 0 and δ > 0 was chosen small enough (but otherwise arbitrary). Thus, we have also proven (24) of Lemma B.3. Notice that (22) is proven with (28); hence the proof of Lemma B.3 is complete.
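The Gershgorin step in the proof bounds the spectral norm of a symmetric covariance matrix by its maximal absolute row sum, which stays bounded in the increasing-domain setting because only a bounded number of points fall within the compact support. A minimal numerical illustration follows; the Wendland-type covariance, the support radius, and all sizes are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.uniform(0.0, 30.0, size=(120, 2))            # observation points
D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
r = np.minimum(D / 2.0, 1.0)                         # support radius 2 (assumption)
Sigma = (1.0 - r) ** 4 * (1.0 + 4.0 * r)             # toy compactly supported c_theta

gersh = np.max(np.abs(Sigma).sum(axis=1))  # Gershgorin bound: max absolute row sum
spec = np.linalg.norm(Sigma, 2)            # spectral norm of the symmetric matrix
```

The row-sum bound is crude but dimension-free in n: each row sums the covariance over the points inside the support only, mirroring the role of Lemma B.1 in the proof.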
Proof of Corollary B.4. This follows directly from Lemma B.3.
Proof of Lemma B.5. We omit a formal argument; Lemma B.5 can be proven by the same reasoning as in the proof of Lemma B.3.
Proof of Lemma B.6. For n ≥ N (N as in the statement) and θ ∈ Θ, we can write, P a.s.,
Note that Bn,θ and B̂n,θ are random symmetric matrices. Further, a decomposition holds for each of the random symmetric matrices. In addition, since, P a.s., for k = 1, . . ., I, by assumption we also have the corresponding bounds. Using (29), (30) and (31), we pick δ > 0 arbitrary and define the relevant quantities such that, for some given integer N* ≥ N, P a.s., (32) holds. We can then estimate (32) from above and below.
But for the given ε > 0, for n ≥ N * , we have that P a.s.
On the other hand, by (31), we also have that for n ≥ N * , P a.s.
Since δ > 0 was arbitrary and independent of θ ∈ Θ, the lemma is proven.
Proof of Lemma B.7. First, using the Cauchy-Schwarz inequality and the compatibility of the spectral norm with the Euclidean norm, we can estimate P a.s.
Let us fix some arbitrary ε > 0 such that for n large enough the stated bound holds P a.s. Then, let δ > 0 be arbitrary and notice that the probability in question can be expressed in terms of Vn, where Vn is a Gauss vector, defined on (Ω, F, P), with zero-mean vector and identity covariance matrix. Then, we use Markov's inequality to obtain an estimate whose latter term is bounded uniformly in s (n) ∈ Gn and n ∈ N+ (see Lemma B.3). Thus we conclude that the supremum converges to zero, and the proof is complete.
Similar expressions can then be calculated for ln,N based on Σh(n),θ0. We can further calculate, for n ∈ N+ and 1 ≤ k, l ≤ p, P a.s., where the terms Akl1,h(n),θ0 and Akl2,h(n),θ0 are given by (33) and (34). In addition, for n ∈ N+, we also have that, P a.s., the analogous identity holds. Again, similar expressions can be obtained for l̂n,N based on Σ̂h(n),θ0, where for n ∈ N+, 1 ≤ k, l ≤ p, the respective terms Âkl1,h(n),θ0 and Âkl2,h(n),θ0 are defined as in (33) and (34), respectively, but with Σh(n),θ0 replaced by Σ̂h(n),θ0. Then, we have for n ∈ N+, for k, l = 1, . . ., p, P a.s., the stated representation. We can apply Lemma B.7 to the sequences of random matrices (Akl2,h(n),θ0)n∈N+ and (Âkl2,h(n),θ0)n∈N+ to conclude, under application of Lemma 4.1 (see also Corollary B.4 and Lemma B.5), that the difference vanishes. We also have, P a.s.,
In addition, we have that for any k, l = 1, . . ., p, P a.s., the expression, which again, under application of the triangle inequality, von Neumann's trace inequality and Lemma 4.1 (see also Corollary B.4 and Lemma B.5), converges to zero P a.s. Hence, we have shown the claim for any k, l = 1, . . ., p, which concludes the proof of (26). Now, it is shown in [4] that Λ is the P a.s. limit of a sequence of p × p matrices (Hh(n)(θ0))n∈N+. Further, by Assumption 5.2, it is concluded that the limit Λ is such that Λ ≻ 0. But then, we use (26) to show the corresponding convergence as well, which concludes the proof of Lemma B.8.

C.2 Proof of results in Section 4
Proof of Lemma 4.1. The result follows directly from Lemma B.3.
Proof of Proposition 5.1. The statement is verified as a consequence of Lemmas 4.1, B.6 and B.7.
Proof of Theorem 5.2. Let N ∈ N+ be as in Lemma 4.1 (or Proposition 5.1) and define, for any ω ∈ Ω, the sequence (l̂n,N(θ)(ω))n∈N+ as in (9) of Section 5.1. We note that, under the given assumptions of Theorem 5.2, the first-order partial derivatives with respect to θ exist for the sequence (l̂n,N(θ)(ω))n∈N+. Then, we define the sequence of estimators (θ̂n,N)n∈N+ := (θ̂n+N−1)n∈N+. Therefore, θ̂n,N minimizes l̂n,N(θ) P a.s. for any n ∈ N+. To prove consistency, we consider an approach similar to that given in [4]. As N is fixed, we write, for n ∈ N+, h(n) = n + N − 1. Under the assumptions of the theorem, we have, P a.s., a bound on Var l̂n,N(θ). To see this, we remark that, P a.s. (using Proposition 5.1),
From here, we can use von Neumann's trace inequality to show that P a.s.
Let Vh(n) be a Gauss vector on (Ω, F, P), with zero-mean vector and h(n) × h(n) identity covariance matrix. Then (see also Remark 2.1), for any finite M > 0, the probability in question is bounded from above accordingly. Therefore, P a.s., the quantity of interest is bounded from above as well. To continue, we define the corresponding sequences of random variables. For any n ∈ N+, we have, P a.s., the first bound. Similarly, for any n ∈ N+, we have, P a.s.,
Notice that because of (39) we have the desired convergence. Further, it is shown in [4] (see the proof of Proposition 3.1) that, under application of Lemma B.3, there exists some constant B > 0 (which does not depend on n ∈ N+) such that, P a.s., the corresponding bound holds, with f defined accordingly in this case. Notice that, because of the assumption that (Xi)i∈N+ is independent with a common law that has a strictly positive probability density function, the function f is strictly positive almost everywhere with respect to the Lebesgue measure on Dτ (see the end of the proof of Proposition 3.1 in [4]). In either case, we can thus conclude that, for any α > 0, because of Assumption 5.1, inf θ:|θ−θ0|≥α D∞,θ,θ0 > 0, and the limit D∞,θ,θ0 is deterministic. We now want to show that there exists some N2 ≥ N such that for any n ≥ N2, for any θ ∈ Θ, P a.s., (40) holds as well. In this case, with D̂2,h(n),θ,θ0 a random function on Ω and D∞,θ,θ0 a deterministic function of θ ∈ Θ, we would have, for any fixed τ ≥ 0 and for any given α > 0, that the sequence of estimators (θ̂n,N)n∈N+ is such that for n ≥ N2, P a.s., the minimization property holds, and we can conclude the proof of Theorem 5.2 using Theorem 5.7 of [31]. Hence, it remains to show (40). We write, P a.s.,
By Lemma 4.1, Corollary B.4 and Lemma B.6, we already conclude that P a.s.
converges to zero uniformly in θ ∈ Θ as n → ∞. Further, we can conclude that, P a.s.,
uniformly in θ ∈ Θ, by application of Corollary B.4, we can also see that P a.s.
converges to zero as n → ∞, uniformly in θ ∈ Θ. Using a similar argument, we can also show that, P a.s., the term Â3,h(n),θ,θ0 − A3,h(n),θ,θ0 converges to zero as n → ∞, uniformly in θ ∈ Θ. Hence, we have shown that, P a.s.,
, where the relevant quantities are as defined above. But Λ is the P a.s. limit of (Hh(n)(θ0))n∈N+ and hence we conclude that Λ is also the P a.s. limit of (Ĥh(n)(θ0))n∈N+. Then, we can apply Proposition D.9 of [4]. Notice that, because the family {(ĉm,θ) : θ ∈ Θ} satisfies Assumption 3.2, we have that for fixed ω ∈ Ω, θ → Ĝn,N(ω, θ) is twice differentiable in θ, and we can argue exactly as in the proof of Theorem 5.2 to conclude that the sequence is bounded in probability P. In addition, by Lemma B.8, we also have the corresponding convergence. Finally, the sequence of estimators (θ̂n,N)n∈N+ is consistent. Thus we conclude, using for example Proposition D.10 in [4], that the asymptotic distribution follows. Since N was fixed, we can conclude the result for (θ̂n)n∈N+ as well.
Proof of Corollary 5.5. The result follows from Theorem 5.4; the proof is evident when we define the family {ĉm,θ : θ ∈ Θ} as in the proof of Corollary 5.3.

C.4 Proof of results in Appendix A
Proof of Theorem A.1. The proof is similar to the proof of Theorem 5.2.
Proof of Theorem A.2. The proof is similar to the proof of Theorem 5.4.
In addition, we notice that for any q = 1, 2, 3, i1, . . ., iq ∈ {1, . . ., p}, and for any θ ∈ Θ, the corresponding bound holds, where g → Pm(g) is the Bernstein polynomial operator for a function g with support included in [0, M]: the stated expression for t ≤ M, and zero otherwise. Therefore, we can rely on the same arguments that we used to show that (2) and (3) of Assumption A.2 are satisfied, to show that (4) and (5) of Assumption A.2 must be satisfied as well. This concludes the proof of Proposition 6.5.

For a vector w = (w1, . . ., wd) ∈ R d , we write ∥w∥ = (w1² + ⋯ + wd²)^{1/2} for the Euclidean norm of w on R d . In the case d = 1 we use the notation |·| for the Euclidean norm. For two vectors w, w′ ∈ R d , ⟨w, w′⟩ = w t w′ = Σ_{i=1}^d wi w′i represents the inner product that induces ∥·∥ on R d . Given D ⊂ R d , we write BC(D; S) for the space of real-valued, uniformly bounded functions on D having compact support S ⊂ D. If f ∈ BC(D; S) and f is also continuous, we use the notation CC(D; S) instead of BC(D; S). For f ∈ CC(D; S) we write ∥f∥∞ = sup{|f(h)| : h ∈ D} for the uniform norm on CC(D; S). For vectors w ∈ R d , |w|∞ = max_{i=1,...,d} |wi| denotes the uniform norm on R d .
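The norm relations used later (for instance |w|∞ ≤ ∥w∥ ≤ √d |w|∞ in the proof of Lemma B.1) can be checked numerically; this small sketch uses an arbitrary example vector.

```python
import numpy as np

w = np.array([3.0, -4.0, 1.0])   # arbitrary example vector in R^3
eucl = float(np.linalg.norm(w))  # Euclidean norm ||w||
unif = float(np.max(np.abs(w)))  # uniform norm |w|_inf
inner = float(w @ w)             # <w, w> = ||w||^2
```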

the Bernstein polynomial of the function ϕθ on [0, bm), where bm → ∞ as m → ∞, and we assume that bm = o(m). Thus, for any 0

Lemma B.1. Let C, L < ∞ be some real constants. Consider g : R d → R+ such that g ∈ BC(R d ; S), with S ⊂ B[0; C] and ∥g∥∞ ≤ L. Then, for any i ∈ N+ and any sequence (sj)j∈N+ ∈ G, Σ_{j∈N+} g(si − sj) ≤ L R(d, C, τ). To see this, let C* := max{C, Ĉ} and L* := max{L, L̂}. Using (1) of Assumption 3.1, we have that for any θ ∈ Θ, cθ ∈ BC(R d ; Sθ), where now Sθ ⊂ B[0; C*] and ∥cθ∥∞ ≤ L*, with C* and L* finite constants that are independent of n ∈ N+ and θ ∈ Θ. Thus we can write, for any n ∈ N+, s (n) ∈ Gn and θ ∈ Θ, by Gershgorin's circle theorem, the corresponding bound. For n ≥ N, since we have assumed that the families {cθ : θ ∈ Θ} and {(ĉm,θ) : θ ∈ Θ} have compact supports contained in B[0; C*], we have that gr(n)(s) = 0 for ∥s∥ ≥ C*. Thus, by Gershgorin's circle theorem, under application of Lemma B.2, for n ≥ N, the smallest eigenvalue is strictly greater than zero, P a.s., uniformly in n ≥ N and θ ∈ Θ, and hence we have that infn≥N infθ∈Θ λn(Bn,θ) > 0 and infn≥N infθ∈Θ λn(B̂n,θ) > 0, P a.s.
), with Λ as in Theorem 5.4.

6 Example of application: Generalized Wendland functions

In this section we work in the same setting as in Section 2.2, but we additionally assume that Z is isotropic. Explicitly, for the given family of covariance functions {cθ : θ ∈ Θ}, we assume that there exists a parametric family {φθ : θ ∈ Θ} such that for any θ ∈ Θ and s ∈ R d , cθ(s) = φθ(∥s∥). The family {φθ : θ ∈ Θ} is called the radial version of {cθ : θ ∈ Θ}. We can recycle the notation of Section 3 and easily translate Assumptions 3.1 and 3.2 by considering families of approximations {(φ̂m,θ) : θ ∈ Θ} for {φθ : θ ∈ Θ} on R+. This allows us to readily recover the results of Sections 4 and 5 for isotropic random fields. For the details we refer to Assumptions A.1 and A.2, as well as Theorems A.1 and A.2, in Appendix A. In terms of an explicit family of radial covariance functions, we reconsider the generalized Wendland covariance function which we have already introduced in (1) of Section 1.3. Let Θ ∋ θ := (σ², β). It follows (see Propositions D.7 and D.8 and also consider the proofs of Propositions 3.2 and 3.3), under application of Lemmas B.1, B.2, 4.1, B.5 and Corollary B.4, that there exist constants M1, M2 > 0 (which are independent of n ∈ N+, s (n) ∈ Gn and θ ∈ Θ) such that, P a.s.,