New multivariate central limit theorems in linear structural and functional error-in-variables models

This paper deals simultaneously with linear structural and functional error-in-variables models (SEIVM and FEIVM), revisiting in this context generalized and modified least squares estimators of the slope and intercept, and some method of moments estimators of the unknown variances of the measurement errors. New joint central limit theorems (CLT's) are established for these estimators in the SEIVM and FEIVM under respective conditions on the explanatory variables that are introduced here for the first time and are, so far, the most general available, and under the existence of four moments of the measurement errors. Moreover, since they are in Studentized forms to begin with, the obtained CLT's are a priori nearly, or completely, data-based, and free of unknown parameters of the distribution of the errors and of any parameters associated with the explanatory variables. In contrast, in the related CLT's in the literature so far, the covariance matrices of the limiting normal distributions are, in general, complicated and depend on various, typically unknown, parameters that are hard to estimate. In addition, the very forms of the CLT's in the present paper are universal for the SEIVM and FEIVM. This extends a previously known interplay between a SEIVM and a FEIVM. Moreover, though the particular methods and details of the proofs of the CLT's in the SEIVM and FEIVM established in this paper are quite different, a unified general scheme for these proofs is constructed for the two models herewith.


Introduction
First, we present linear structural and functional error-in-variables models, assumptions that will be used in each of them, and estimators of unknown parameters of interest under study (cf. respective Sections 1.1-1.3). We then discuss the essence of the main results of this paper in Section 1.4.

Linear structural and functional error-in-variables models
(SEIVM and FEIVM) In the linear structural and functional error-in-variables models (EIVM's) of this paper we observe pairs (y i , x i ) ∈ IR² according to

y i = βξ i + α + δ i ,  x i = ξ i + ε i ,  1 ≤ i ≤ n, n ∈ IN,   (1.1)

where ξ i are unknown explanatory/latent variables, the real-valued slope β and intercept α are to be estimated, and δ i and ε i are unknown measurement error terms/variables. EIVM (1.1) is also known as a measurement error model, or structural/functional relationship, or regression with errors in variables. It is a generalization of the simple linear regression of the form y i = βξ i + α + δ i in that in (1.1) it is assumed that, in addition to the two variables η := βξ + α and ξ being linearly related, now not only η, but also ξ, is observed with the respective measurement errors δ i and ε i . The explanatory variables ξ i are assumed to be independent identically distributed (i.i.d.) random variables (r.v.'s) that are independent of the error terms when we deal with the structural EIVM (SEIVM) (cf. upcoming condition (S2)), and deterministic in the case of the functional EIVM (FEIVM).

Assumptions in SEIVM and FEIVM
Both in the SEIVM and FEIVM versions of (1.1), the following assumptions are made on the error terms. In the SEIVM, we suppose that the explanatory variables {ξ, ξ i , i ≥ 1} obey (S1) and (S2) as follows: (S1) {ξ, ξ i , i ≥ 1} are i.i.d.r.v.'s in the domain of attraction of the normal law (DAN), i.e., there are constants a n and b n , b n > 0, for which (Σ_{i=1}^{n} ξ_i − a_n) b_n^{−1} →D N(0, 1), as n → ∞. (S2) ξ is independent of (δ, ε). Remark 1.1. Further to the definition of DAN in (S1), it is known that a n can be taken as nEξ and b n = n^{1/2} ℓ ξ (n), where ℓ ξ (n) is a slowly varying function at infinity (i.e., ℓ ξ (az)/ℓ ξ (z) → 1, as z → ∞, for any a > 0), defined by the distribution of ξ. Moreover, ℓ ξ (n) = √Var ξ > 0, if Var ξ < ∞, and ℓ ξ (n) ր ∞, as n → ∞, if Var ξ = ∞. Also, ξ has moments of all orders less than 2, and the variance of ξ is positive, but need not be finite. Remark 1.2. One of the several necessary and sufficient conditions for i.i.d.r.v.'s {Z, Z i , i ≥ 1} to be in DAN is commonly associated with O'Brien [19] (for more details see also Remark (iii) in Maller [13], p.194), and it reads as follows: max_{1≤i≤n} Z_i² / Σ_{i=1}^{n} Z_i² →P 0, n → ∞. In Remark (iv) of Maller [13] it is pointed out that this "negligibility" condition has appeared and played an important role in the asymptotic theory of many stochastic models. In addition to the models listed in [13], Z ∈ DAN has also been used for CLT's in a simple linear regression (cf. Maller [12]) and, frequently as an optimal, or nearly optimal, condition on the explanatory variables, for various marginal CLT's in the SEIVM (1.1) (cf. Martsynyuk [14,15,17]).
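O'Brien's negligibility condition in Remark 1.2 is easy to probe numerically. The following sketch is an illustration, not part of the paper; it contrasts a finite-variance sample (in DAN) with a standard Cauchy sample (not in DAN), where a single term tends to dominate the sum of squares.

```python
import numpy as np

rng = np.random.default_rng(0)

def negligibility_ratio(z):
    """O'Brien's ratio max_i z_i^2 / sum_i z_i^2; values near zero are
    consistent with the sample coming from a distribution in DAN."""
    z2 = z ** 2
    return z2.max() / z2.sum()

n = 100_000
# Finite-variance case: N(0, 1) is in DAN, so the ratio should be tiny.
r_normal = negligibility_ratio(rng.standard_normal(n))
# Very heavy tails: the Cauchy law is not in DAN; one term dominates.
r_cauchy = negligibility_ratio(rng.standard_cauchy(n))

print(r_normal, r_cauchy)
```

For the normal sample the ratio is of order (log n)/n, while for the Cauchy sample it stays bounded away from zero as n grows.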
For the deterministic ξ i in the FEIVM (1.1), we assume the following (F1)-(F3) conditions that, in view of Remarks 1.1 and 1.2, are natural companions to (S1).
For some further comments on the introduced assumptions on the explanatory variables in the SEIVM (1.1) and FEIVM (1.1), we refer to Remarks 2.10 and 2.11 of Section 2.2.
The identifiability assumption (1) corresponds to orthogonal regression estimation in (1.1), while (3) is likely to be realistic in many applications (cf. Carroll and Ruppert [1], Carroll et al. [2], Cheng and Van Ness [4] and Fuller [6] for some further discussions along these lines). Under each of the identifiability assumptions in (1)-(3), we are to estimate the slope β, the intercept α, and the respective, typically unknown, error variance λθ or θ, which is denoted throughout by γ for notational convenience.

Estimators for slope β, intercept α and unknown error variances
When (1) is assumed, it has been common to estimate β and α with generalized least squares estimators (GLSE's) β 1n and α 1n that are simply derived under (A) without assuming the finiteness of the fourth error moments (cf., e.g., Section 3.4 in [4]). For estimating γ = θ, a method of moments estimator (MME) γ 1n is a usual one. Provided that S xy ≠ 0, the slope estimator, in particular, is given by

β 1n = (S yy − λS xx )/(2S xy ) + sign(S xy ) { ((λS xx − S yy )/(2S xy ))² + λ }^{1/2},

where sign(S xy ) denotes the sign of S xy . When either (2) or (3) is assumed, modified least squares estimators (MLSE's) for β and α (cf. Section 3.5 in [4]), and MME's for the respective unknown error variances γ = θ and γ = λθ, are available. The MLSE's and MME under (2) assume that S xy ≠ 0 and S yy − λθ > 0, while those under (3) are available provided that S xx − θ > 0.
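For illustration, the GLSE slope under identifiability assumption (1) can be computed as sketched below. The slope formula used is the standard orthogonal-regression-type one from Cheng and Van Ness [4] (equivalent to the sign(S_xy) form quoted above); the companion intercept estimator ȳ − β̂·x̄ and the simulation set-up are assumptions of this sketch, not displays reproduced from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def glse_slope(x, y, lam):
    """GLSE (orthogonal-regression-type) slope under identifiability
    assumption (1): lam = Var(delta)/Var(eps) is assumed known."""
    sxx = np.var(x)
    syy = np.var(y)
    sxy = np.cov(x, y, bias=True)[0, 1]
    t = (syy - lam * sxx) / (2.0 * sxy)
    return t + np.sign(sxy) * np.sqrt(t ** 2 + lam)

# Simulated data from model (1.1): y_i = beta*xi_i + alpha + delta_i,
# x_i = xi_i + eps_i, with parameter values chosen for this example.
n, beta, alpha, lam = 200_000, 2.0, 1.0, 1.0
xi = rng.normal(0.0, 2.0, n)                     # Var(xi) = 4
x = xi + rng.standard_normal(n)                  # eps with Var = 1
y = beta * xi + alpha + rng.standard_normal(n)   # delta with Var = lam*1

b1 = glse_slope(x, y, lam)
a1 = y.mean() - b1 * x.mean()   # standard companion intercept estimator
print(b1, a1)
```

With these population values one can check the probability limit by hand: S_yy → 17, S_xx → 5, S_xy → 8, so the slope formula tends to (17−5)/16 + √((12/16)² + 1) = 2, the true β.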

Introduction to main results
In this paper, we revisit the triples of the estimators in (1.3)-(1.5) in the SEIVM and FEIVM versions of (1.1), and prove two joint central limit theorems (CLT's) for each of the triples (cf. Theorem 2.1), under the respectively introduced assumptions on the explanatory variables that are, to the best of our knowledge, the most general ever used so far in this context (cf. Remarks 2.2, 2.3). As to the conditions (A) and (B) on the error terms here, they seem to be the least restrictive that have been considered in the literature thus far (cf. Remarks 2.2, 2.3, 2.8, 2.9). Further to the special features of our CLT's in Theorem 2.1, these CLT's are in Studentized forms to begin with and, as a result, are automatically nearly, or completely, data-based. Namely, as compared to the related CLT's for (β, α, γ) in the literature, the ones in Theorem 2.1 are a priori free of any unknown parameters associated with the explanatory and error variables (cf. Remarks 2.5 and 2.6).
The CLT's of Theorem 2.1 also extend a previously known interplay between a SEIVM and a FEIVM as in Gleser [8]. This extension is due to a synchronized choice of the respective conditions on the explanatory variables in the SEIVM and FEIVM of the present paper (cf. Remark 2.11). Consequently, the CLT's of Theorem 2.1 are universal in form for the SEIVM (1.1) and FEIVM (1.1) (cf. Remark 2.12).
The idea of establishing the joint CLT's of Theorem 2.1 for the estimators in (1.3)-(1.5) has originated from a wish to extend and unify the Studentized marginal CLT's for each of these estimators that are proved, among other things, for the SEIVM (1.1) in Martsynyuk [14,15,17], and for the FEIVM (1.1) in Martsynyuk [15,16], under nearly the same respective model assumptions as those in Theorem 2.1. When establishing the multivariate CLT's of the present paper, it was important for us to preserve and build on the new assumptions on the explanatory variables that had first been introduced and used in the SEIVM (1.1) and FEIVM (1.1) in [14,15] (cf. Remarks 2.10 and 2.11 on the crucial roles of (S1) and (F1)-(F3) in [14,15]). However, it was even more desirable that the CLT's for (β, α, γ) should be of suitable Studentized forms, like their marginal predecessors, and so that they would also be universal in form both for the SEIVM (1.1) and FEIVM (1.1).
Theorem 2.1 would have remained wishful thinking if not for the auxiliary results in Section 3 that bridge the context of the SEIVM (1.1) and FEIVM (1.1) with recent advances in Studentization of random vectors by a matrix, the generalized domain of attraction of the multivariate normal law (GDAN), and the domain of attraction of the univariate normal law (DAN). For convenient reference in Section 3, we summarize some of these advances in the subsidiary Section 4. Among the auxiliary results of Section 3, the key Theorems 3.1 and 3.2 of Sections 3.1 and 3.2, respectively, namely, a CLT for a multivariate Student statistic that is based on independent but not necessarily identically distributed random vectors that satisfy the Lindeberg condition, and a special characterization of GDAN, may also be of interest beyond the scope of the present paper. Also, the auxiliary CLT's of Section 3.3 are rather versatile, and can also be used to prove multivariate CLT's for estimators other than (1.3)-(1.5) in the SEIVM (1.1) and FEIVM (1.1), and in the respective no-intercept versions of these models, where α is assumed to be zero (cf. Remarks 3.4 and 3.5). Although the particular methods and details of the proofs of Theorem 2.1 for the SEIVM and FEIVM versions of (1.1) are fundamentally different, a unified general scheme of these proofs is constructed for the two models (cf. Sections 3.3 and 3.4).
This paper is based on parts of the author's Ph.D. thesis Martsynyuk [15], written under the supervision of Miklós Csörgő, and on parts of Martsynyuk [14].

Main results with remarks
The (a) and (b) parts of Theorem 2.1, namely the Studentized CLT's for the triples of the estimators in (1.3)-(1.5) in the SEIVM (1.1) and FEIVM (1.1), constitute the main results of this paper.
In the sequel, all vectors are row-vectors. For vectors Z 1 , · · · , Z n and W 1 , · · · , W n , V ZW denotes their sample covariance matrix. For a positive definite matrix A (A > 0), the notation A^{1/2} stands both for the (left) Cholesky and the symmetric positive definite square roots of A. We recall that the (left) Cholesky square root A^{1/2} of A > 0 is the uniquely existing lower triangular matrix with positive diagonal elements such that A^{1/2}(A^{1/2})^T = A. Clearly, it is invertible. As to the symmetric positive definite square root A^{1/2} of a matrix A > 0, the latter exists and satisfies (A^{1/2})² = A. Notation diag(·, · · · , ·) stands for a block-diagonal matrix, where the square matrix blocks on its diagonal are listed in the brackets.
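The two square roots just described can be compared numerically. This minimal sketch (illustrative only; the matrix A is an arbitrary positive definite example) uses NumPy's Cholesky factorization for the triangular root and a spectral decomposition for the symmetric root.

```python
import numpy as np

A = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # a positive definite matrix

# (Left) Cholesky square root: the unique lower triangular L with
# positive diagonal such that L @ L.T = A; invertible by construction.
L = np.linalg.cholesky(A)

# Symmetric positive definite square root via the spectral decomposition
# A = U diag(w) U.T, so A^{1/2} = U diag(sqrt(w)) U.T and (A^{1/2})^2 = A.
w, U = np.linalg.eigh(A)
S = U @ np.diag(np.sqrt(w)) @ U.T

print(np.allclose(L @ L.T, A), np.allclose(S @ S, A))
```

Both factors reproduce A, but only L is triangular; which root is used in a Studentized statistic is immaterial for the limit results, as the text's dual use of the notation A^{1/2} indicates.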
Remark 2.2. In view of Remark 1.1, the assumption ξ ∈ DAN of (S1) in the SEIVM (1.1) is weaker than the following one: (2.7) Indeed, convergence of the average \bar{ξ²}_n = n^{−1} Σ_{i=1}^{n} ξ_i² to the finite positive limit M implies that n^{−1}ξ_n² = \bar{ξ²}_n − (n − 1)n^{−1} · (n − 1)^{−1} Σ_{i=1}^{n−1} ξ_i² → 0, n → ∞, and the latter convergence, via a proof by contradiction, leads to n^{−1} max_{1≤i≤n} ξ_i² → 0, n → ∞, and thus, to (F3). While conditions (2.6) and (2.7) have commonly been used for the CLT's in the SEIVM's and FEIVM's in the literature so far, our (S1) and the group of assumptions (F1)-(F3) are believed to be new respective assumptions for these models. More precisely, (S1) and (F1)-(F3) were first introduced respectively for the SEIVM (1.1) and FEIVM (1.1) in Martsynyuk [14,15] (cf. also Remarks 2.10 and 2.11), and have not yet been used in EIVM's by other authors. As to the conditions on the error terms in (A) and (B) here, they seem to be the least restrictive that have been considered for CLT's thus far (we will elaborate further on assumption (B) in Remarks 2.8, 2.9). Remark 2.3. In the literature, the vectors of the estimators ( β jn , α jn , γ jn ), 1 ≤ j ≤ 3, are known to be √n−asymptotically normal (cf., e.g., Gleser [7,8] for j = 1, Cheng and Van Ness [3] for j = 2 and 3), under the respective identifiability assumption in (1)-(3), (A), and (2.6) and (S2) in the SEIVM, or (2.7) in the FEIVM, and, in the case of [3], also under the condition that ξ, δ and ε are independent normal r.v.'s in the SEIVM, and that δ and ε are independent normal r.v.'s in the FEIVM. In view of Remark 2.2, if (2.6) and (2.7) are not assumed, then the CLT's of Theorem 2.1 appear to be the first of their kind. Otherwise, under (2.6) (in the SEIVM), or (2.7) (in the FEIVM), subject to the above mentioned additional normality conditions in [3] and comments on (B) in the upcoming Remarks 2.8 and 2.9, the CLT's of [3,7,8] follow respectively from those in Theorem 2.1.
For the proof of this statement, we refer to the very end of Section 3.4.
Remark 2.5. Due to their being in Studentized forms to begin with, the CLT's of Theorem 2.1 are a priori free of any unknown parameters (such as moments) of the distribution of (δ, ε) (they depend only on the error moments that are assumed to be known according to the corresponding identifiability assumption in (1)-(3)), and do not contain any parameters associated with ξ in the SEIVM and {ξ i , i ≥ 1} in the FEIVM. In addition, the CLT's of the (b) part of Theorem 2.1 are completely data-based. Hence, the latter CLT's are readily applicable to constructing large-sample approximate confidence regions for (β, α, γ).
Remark 2.6. The a priori Studentized forms of our CLT's, and the corresponding features described in Remark 2.5, make them new even under the stronger conditions in (2.6) and (2.7) that were used in the CLT's of [3,7,8]. Indeed, as opposed to Theorem 2.1, the expression (the same for the SEIVM and FEIVM) for the covariance matrix of the asymptotic normal distribution of ( β 1n , α 1n , γ 1n ) that is due to Gleser [7,8] is complicated, and in addition to the unknown parameters β, m and M , where m and M are as in (2.6) and (2.7), it involves typically unknown cross-moments and moments of order ≤ 4 of the error terms that are hard to estimate from data. Then, in order to be able to estimate the covariance matrix of the CLT in [7] (in the FEIVM), it is additionally assumed that the moments of δ and ε are like those of two independent normal r.v.'s. Consequently, the covariance matrix of the latter CLT becomes much simpler in form and contains only the unknown, but estimable, β, m, M and γ = λθ. As to the respective asymptotic covariance matrices of ( β jn , α jn , γ jn ) in [3], j = 2 and 3, similarly, it is the normality conditions on independent ξ, δ, ε in the SEIVM and independent δ, ε in the FEIVM, required for the CLT's in [3], that result in their simple forms and hence straightforward estimability.
Remark 2.7. Studentized bivariate CLT's for ( β jn , α jn ), ( β jn , γ jn ) and ( α jn , γ jn ), 1 ≤ j ≤ 3, that are similar in form and features to those in (a) and (b) of Theorem 2.1 also hold true. The proofs of such CLT's are based on the auxiliary CLT's in Theorem 3.4 and are like the proofs of (a) and (b) of Theorem 2.1.
Remark 2.8. Condition (B) on the error terms in Theorem 2.1 imposes hardly any restrictions, and is only assumed there for the sake of checking (3.72) that, in particular, implies that in the (a) and (b) parts, the respective matrices V z(j,n,β)z(j,n,β) and V z(j,n, βjn)z(j,n, βjn) are positive definite on sets whose probabilities go to one, as n → ∞. Moreover, along the lines of the proof of (3.72), it is not hard to see that the Studentized bivariate version of Theorem 2.1 for ( β jn , α jn ) does not require assuming (B) at all. Remark 2.9. Under (2.6) and (2.7), matrices are natural estimators for the respective asymptotic covariance matrices of ( β jn , α jn , γ jn ) obtained in [3,7,8] (for the proof in the case of j = 3 see (3.80), (3.87) and (3.89)-(3.91)). These matrices are different from the respective estimators in [3,7,8] that were constructed under the additional normality, or normality like, conditions specified in Remark 2.6, and also work when these conditions fail. These normality conditions also ensured positive definiteness of the asymptotic covariance matrices of ( β jn , α jn , γ jn ) in [3,7,8], 1 ≤ j ≤ 3, but they seem to be more restrictive than our weak assumptions of (B) that guarantee such positivity on account of (3.72) and (3.89)-(3.91).

Interplay between SEIVM and FEIVM
Remark 2.10. We elaborate further on assumption (S1) on the explanatory variables in the SEIVM (1.1). (S1) is inherited from the author's previous work in [14], where it was first introduced for SEIVM's. In [14], which also led to [15,17], the motivations for introducing (S1) into the asymptotic theory of the SEIVM (1.1) amounted to more than just aiming at a generalization of the usual assumptions in (2.6) that had been used in the literature before. From the empirical standpoint, by letting ξ i in (1.1) have an infinite variance (Var ξ = ∞), we make them more dominant over the errors, which have finite variances. This, in turn, renders the observations y i and x i more robust to noise (errors) and thus, more precise. Moreover, from a rigorous mathematical point of view, the condition ξ ∈ DAN is optimal, or nearly optimal, for the marginal CLT's for β jn and α jn in [14,15,17], 1 ≤ j ≤ 3 (cf., e.g., Proposition 1.1 in [14]). In addition, some distinctive features of the SEIVM's under Var ξ = ∞ were discovered (cf., e.g., Corollary 1 and Remarks 5, 6, 7, 9 in [17]).
Remark 2.11. Assumptions (F1)-(F3) on the explanatory variables in the FEIVM (1.1) were first introduced and used in the FEIVM (1.1) in [15]. In [15], for the sake of achieving a strong similarity between the marginal CLT's for β jn , α jn and γ jn in the FEIVM (1.1) of [15] and those in the SEIVM (1.1) of [14], assumptions (F1)-(F3) on the deterministic ξ i were introduced in such a way that, due to Remarks 1.1 and 1.2, they would be natural companions for the DAN condition in (S1) on stochastic ξ i in [14]. An empirical rationale behind allowing lim n→∞ ξ 2 = ∞ as in (F2) is similar to that behind possibly having Var ξ = ∞ as in (S1) (cf. Remark 2.10). Also, (F3) is in some sense optimal for the marginal CLT's for β jn and α jn in [15], 1 ≤ j ≤ 3 (cf. Remark 2.1.8 of [15]). A further similarity of (S1) and (F2) relates to the fact that their respective special cases Var ξ = ∞ and lim n→∞ ξ 2 = ∞ make the SEIVM (1.1) and FEIVM (1.1) behave as if they were the simple regressions y i = βx i + α + δ i (cf. Remarks 1.1.6, 2.1.10 and Sections 1.1.5, 2.1.5 in [15]). Remark 2.12. Between the SEIVM under (2.6) and the FEIVM under (2.7) there is an interplay established by Gleser [8] that yields, in particular, that the CLT's for ( β jn , α jn , γ jn ) as in [3,7], j = 1, 3, that are proved in the FEIVM under (2.7) also hold true in the SEIVM with {ξ, ξ i , i ≥ 1} satisfying (2.6). Similarly, the identity in form of the marginal CLT's for β jn , α jn and γ jn in the FEIVM under (F1)-(F3) in [15] with those in the SEIVM under (S1) of [14] establishes an asymptotic interplay between these two more general models.
The CLT's for ( β jn , α jn , γ jn ) as in Theorem 2.1 and the respective bivariate CLT's as in Remark 2.7 are also universal in form for the latter two models, invariant as to whether the explanatory variables have a deterministic nature, as in the FEIVM, or a stochastic nature, as in the SEIVM, and thus, further contribute to the models' interplay in terms of their asymptotics.

Auxiliary results and proofs of main results
3.1. CLT for a multivariate Student statistic that is based on independent random vectors satisfying the Lindeberg condition

In this section we state and prove Theorem 3.1, a key auxiliary CLT required for the proofs in the FEIVM of this paper. It may also be of independent interest. For random vectors Z 1 , · · · , Z n in IR^d, we introduce a multivariate Student statistic St n (Z(n)) as in (3.1), where V ZZ^{1/2} is either the (left) Cholesky, or the symmetric positive definite, square root of the matrix V ZZ . Hereafter, notations ‖·‖, 1{·} and Z^{(j)} respectively stand for the Euclidean norm in IR^d, an indicator function and the j-th component of a vector Z ∈ IR^d, d ≥ 1. When we write that a (random) matrix converges (in probability) to another (random) matrix of the same size, it means that each entry of the converging matrix goes (in probability) to the corresponding entry of the limiting matrix.
Assume that, as n → ∞, (3.2) holds with some limiting d × d matrix Σ, and that the Lindeberg condition is satisfied, namely, for every ν > 0, Σ_{i=1}^{n} E( ‖Z i (n)‖² 1{‖Z i (n)‖ > ν} ) → 0, as n → ∞. (3.3) Then, for the Student statistic St n (Z(n)) as in (3.1), as n → ∞, St n (Z(n)) →D N(0, I d ). Proof. According to the Lindeberg-Feller theorem (cf., e.g., 2.27 Proposition in [21]). The proof of (3.4) is given first for Z i (n) that are such that the limiting matrix in (3.2) is Σ = I d .

Lindeberg's condition (3.3) implies that the same condition holds true for {Z
as n → ∞ (cf., e.g., respective conclusions (3.6) and (3.7) in [16]). Consequently, and for having (3.4), it suffices to show the convergence in probability of the off-diagonal entries of the matrix (n − 1)V Z(n)Z(n) to the corresponding entries of the matrix Σ = I d , namely, for Then, for (3.6) to hold true, by Theorem 3 in [20] on p.210, it is sufficient to show that for any ν > 0 and some τ > 0, as n → ∞, then (3.7) with any ν > 0 follows from (3.3). As to (3.8), we have For any φ > 0 and sufficiently large n, where, on account of (3.2) with Σ = I d and that Z Hence, (3.8) with τ = 1 holds true. As to the convergence in (3.9), it is valid because (3.3) and the assumption that This completes the proof of (3.6). Now, since (3.6) holds true and The latter follows from (3.2) and Markov's inequality for any 1 ≤ j ≤ d and φ > 0, namely, This also completes the proof of (3.4) for Z i (n) with Σ = I d in (3.2).
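Since display (3.1) is not reproduced above, the following sketch assumes a common form of the multivariate Student statistic, namely √n times the sample mean Studentized by a square root of the sample covariance; this is an assumption of the illustration, not the paper's exact display. It shows numerically why a Theorem 3.1-type limit is standard multivariate normal.

```python
import numpy as np

rng = np.random.default_rng(2)

def student_statistic(Z):
    """Multivariate Student statistic for the rows Z_1,...,Z_n of Z, in
    the assumed form sqrt(n) * L^{-1} Zbar, where L is the (left)
    Cholesky square root of the sample covariance V_ZZ."""
    n = Z.shape[0]
    zbar = Z.mean(axis=0)
    V = np.cov(Z, rowvar=False)        # sample covariance V_ZZ
    L = np.linalg.cholesky(V)
    return np.sqrt(n) * np.linalg.solve(L, zbar)

# Mean-zero rows with a non-trivial covariance C: after Studentization
# the statistic should look like one draw from N(0, I_d).
n, d = 50_000, 3
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 0.5]])
Z = rng.standard_normal((n, d)) @ np.linalg.cholesky(C).T

st = student_statistic(Z)
print(st)
```

Note that the unknown covariance C never enters the statistic: this is exactly the "a priori data-based" feature that Studentization delivers in Theorem 2.1.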
Remark 3.1. Though the CLT of Theorem 3.1 appears to be quite natural, especially in view of the well-known multivariate Lindeberg-Feller CLT (cf., e.g., 2.27 Proposition in [21]) and Theorem 4.1, one essential link for its conclusion, namely, an appropriate version of (4.1) as in (3.4), used to be missing. Our (3.4) is essentially a multivariate extension of a part of Raikov's theorem (cf. Theorem 4 on p.143 in [9]) that amounts to saying that for r.v.'s {X i (n), 1 ≤ i ≤ n, n ≥ 1} that are independent in each row and satisfy the Lindeberg condition, Σ_{i=1}^{n} X i ²(n) →P 1, as n → ∞. In [22], the authors of Theorem 4.1 as in Section 4 of the present paper, with applications of their theorem in mind, pose the question of having (4.1) with some matrix V n as in Theorem 4.1 that would correspond to a case of independent nonidentically distributed random vectors, at least when the latter have finite covariance matrices. Hence, our (3.4) can also be viewed as a partial answer to this question.

Special characterization of the generalized domain of attraction of the multivariate normal law
The purpose of this subsection is to establish a special, convenient characterization of the generalized domain of attraction of the multivariate normal law (GDAN), as in Theorem 3.2. It will enable us to apply Theorem 4.3 to obtain the auxiliary CLT in (a) of Theorem 3.3, which is used to prove (a) of Theorem 2.1 in the SEIVM (1.1). By establishing Theorem 3.2, we also give an example of special vectors Z = (Z (1) , · · · , Z (d) ) for which the fact that Z (j) ∈ DAN for all 1 ≤ j ≤ d characterizes that Z ∈ GDAN. In this example we dispense with the condition that the components Z (j) of the vector Z are identically distributed, as in the related example with spherically symmetric Z at the end of Remark 4.5.
We recall that random vector Z is a full vector, or has a full distribution, if Z, u is a nondegenerate r.v. for all deterministic unit norm vectors u. In the sequel, for Z ∈ IR d , when d ≥ 2, Z (k,k+l) = (Z (k) , Z (k+1) , · · · , Z (k+l) ) denotes a subvector of Z that has all the components of Z starting with Z (k) and ending with Z (k+l) , 1 ≤ k ≤ d − 1, 1 ≤ l ≤ k + l ≤ d. Notation diag(·, · · · , ·) stands for a block-diagonal matrix, where in the brackets square matrix blocks that are on its diagonal are listed.
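For a vector with finite second moments, fullness is equivalent to positive definiteness of its covariance matrix. The following sketch uses that equivalence as a numerical proxy; the sample-covariance check and the tolerance are choices of this illustration, not of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

def looks_full(Z, tol=1e-8):
    """Proxy check of fullness for a finite-variance random vector:
    <Z, u> is nondegenerate for every unit u iff Cov Z > 0, which we
    test via the smallest eigenvalue of the sample covariance."""
    V = np.cov(Z, rowvar=False)
    return np.linalg.eigvalsh(V).min() > tol

n = 10_000
W = rng.standard_normal((n, 2))
Z_full = W                                   # independent components: full
# Second component is an a.s. linear function of the first: degenerate
# in the direction u proportional to (2, -1).
Z_degenerate = np.column_stack([W[:, 0], 2.0 * W[:, 0] + 1.0])

print(looks_full(Z_full), looks_full(Z_degenerate))
```

For the degenerate vector the sample covariance is (numerically) singular, reflecting a direction u with Var⟨Z, u⟩ = 0, so fullness fails.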
(3.10) For the vector Z̃ formed by all the components, if any, of Z whose second moments exist, assume that Z̃ is full. (3.11)
Then the following two statements are equivalent: Proof. That (a) implies (b) is explained in Remark 4.5. Conversely, assume that (b) holds true. By equivalence of the (a) and (b) parts of Theorem 4.2, the proof of (a) of this lemma reduces to verifying convergence in (b) of Theorem 4.2 for suitably chosen matrices B n , provided that vector Z is full. Clearly, we are concerned with the case of d ≥ 2 only.
If E(Z (j) ) 2 < ∞ for all 1 ≤ j ≤ d, then Z = Z is full by (3.11). Hence, Cov Z > 0 and, due to a weak law of large numbers applied to each entry of Suppose now that, without loss of generality, First, note that such vector Z is full and thus, Theorem 4.2 is applicable. Indeed, for any unit norm scalar vector u, , and, if |u (1) | + · · · + |u (m) | > 0, on account of (3.10), Var Z, u = ∞, while when u (1) = · · · = u (m) = 0, Var Z, u > 0 by (3.11). Next, we introduce the blockdiagonal matrix where ℓ (1) (n) ր ∞, · · · , ℓ (m) (n) ր ∞ are slowly varying functions at infinity that correspond to Z (j) ∈ DAN (cf. Remark 1.1), namely, we have for 1 ≤ j ≤ m, (3.14) Note that Cov Z (m+1, d) −1/2 in (3.13) is well-defined on account of (3.11). For B n as in (3.13), we are to verify convergence in the (b) part of Theorem 4.2, namely, that where matrix E n consists of the following matrix blocks: with some absolute constants in (3.17) depending on j and k, but not on n, and As n → ∞, on account of (3.14) and (4.7) of Remark 4.4, e jj n P → 1 for all 1 ≤ j ≤ m, (3.19) and, due to (3.10) and the fact that while, clearly, This concludes the proof of the convergence in (3.15) with B n as in (3.13). Finally, the third type of vectors satisfying (3.10), (3.11) and having all their components in DAN consists of vectors Z with E(Z (j) ) 2 = ∞ for all 1 ≤ j ≤ d. Such vectors Z are full, since for any deterministic unit norm vector u, on account of (3.10), Var Z, u = ∞. Put where ℓ (j) (n) are as in (3.14), and ℓ (j) (n) ր ∞, n → ∞, for all 1 ≤ j ≤ d. For all elements e jk of matrix E n in (3.15) defined now via B n of (3.22), by (4.7) and (3.10), as n → ∞. Hence, convergence in probability in (3.15) with B n of (3.22) is valid.
Remark 3.2. The relationship in (3.10) may follow from independence of Z (j) and Z (k) and the existence of their respective first moments. According to the Cauchy-Schwarz inequality, (3.10) is also satisfied when, e.g., Z (1) ∈ DAN with E(Z (1) )² = ∞ (from Remark 1.1, E(Z (1) )^{2−a} < ∞ for any a ∈ (0, 2]), while E(Z (j) )^{2+∆} < ∞ for some ∆ > 0, for all 2 ≤ j ≤ d. This subsection builds on [14,15,16,17] and is written with the genuinely multivariate case of d > 1 in mind. This is understood from, but not spelled out in, some of the conditions and notations that are to appear (cf., e.g., (3.25)). However, simply omitting the arguments that are suitable and used only for the case of d > 1 makes the results of this subsection also valid for the case of d = 1. Let Using (3.24) and vectors of constants b 1 , · · · , b d in IR^7, whose components are such that if

Auxiliary CLT's
Var ξ = ∞, in the SEIVM, lim n→∞ ξ 2 = ∞, in the FEIVM, then b (1) we define vectors In the SEIVM (1.1), we will also consider random vectors ζ 0 and J 0 that respectively generate ζ i and J i , namely, Next, for a special case of the multivariate Student statistic in (3.1), we prove Theorem 3.3, the first of the two auxiliary theorems of this subsection.  CovJ i > 0, if lim n→∞ ξ 2 < ∞ and/or b 1 | > 0.
Further, from (26) of the proof of Lemma 6 in [17], by noticing that condition (24) with ζ 0 , b j in place of ζ 0 , b in that lemma is now a part of (3.29), we have that ζ 0 , b j ∈ DAN, for all 1 ≤ j ≤ d. Finally, on account of (b) implying (a) in Theorem 3.2, J 0 ∈ GDAN.
Thus Assume that lim n→∞ ξ 2 = ∞ and |b . It was shown earlier in the proof that Z i (n) of (3.31) obey the conditions of Theorem 3.1. Combining (3.4) and (4.5) with A n = Σ > 0 therein, where matrix Σ is as in (3.2), we conclude that and hence, also, that 39) as n → ∞. Next, we observe that in view of (3.4), or, equivalently, in  In general, condition (3.29) is checked on a case-by-case basis, depending on the vectors b 1 , · · · , b d in hand. The first line of (3.29) amounts to saying that r.v. ζ 0 , u 1 b 1 + · · · + u d b d is nondegenerate for any vector u = (u 1 , · · · , u d ), u = 1. In particular, on account of the proof of Lemma 6 in [17], the latter statement holds true when Var ξ < ∞ and the first two components of vector u 1 b 1 + · · · + u d b d are not simultaneously zero for any u = (u 1 , · · · , u d ), u = 1, namely, when the nonrandom vectors b On using Theorem 3.3, we are to study another Studentized partial sum in Theorem 3.4, the second auxiliary theorem of this subsection. This Studentized partial sum is a prototype for the main terms in the expansions for the Studentized estimators of (β, α, γ) as in Theorem 2.1. Let c 1 , c 2 , · · · , c d ∈ IR 5 be nonzero vectors of constants, and   Proof of the (a) part of Theorem 3.4. We have, where vector R(n) = √ n K(n) − √ n J, and, according to (41) of [17] and (3.25) for b j , components R (j) (n) of vector R(n) are such that (3.50) Note that B n is well-defined on account of (A) and (3.29). Interpreting (3.48) as a degenerate weak convergence in (IR d , · ), by Theorem 4.1 from the Appendix, for (3.48) it suffices to show that, as n → ∞, Convergence in (3.51) follows from the fact that J 0 is as in (3.10) and (3.11) (this was shown in the proof of the (a) part of Theorem 3.3), and is argued the same way as convergence in (3.15) with therein matrices B n as in (3.12) or (3.13). 
In this regard, we also note that the correspondence of B n in (3.50) to B n in (3.12) or (3.13) is seen by noticing that if Var ξ = ∞ and |b

N (0, 1), n → ∞. As to (3.52), when Var ξ < ∞ and/or b (2) 1 = 0, it is a direct consequence of (3.47). Otherwise, namely under Var ξ = ∞ and |b 1 | > 0, it is due to (3.47) and observing that This completes the proof of (3.48). For establishing (3.49), it suffices to prove that, as n → ∞, and with matrix B n given by (3.50). Since then (3.53) follows from the convergence in (3.51) via (4.5). Similarly, (3.54) is a consequence of (4.5) and The rest of the proof is concerned with verifying (3.55). Defining vector we have Due to (49) of Lemma 8 in [17] (condition (24) in [17] with b j as in (3.45) in place of e of (37) in [17] is satisfied on account of (3.25) and (3.29)), as n → ∞, . The latter and the Cauchy-Schwarz inequality applied to each off-diagonal entry of matrix V Q(n)Q(n) yield that V Q(n)Q(n) P → O (zero matrix), and therefore, with matrix B n of (3.50), as n → ∞. Now, in view of (3.51), (3.57) and (3.59), convergence in (3.55) is a consequence of which, in turn, follows from (3.51), (3.59) and the Cauchy-Schwarz inequality applied to each of the entries of the converging matrix product in (3.60). While the latter implication is easily seen when Var ξ < ∞ and/or b 61) which follows from (3.53) in [16] and (3.25). In place of B n of (3.50), we choose here matrix where matrix Σ > 0 is as in (3.2) that reads for Z i (n) of (3.31), while matrix D n is defined in (3.37). Then, convergence in (3.51) with B n of (3.62) is due to having (3.31), (3.4) and (3.40). As to the validity of (3.52) in the context of the FEIVM (1.1), with B n of (3.62), it is based on (3.61) and convergence which amounts to (3.28) in [16]. Thus, (3.48) in the FEIVM is verified via the appropriate versions of (3.51) and (3.52). Now, we are to argue (3.49) in the present context. In fact, we only need to show (3.59) with B n of (3.62). By (3.54) in [16] and (3.25), for Q i (n) as in (3.56), for all 2 ≤ j ≤ d.  Remark 3.5. 
We note that, via [14,15,16,17], Theorem 3.3 can be adapted to the no-intercept versions of the SEIVM (1.1) and FEIVM (1.1), where the intercept α is known to be zero, and respective CLT's for estimators appropriately based on the vector (y, x, y^2, xy, x^2) can thus be established.

Proof of Theorem 2.1
The proof of Theorem 2.1 is given for the MLSE's β̂_3n and α̂_3n, and the MME γ̂_3n only, simultaneously for the SEIVM (1.1) and FEIVM (1.1). The corresponding CLT's for (β̂_jn, α̂_jn, γ̂_jn), j = 1 and 2, can be established in similar ways, and thus the respective proofs are omitted here.
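The estimators β̂_3n, α̂_3n and γ̂_3n treated in this proof are defined earlier in the paper and are not reproduced here. Purely as a hedged, self-contained illustration of slope and intercept estimation in a simulated SEIVM of form (1.1), the following sketch uses the classical method-of-moments (Deming) slope estimator under a known error-variance ratio λ = Var δ / Var ε; this is a stand-in for intuition only, not the paper's MLSE, and all numerical choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta, alpha = 2.0, 1.0           # true slope and intercept
xi = rng.normal(0.0, 3.0, n)     # i.i.d. latent explanatory variables (structural case)
delta = rng.normal(0.0, 1.0, n)  # measurement error in the response
eps = rng.normal(0.0, 1.0, n)    # measurement error in the covariate

y = beta * xi + alpha + delta    # observed response
x = xi + eps                     # observed covariate

# Classical Deming estimator with known ratio lam = Var(delta)/Var(eps).
lam = 1.0
s_xx = np.var(x)
s_yy = np.var(y)
s_xy = np.cov(x, y, bias=True)[0, 1]
beta_hat = ((s_yy - lam * s_xx)
            + np.sqrt((s_yy - lam * s_xx) ** 2 + 4.0 * lam * s_xy ** 2)) / (2.0 * s_xy)
alpha_hat = np.mean(y) - beta_hat * np.mean(x)
```

Note that the naive least squares slope s_xy / s_xx would be biased toward zero here (attenuation), which is exactly why error-in-variables estimators such as those of Theorem 2.1 are needed.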
Proof of the (a) part of Theorem 2.1. From the proofs of Theorems 1-3 in [17] and of the (a) parts of Theorems 2.2 and 2.3 in [16], and also by following the lines of the proofs of Theorems 1.1.2c and 2.1.2c in [15], we have the required conclusions a.s.; the only alternative would mean that δ − βε would have to be discretely distributed. However, this would violate (B).

Proof of the
and hence, also (3.87).
Proof related to Remark 2.3. In view of (3.83), it suffices to show that, under (2.6) (in the SEIVM) and (2.7) (in the FEIVM), and subject to the additional normality model conditions in [3] and the comments on (B) in Remarks 2.8 and 2.9, the (a) part of Theorem 2.1 implies the corresponding √n-asymptotic normality of (β̂_jn, α̂_jn, γ̂_jn), 1 ≤ j ≤ 3, as in [3,7,8]. Due to similar arguments, we consider here the case j = 3 only.
with the respective forms of the limiting covariance matrix in the SEIVM and in the FEIVM,
where matrix D^{-1} A D^{-1} reduces to the covariance matrix of the corresponding asymptotically normal distribution obtained in [3], subject to the additional normality model conditions assumed therein and spelled out in Remark 2.3, which replace our weaker assumption (B) that was used to conclude (3.91) here (for a summary on (B), we refer to Remarks 2.8 and 2.9).

Appendix: some results on Studentization of random vectors by a matrix and the generalized domain of attraction of the multivariate normal law
This appendix, collecting some well-known results and recent advances on Studentization of random vectors by a matrix and on the generalized domain of attraction of the multivariate normal law (GDAN), is provided here for convenience of reference when reading Section 3 of this paper. Motivated by the importance of having matrix Studentized CLT's for random vectors converging in distribution to a spherically symmetric random vector Z, i.e., one such that all its Euclidean inner products ⟨Z, u⟩ with deterministic vectors u of unit Euclidean norm have the same distribution, coinciding with that of each single component of Z, Vu, Maller and Klass [22] established a rather general result of this nature. Their result, which follows herewith, is essentially a general recipe of matrix "Slutskying" for random vectors: if (4.1) and (4.2) hold, where Z is a spherically symmetric random vector in IR^d, then (4.3) obtains.

Remark 4.1. We note that (4.1) amounts to saying that each entry of matrix C_n V_n C_n^T converges in probability to the corresponding entry of I_d. As noted in [22], (4.1) implies that V_n > 0 on sets whose probabilities converge to one, as n → ∞. Hence, the Cholesky and the symmetric positive definite square roots of V_n and, consequently, V_n^{-T/2} in (4.3) are well-defined.
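As a numerical sanity check of the matrix-Studentization idea behind the result of [22], the following sketch simulates i.i.d. mean-zero vectors with a deliberately non-spherical covariance and verifies that the matrix-Studentized sums are approximately N(0, I_d). The choice of normalizer V_n = Σ_i Z_i Z_i^T and all concrete numbers are illustrative assumptions of this sketch, not quoted from [22].

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, reps = 3, 2_000, 500
cov = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 1.5]])  # non-spherical covariance: raw sums are not N(0, I)
L = np.linalg.cholesky(cov)

def inv_sqrt_sym(V):
    """Inverse of the symmetric positive definite square root of V."""
    w, U = np.linalg.eigh(V)
    return U @ np.diag(w ** -0.5) @ U.T

t_stats = np.empty((reps, d))
for r in range(reps):
    Z = rng.standard_normal((n, d)) @ L.T  # i.i.d. mean-zero vectors with Cov = cov
    S = Z.sum(axis=0)                      # centered partial sum
    V = Z.T @ Z                            # quadratic normalizer (one common choice)
    t_stats[r] = inv_sqrt_sym(V) @ S       # matrix-Studentized statistic

# Empirically, t_stats should look like N(0, I_d): mean near 0, covariance near I.
emp_cov = np.cov(t_stats.T)
```

The point of the simulation is that the Studentizer "undoes" the unknown covariance automatically, which mirrors why the CLT's of this paper can be data-based and free of unknown parameters.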
Remark 4.2. This is a technical remark related to Theorem 4.1 and frequently used in Section 3. It is noted in [22] that the conclusion of Theorem 4.1 continues to hold if, instead of (4.1) and (4.2), one assumes (4.4) and (4.5), as n → ∞. It is known from [13] that if Z ∈ GDAN, that is, if (4.6) holds, then E‖Z‖^α < ∞ for all 0 ≤ α < 2, and a_n can be taken as nEZ, while the norming matrix B_n is invertible for large enough n and may be chosen to be symmetric. Also, B_n → 0, as n → ∞. As a general fact, it is also known that (4.6) implies that Z is full (cf. Lemma 3.3.3 in [18]).
Below we review some known equivalent characterizations of GDAN that are found in Theorem 1.1 of Maller [13] and are most relevant to the aims of this paper; therein, b_n is such that Σ_{i=1}^n (Z_i − EZ)/b_n →_D N(0, 1), as n → ∞.
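To illustrate the role of the scalar normalizer b_n in such characterizations, here is a hedged sketch for the easy finite-variance case, where b_n = (n u^T Σ u)^{1/2} normalizes the projection of the partial sums onto a fixed unit vector u (the covariance matrix, the vector u, and all sample sizes below are illustrative choices of this sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, reps = 2, 1_000, 2_000
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])
L = np.linalg.cholesky(cov)
u = np.array([0.8, -0.6])  # a fixed unit vector, ||u|| = 1

# For mean-zero Z with finite covariance cov, b_n = sqrt(n * u^T cov u)
# normalizes the projected sums: <sum(Z_i), u> / b_n is approximately N(0, 1).
b_n = np.sqrt(n * u @ cov @ u)
vals = np.array([(rng.standard_normal((n, d)) @ L.T).sum(axis=0) @ u / b_n
                 for _ in range(reps)])
```

In the general GDAN setting b_n need not be of order √n, which is precisely what makes the infinite-variance cases treated in Section 3 delicate.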
Remark 4.5. It is easy to see that if Z ∈ GDAN, then each component Z^{(j)} of Z is in DAN, 1 ≤ j ≤ d. Indeed, the (c) part of Theorem 4.2 implies the convergence in (4.8). The latter convergence is known as Lévy's necessary and sufficient condition for Z^{(j)} to be in DAN (cf. [11]). On the other hand, assuming that all Z^{(j)} ∈ DAN is not alone sufficient to guarantee that Z ∈ GDAN. Indeed, suppose that EZ = 0 and all Z^{(j)} are identically distributed and belong to DAN. Then, from (4.7), with b_n chosen to be the same for all the sequences {Z_i^{(j)}, i ≥ 1}, 1 ≤ j ≤ d, one arrives at (4.9). Further, from Remark (ii) on p. 193 of [13], via p. 236 of [5], (4.9) is equivalent to (4.8) with ‖Z‖ in place of |Z^{(j)}|, and such a form of (4.8) does not imply (c) of Theorem 4.2. However, the class of spherically symmetric random vectors Z is pointed out in Remark (ii) on p. 217 of [13] as an exception in this regard, on account of each projection of Z having the same distribution, coinciding with that of each single component Z^{(j)} of Z, 1 ≤ j ≤ d.
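Lévy's condition mentioned above can be probed empirically: for a distribution with finite variance the truncated-second-moment ratio x^2 P(|X| > x) / E[X^2 1{|X| ≤ x}] tends to 0 as x → ∞, whereas for a regularly varying tail of index α ∈ (0, 2) it tends to the positive constant (2 − α)/α (for classical Pareto with α = 3/2, the limit is 1/3). The sketch below estimates this ratio for a standard normal and a Pareto(3/2) sample; the thresholds and sample size are arbitrary choices of this illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

def levy_ratio(sample, x):
    """Empirical x^2 P(|X| > x) / E[X^2 1{|X| <= x}]."""
    a = np.abs(sample)
    tail = np.mean(a > x)
    trunc = np.mean(np.where(a <= x, sample ** 2, 0.0))
    return x ** 2 * tail / trunc

normal = rng.standard_normal(n)
pareto = rng.pareto(1.5, n) + 1.0  # classical Pareto: P(X > x) = x^{-3/2}, x >= 1

r_normal = levy_ratio(normal, 4.0)   # near 0: finite variance, so X is in DAN
r_pareto = levy_ratio(pareto, 30.0)  # bounded away from 0: infinite variance
```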
For a multivariate Student statistic based on i.i.d. random vectors {Z, Z_i, i ≥ 1} in GDAN, by Maller [13] and Vu, Maller and Klass [22], the following CLT holds true.

Remark 4.6. As noted in the Remarks of [22], the proof of Theorem 4.3 proceeds via Theorem 4.1, by combining (4.6) and the (b) part of Theorem 4.2, which presents a special case of (4.1). It is pointed out in [22] that, though Studentization in Theorems 4.1 and 4.3 can be performed with both the Cholesky and the symmetric positive definite square roots, one's preference would depend on the problem at hand. It is also conjectured in [22] that, for the purpose of transforming Σ_{i=1}^n (Z_i − EZ) so as to have approximately a spherically symmetric distribution, the symmetric positive definite square root is likely the better choice in small samples.
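The two square-root choices discussed in [22] can be compared directly. In the small sketch below (the matrix V and vector S are arbitrary illustrative data), both factorizations reproduce V, and both Studentized vectors share the invariant norm (S^T V^{-1} S)^{1/2}, yet the resulting vectors themselves differ, which is why the choice can matter for approximate spherical symmetry in small samples.

```python
import numpy as np

V = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # an arbitrary positive definite normalizer

# Cholesky square root: V = C C^T with C lower triangular.
C = np.linalg.cholesky(V)

# Symmetric positive definite square root: V = R R with R = R^T.
w, U = np.linalg.eigh(V)
R = U @ np.diag(np.sqrt(w)) @ U.T

S = np.array([2.0, -1.0])   # an arbitrary vector to be Studentized

t_chol = np.linalg.solve(C, S)  # C^{-1} S
t_sym = np.linalg.solve(R, S)   # R^{-1} S

# Both Studentized vectors have Euclidean norm (S^T V^{-1} S)^{1/2},
# but they differ as vectors (they are related by an orthogonal transformation).
```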