Central limit theorems in linear structural error-in-variables models with explanatory variables in the domain of attraction of the normal law

Linear structural error-in-variables models with univariate observations are revisited for studying modified least squares estimators of the slope and intercept. New marginal central limit theorems (CLT's) are established for these estimators, assuming the existence of four moments for the measurement errors and that the explanatory variables are in the domain of attraction of the normal law. The latter condition for the explanatory variables is used for the first time, and is so far the most general in this context. It is also optimal, or nearly optimal, for our CLT's. Moreover, since the obtained CLT's are in Studentized and self-normalized forms to begin with, they are a priori nearly, or completely, data-based, and free of unknown parameters of the joint distribution of the error and explanatory variables. Consequently, they lead to a variety of readily available, or easily derivable, large-sample approximate confidence intervals (CI's) for the slope and intercept. In contrast, in related CLT's in the literature so far, the variances of the limiting normal distributions are, in general, complicated and depend on various, typically unknown, moments of the error and explanatory variables. Thus, the corresponding CI's for the slope and intercept in the literature, unlike those of the present paper, are available only under some additional model assumptions.


Introduction
In the linear error-in-variables model (EIVM) of this paper we observe pairs $(y_i, x_i) \in \mathbb{R}^2$ according to

$$y_i = \beta\xi_i + \alpha + \delta_i, \qquad x_i = \xi_i + \varepsilon_i, \tag{1}$$

where $\xi_i$ are unknown explanatory/latent variables, the real-valued slope β and intercept α are to be estimated, and $\delta_i$ and $\varepsilon_i$ are unknown measurement error terms/variables, $1 \le i \le n$, $n \in \mathbb{N}$. EIVM (1) is also known as a measurement error model, or structural/functional relationship, or regression with errors in variables. It is a generalization of the simple linear regression of form $y_i = \beta\xi_i + \alpha + \delta_i$ in that in (1) it is also assumed that the two variables η := βξ + α and ξ are linearly related, and now both η and ξ are observed with respective measurement errors $\delta_i$ and $\varepsilon_i$. In this paper we deal with the so-called structural EIVM (SEIVM), the model where the explanatory variables $\xi_i$ are assumed to be independent identically distributed (i.i.d.) random variables (r.v.'s) that are independent of the error terms (cf. (C) below). The case of (1) with α known to be zero is distinguished in the literature as the model without intercept. Convenient notations that are introduced in this paper (cf. (5), (6)) allow us to study both the no-intercept model and the model with unknown α simultaneously.
Apart from making assumptions on the distribution of (ξ, δ, ε), to ensure identifiability of unknown parameters such as, for example, β and α in model (1), it is common in the literature to make use of some side conditions in this regard, usually as conditions on the matrix Γ of (2) in (A). There are only a few frequently used identifiability assumptions, and here we deal with two of them that read as follows:

Var δ = λθ and cov(δ, ε) = µ are known, while Var ε = θ is unknown; (3)

Var ε = θ and cov(δ, ε) = µ are known, while Var δ = λθ is unknown. (4)
For further use throughout, for real-valued variables $\{u_i, 1 \le i \le n\}$ and $\{v_i, 1 \le i \le n\}$, we put with constant

$$c = \begin{cases} 0, & \text{if intercept } \alpha \text{ is known to be zero},\\ 1, & \text{if intercept } \alpha \text{ is unknown.} \end{cases} \tag{6}$$
We note that the estimators in (7) and (8) coincide in form with the respective maximum likelihood estimators for β and α that are derived and studied in the model (1) under the assumption that vector (ξ, δ, ε) is normally distributed.
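As a rough illustration of how moment-corrected least squares estimators can be computed under side conditions like (3) or (4), the following Python sketch may help. The displays (5), (7) and (8) are not reproduced in this text, so the forms used here are assumptions: the cross-moment `s` (centered when c = 1, uncentered when c = 0), the slope (S_yy − Var δ)/(S_xy − µ) under (3), the slope (S_xy − µ)/(S_xx − Var ε) under (4), and the intercept c(ȳ − x̄ · slope). The function names and the simulated parameter values are hypothetical.

```python
import numpy as np

def s(u, v, c=1):
    """Sample cross-moment; c=1 centers at the sample means, c=0 does not.
    This is an ASSUMED form of the notation in (5)."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return np.mean((u - c * u.mean()) * (v - c * v.mean()))

def modified_ls(y, x, c=1, var_delta=None, var_eps=None, mu=0.0):
    """Moment-corrected slope and intercept (ASSUMED forms, see lead-in).

    Pass var_delta when Var(delta) is known, as in (3),
    or var_eps when Var(eps) is known, as in (4).
    """
    if (var_delta is None) == (var_eps is None):
        raise ValueError("supply exactly one of var_delta or var_eps")
    if var_delta is not None:
        beta = (s(y, y, c) - var_delta) / (s(x, y, c) - mu)
    else:
        beta = (s(x, y, c) - mu) / (s(x, x, c) - var_eps)
    alpha = c * (np.mean(y) - beta * np.mean(x))
    return beta, alpha

# Simulated structural model: y = beta*xi + alpha + delta, x = xi + eps.
rng = np.random.default_rng(0)
n, beta, alpha = 200_000, 2.0, 1.0
xi = rng.normal(3.0, 2.0, n)               # latent explanatory variable
y = beta * xi + alpha + rng.normal(0.0, 1.0, n)
x = xi + rng.normal(0.0, 1.0, n)

b1, a1 = modified_ls(y, x, var_delta=1.0)  # side condition of type (3)
b2, a2 = modified_ls(y, x, var_eps=1.0)    # side condition of type (4)
print(b1, a1, b2, a2)
```

Both corrections subtract the known error moment from the corresponding sample moment, which is what removes the bias that plain least squares would incur in this simulated setting.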
In this paper, we revisit the estimators in (7) and (8), and prove several central limit theorems (CLT's) for each of them (cf. Theorems 1-3, and Corollary 1), under the distribution-free DAN assumption in (B) on the explanatory variables that, to the best of our knowledge, is the most general ever used so far in this context (cf. Remark 2). Moreover, according to Theorem 1, it is also optimal for the CLT's therein, and is nearly optimal for the CLT's in Theorems 2 and 3 (cf. Remark 3). As to the condition (A) on the error terms here, it seems to be the least restrictive that has been considered in the literature thus far.
Further to the special features of our CLT's in Theorems 1-3, all these CLT's are in Studentized or self-normalized forms to begin with and hence are automatically nearly, or completely, data-based. Namely, as compared to the CLT's for β and α in the literature, Theorems 1-3 are a priori free of any unknown parameters of the distribution of (ξ, δ, ε) (cf. Remarks 3, 4), and, consequently, the corresponding large-sample approximate confidence intervals for the slope β and intercept α are readily available, or easily derivable as in our main Theorem 4 (cf. the corresponding subsection right below Remark 8 for details).
Throughout Section 2, we pay special attention to the SEIVM's (1) when Var ξ = ∞ (as allowed by (B) in view of Remark 1). Distinctive features of such models are obtained in Remarks 5-7 (cf. also Remark 9) that are seen to underpin our informal observation: the impact of the errors with finite variances in $x_i$ of (1) becomes automatically negligible as compared to that of the explanatory variables with Var ξ = ∞, and such SEIVM's are then close in spirit to, and behave like, the simple linear regression models $y_i = \beta x_i + \alpha + \delta_i$.
All the CLT's of this paper have been strongly inspired and influenced by recent advances in DAN via Studentization and self-normalization that are summarized in Csörgő, Szyszkowicz and Wang (2004). These developments prompted us to enrich the traditional two-moment space of the explanatory variables that has been used so far for CLT studies in SEIVM's (1) by allowing $\xi_i$, for the first time, to be simply in DAN with possibly infinite variance. For the use of DAN in some other regression models, we refer to Maller (1981) and Remark (iv) in Maller (1993).

Main results
We establish Studentized and self-normalized CLT's for each of the estimators in (7) and (8) in Theorems 1-3, and, together with the CLT's of Corollary 1 and the confidence intervals of Theorem 4, they constitute the main results of this paper.
Theorem 1. Assume that the intercept α in (1) is known to be zero, $\{\xi, \xi_i, i \ge 1\}$ are i.i.d.r.v.'s, E|ξ| < ∞, and (A) and (C) hold true. Put (9) Then for j = 1 and 2, (B) is equivalent to any one of the following CLT's: as n → ∞,

Theorem 2. Assume (A)-(C). Let Then, for j = 1 and 2, the CLT's in the (a) and (b) parts of Theorem 1 hold true, and also,

Theorem 3. Assume (A)-(C). For j = 1 and 2, let where U (j, n), u i (j, n) and u i (j, n) are as in (9) and (10). Then, for j = 1 and 2, as n → ∞,

Remark 2. It follows from Cheng and Tsai (1995) that, under (A), 0 < Var ξ < ∞, (C), independence of δ and ε, and $E\,\xi^4 < \infty$, β jn and α jn are $\sqrt{n}$-asymptotically normal, for j = 1 and 2. In Theorem 1.2.1 of Fuller (1987), among other things, the same conclusion is derived for β 2n and α 2n under the condition that (ξ, δ, ε) is normally distributed with a positive definite diagonal covariance matrix. While Theorems 1-3 seem to present the first CLT's for SEIVM's (1) if Var ξ = ∞ (as allowed by (B) in view of Remark 1), they also imply the just mentioned CLT's for β jn and α jn when Var ξ < ∞. Indeed, under the conditions of the specified CLT's in the literature, using the arguments from the proof of upcoming Corollary 1, the expressions $U^{-2}(j,n)\sum_{i=1}^{n}\big(u_i(j,n) - \bar{u}(j,n)\big)^2/(n-1)$ and $\sum_{i=1}^{n}\big(v_i(j,n) - \bar{v}(j,n)\big)^2/(n-1)$ from the respective (a) parts of Theorems 1 and 3, j = 1, 2, can be seen to converge in probability to positive constants that are the variances of the asymptotic normal distributions of the corresponding estimators obtained in Cheng and Tsai (1995) and Fuller (1987).
Remark 3. Due to their Studentized and self-normalized forms, the CLT's of Theorems 1-3 are invariant with respect to the distribution of (ξ, δ, ε) satisfying (A)-(C), and are strikingly free of any unknown parameters of this distribution (they depend only on the error moments that are assumed to be known in the identifiability assumptions (3) or (4)). In addition, (11) of Theorem 2 and the (b) part of Theorem 3 present completely data-based CLT's, while β is the only unknown parameter appearing in the respective normalizers $\big(\sum_{i=1}^{n}(u_i(j,n) - \bar{u}(j,n))^2/(n-1)\big)^{-1/2}$, $\big(\sum_{i=1}^{n}u_i^2(j,n)\big)^{-1/2}$ and $\big(\sum_{i=1}^{n}(v_i(j,n) - \bar{v}(j,n))^2/(n-1)\big)^{-1/2}$ of the CLT's of Theorem 1 that, according to Theorem 2, also hold true for the case of unknown α, as well as of the CLT's of the (a) part of Theorem 3. Consequently, large-sample approximate confidence intervals for β and α are readily available from (11) of Theorem 2 and the (b) part of Theorem 3, while those that follow from (a) and (b) of Theorem 1 are easily derivable. For the expressions for, and further discussions on, all these confidence intervals, we refer to the corresponding subsection, right below Remark 8. We also note that while (B) is optimal for the CLT's of Theorem 1 in the case of the no-intercept version of model (1), this condition is also optimal for the model (1) with an unknown intercept for the main terms in the expansions for the Studentized and self-normalized β jn and α jn as in the CLT's of Theorems 2 and 3 (cf. respective proofs of Theorems 2 and 3, proof of Lemma 8, and conclusion of Lemma 7).
Remark 4. On account of their a priori Studentized and self-normalized forms, and also due to their respective features described in Remark 3, the CLT's of Theorems 1-3 appear to be also new when Var ξ < ∞ (a special case of (B) in view of Remark 1). Indeed, as opposed to Theorems 1-3, in the CLT's for β jn and α jn in Cheng and Tsai (1995), j = 1 and 2, that are proved under Var ξ < ∞ (cf. Remark 2 for details), the expressions for the variances of the asymptotic normal distributions of β jn and α jn are complicated and involve typically unknown moments of order ≤ 4 of the error terms that are hard to estimate from data, in addition to the unknown parameters β, E ξ and Var ξ. Then, in order to be able to estimate the variances of the therein derived CLT's, it is additionally assumed that the errors δ and ε are normally distributed. Consequently, the variances of the latter CLT's become simpler in form and contain only the unknown, but estimable, β, E ξ, Var ξ and λθ (or θ). To handle similar difficulties with estimating the respective asymptotic variances of β 2n and α 2n in Theorem 1.2.1 of Fuller (1987), the condition that (ξ, δ, ε) is normally distributed and has a positive definite diagonal covariance matrix is used. Consistent estimators for the respective asymptotic variances of β jn and α jn that are proposed in the mentioned works are different from the expressions $U^{-2}(j,n)\sum_{i=1}^{n}u_i^2(j,n)/n$ and $\sum_{i=1}^{n}\big(v_i(j,n) - \bar{v}(j,n)\big)^2/(n-1)$, respectively taken from Theorems 2 and 3. When Var ξ < ∞, due to (57) with $\eta_i(n) = u_i(j,n)$ and $\eta_i(n) = v'_i(j,n)$, (55), (58), (60), (61), (64), (68) and (69), the latter expressions appear to be the first consistent estimators for the just mentioned variances simply under (A).
In Theorems 1-3, the rates of convergence to normality of β jn and α jn are not apparent. For the sake of explicitly displaying these rates, we introduce the following direct consequence of Theorems 1-3.
is a typically unknown slowly varying function at infinity as in Remark 1 that converges to infinity when Var ξ = ∞, and equals a positive constant when Var ξ < ∞, while $c_j$ and $d_j$ are positive constants.
Remark 5. From Corollary 1, β jn are seen to be $\sqrt{n}\,\ell_\xi(n)$-asymptotically normal estimators of β. In this regard we note that when Var ξ = ∞, the degree of precision of β jn increases as compared to the case Var ξ < ∞. This effect rhymes well with our empirical expectation in that, intuitively, by letting $\xi_i$ in (1) have infinite variance, we make them more dominant over the errors with finite variances. This, in turn, renders the observations $y_i$ and $x_i$ more robust to noise (errors) and thus more precise. As to the estimators α jn, according to Corollary 1, they are $\sqrt{n}$-asymptotically normal, regardless of whether Var ξ is finite or infinite.

Remark 6. We observe that if Var ξ = ∞, then (3) and (4), as well as any other identifiability conditions, are unnecessary for constructing consistent estimators for β and α. For example, using similar arguments to those in (60) and (61), it can be shown that $S_{yy}/S_{xy}$ and $S_{xy}/S_{xx}$ are consistent estimators for β. The existence of these consistent estimators implies that β is identifiable. The latter fact, when (δ, ε) has a normal distribution, can also be concluded from Reiersøl (1950), according to which β is identifiable if and only if ξ is not normally distributed.
As to consistent estimators for α under Var ξ = ∞, one has $\bar{y} - \bar{x}\,S_{yy}/S_{xy}$ and $\bar{y} - \bar{x}\,S_{xy}/S_{xx}$.
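The estimators of Remark 6 can be tried out numerically with the following minimal Python sketch. It assumes the unknown-intercept case (c = 1) with centered sample moments, which is an assumption about the notation in (5), and it uses a Student t distribution with 2 degrees of freedom for ξ as a standard example of a law with infinite variance that still lies in the domain of attraction of the normal law. The function names and parameter values are hypothetical, and convergence in this regime can be slow, so the output is only indicative.

```python
import numpy as np

def s(u, v):
    """Centered sample cross-moment (ASSUMED form of S_uv for c = 1)."""
    return np.mean((u - u.mean()) * (v - v.mean()))

def uncorrected_estimators(y, x):
    """Slope estimators S_yy/S_xy and S_xy/S_xx with matching intercepts."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    sxx, sxy, syy = s(x, x), s(x, y), s(y, y)
    slopes = (syy / sxy, sxy / sxx)
    intercepts = tuple(y.mean() - x.mean() * b for b in slopes)
    return slopes, intercepts

rng = np.random.default_rng(1)
n, beta, alpha = 200_000, 2.0, 1.0
xi = rng.standard_t(2, size=n)             # infinite variance, yet in DAN
y = beta * xi + alpha + rng.normal(0.0, 0.5, n)
x = xi + rng.normal(0.0, 0.5, n)

slopes, intercepts = uncorrected_estimators(y, x)
print(slopes, intercepts)
```

No error variance enters the estimators here, which reflects the remark that identifiability conditions such as (3) and (4) are not needed when Var ξ = ∞.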
Remark 7. Condition Var ξ = ∞ can also be related to one of the frequently used identifiability assumptions for SEIVM's (1) that reads as follows: where c is as in (6). In the case of Var ξ < ∞, the coefficient $k_\xi$ plays a key role in the large sample theory of regression with errors in variables (1). In particular, $k_\xi$ adjusts the ordinary least squares estimator $S_{xy}/S_{xx}$ of the simple linear regression $y_i = \beta x_i + \alpha + \delta_i$ for consistency in (1) (that holds under (A) with µ = 0 in (2), (C) and 0 < Var ξ < ∞) as follows: $k_\xi^{-1} S_{xy}/S_{xx}$. Now, defining $k_\xi$ of (13) to be 1 if Var ξ = ∞, we have $k_\xi^{-1} S_{xy}/S_{xx} = S_{xy}/S_{xx}$, where the latter expression is one of the two proposed estimators for β under Var ξ = ∞ in Remark 6 that, in turn, coincides with the ordinary least squares estimator and does not require an adjustment via $k_\xi$ in (1) any more. The following view of the SEIVM (1) under Var ξ = ∞ may shed light on this phenomenon. In $x_i$ of (1), the impact of the error terms with finite variances is negligible as compared to that of the explanatory variables with Var ξ = ∞, and the model becomes close in spirit to, and behaves like, the simple linear regression $y_i = \beta x_i + \alpha + \delta_i$. We also note that if Var ξ < ∞, then the reliability ratio $k_\xi$ usually has to be estimated from prior information for the sake of further use in inference in EIVM's. However, under Var ξ = ∞, no estimation of $k_\xi := 1$ is necessary.
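The adjustment described in Remark 7 can be seen in a small simulation. The sketch below assumes the standard reliability ratio k_ξ = Var ξ / (Var ξ + Var ε) for the unknown-intercept case; display (13) is not reproduced in this text, so this form, like the variable names and parameter values, is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta, alpha = 200_000, 2.0, 1.0
var_xi, var_eps = 1.0, 1.0

xi = rng.normal(0.0, np.sqrt(var_xi), n)          # finite-variance latent variable
y = beta * xi + alpha + rng.normal(0.0, 1.0, n)   # uncorrelated errors (mu = 0)
x = xi + rng.normal(0.0, np.sqrt(var_eps), n)

# ASSUMED reliability ratio: share of Var(x) that comes from xi.
k_xi = var_xi / (var_xi + var_eps)

# Ordinary least squares slope of y on the error-prone x.
ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Dividing by k_xi undoes the attenuation toward zero.
adjusted = ols / k_xi
print(k_xi, ols, adjusted)
```

With these values the unadjusted slope settles near k_ξ β = 1 rather than β = 2, while the adjusted slope is close to β; as k_ξ approaches 1 the two coincide, which matches the convention k_ξ := 1 for Var ξ = ∞.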
Remark 8. The present paper constitutes a part of Martsynyuk (2004), where, among other things, in the same context and spirit, the author studies weak/strong consistency and asymptotic normality of least squares estimators for the slope and intercept, as well as of methods of moments estimators for the error variances, under (3) and (4), and yet another identifiability condition that assumes that the matrix Γ of (2) is known at least up to an unknown multiple θ = Var ε.
As to the problem of proving new similarly featured CLT's for estimating (β, α) under the same model assumptions as those used in Theorems 1-3, namely under (A)-(C), we again refer to Martsynyuk (2004). The author's Ph.D. thesis Martsynyuk (2005), among other things, also extends the just mentioned contributions of Martsynyuk (2004) regarding the SEIVM's to their traditional companions, the functional EIVM's (1), where the explanatory variables $\xi_i$ are assumed to be deterministic.

Confidence intervals for slope β and intercept α
Abbreviations LSA and CI stand respectively for large-sample approximate and confidence interval, while z γ/2 denotes the 100(1 − γ/2) th percentile of the standard normal distribution, 0 < γ < 1.
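For readers who want to compute the percentile z_{γ/2} numerically, a minimal Python sketch using only the standard library follows. The helper `normal_ci` is a generic "estimate ± z · scale" interval and is hypothetical; it is not the paper's CI's (14)-(16), whose normalizers are not reproduced here.

```python
from statistics import NormalDist

def z_upper(gamma):
    """100(1 - gamma/2)th percentile of the standard normal law."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - gamma / 2.0)

def normal_ci(estimate, scale, gamma):
    """Generic large-sample 1 - gamma interval: estimate -/+ z * scale.

    `scale` stands for whatever data-based normalizer the relevant CLT
    supplies; it is a placeholder, not a quantity defined in the text.
    """
    z = z_upper(gamma)
    return estimate - z * scale, estimate + z * scale

z = z_upper(0.05)
lo, hi = normal_ci(2.0, 0.1, 0.05)
print(z, lo, hi)
```

For γ = 0.05 this gives the familiar value of about 1.96.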
In the SEIVM's (1) studied under (3) or (4), LSA CI's seem to be the only source of CI's for the slope β and intercept α. In order to work out completely data-based LSA CI's for β and α from the corresponding CLT's in the literature, additional conditions of normality for the errors alone, or together with those on the explanatory variables, have been used (cf. Remark 4). In contrast, on account of Theorems 1-3, computable LSA CI's for β and α are now available under the very general distribution-free assumptions in (A)-(C) (cf. Remark 3), and also, for the first time, under Var ξ = ∞, as a special case of (B). Thus, for j = 1 and 2, the completely data-based CLT's of (11) of Theorem 2 and (b) of Theorem 3 imply readily available respective LSA 1 − γ CI's for β and α, 0 < γ < 1, as follows:

Moreover, under the usual assumption that Var ξ < ∞, (14) and (15) (3) and (4), in addition to the CI of (14), another two new LSA CI's for β are also within reach. Namely, these are the LSA CI's for β from the Studentized and self-normalized CLT's in (a) and (b) of Theorem 1 that, according to Theorem 2, also hold true for the SEIVM (1) with an unknown intercept α. In these CLT's, β is left unestimated in the respective normalizers, j = 1 and 2, as opposed to the corresponding CLT's of (11) of Theorem 2 that are used to obtain (14). In Theorem 4, we obtain such CI's only under the identifiability assumption (3) (case j = 1). The case when that of (4) is being assumed can be handled similarly.
Theorem 4. Assume (A) and (C) and that $E|\xi|^{8/3} < \infty$, which, in turn, implies (B). Then, both for the no-intercept and unknown intercept versions of model (1), from the (a) and (b) parts of Theorem 1 with j = 1, respective LSA 1 − γ CI's for β are where for l = 1 and 2, with and

Individual and, in the case of the CI's for β, also comparative performances of the obtained CI's in (14), (15), and (16), together with their corresponding analogues in terms of β 2n, are to be further investigated, and are a subject of the author's ongoing research.
Remark 9. Further to our discussions in Remark 7 on the reliability ratio $k_\xi$ defined in (13), we note that this coefficient $k_\xi$ has also played a key role in the literature so far in determining reasonably accurate LSA CI's for β and α in SEIVM's (1). In particular, liabilities that some of the LSA CI's for β in SEIVM's (1) under Var ξ < ∞ may suffer due to the so-called Gleser-Hwang effect (cf. Gleser and Hwang (1987)) are reasonably negligible in SEIVM's if the reliability ratio $k_\xi$ ($k_\xi < 1$) is far enough from zero (cf. Gleser (1987) for details). Though Gleser (1987) only deals with a specific LSA CI for β, assuming in this regard that the ratio of the uncorrelated variances is known, the just mentioned main conclusion is likely to be true for other available LSA CI's for β, and those for α in the SEIVM (1), when Var ξ < ∞. In particular, it is desirable to rigorously support a reasonable belief that a big enough $k_\xi$ will lead to accurate enough CI's in (14)-(16), as well as in the analogues of (16) in terms of β 2n, in the sense of having a negligible Gleser-Hwang effect. The common-sense idea behind Gleser (1987) is that if $k_\xi = 0$, then Var ξ = 0 and SEIVM (1) becomes degenerate, i.e., (1) reduces to $y_i = E\,\xi\,\beta + \alpha + \delta_i$ and $x_i = E\,\xi + \varepsilon_i$, where the explanatory variables do not vary any more, and thus it becomes impossible to fit a unique straight line through the data points. As opposed to the latter degenerate model, in SEIVM's (1) under Var ξ = ∞, the explanatory variables are so well spread that they dominate the error terms in the sense that, according to our extended definition of $k_\xi$ in Remark 7, $k_\xi := 1$. Hence, it is only natural to conjecture that the Gleser-Hwang effect with regard to LSA CI's for β and α disappears for such SEIVM's that in Remark 7 were seen to behave like the simple regression $y_i = x_i\beta + \alpha + \delta_i$.

Some auxiliary results
In this subsection we state some well-known results on DAN as Lemmas 1-4, give a simple alternative proof in the context of this paper for the first part of Lemma 4 on a characterization of DAN, and also establish a companion characterization as Lemma 5. Further developments in Section 3.2 leading to the proofs of the main results of Section 2 are based on these lemmas.
Hereafter, abbreviation WLLN stands for the Kolmogorov weak law of large numbers.
One of the several necessary and sufficient conditions for {Z, Z i , i ≥ 1} to be in DAN is commonly associated with O'Brien (1980) as, e.g., in Giné, Götze and Mason (1997) (for more details see also Remark (iii) in Maller (1993), p.194), and it reads as follows.
The following result was rediscovered by Maller (1981), and is essentially a variation of Theorems 4 and 5 on pp. 143-144 in Gnedenko and Kolmogorov (1954).
The Giné, Götze and Mason (1997) fundamental characterization of DAN via Studentized or self-normalized partial sums can be stated as follows.
Then, conditions Z ∈ DAN and E Z = a are equivalent to any one of the following CLT's:

The first part of the following Lemma 4, which is due to Maller (1981), says that the DAN class of r.v.'s is closed under multiplication. Its second part amounts to a converse. Lemma 4 is applied in Maller (1981) to prove the asymptotic normality of the regression coefficient in a linear regression when the error variance is not necessarily finite. The proof of Lemma 4 in Maller (1981) is quite technical, and is based on checking the classical conditions of Theorem 2 on p. 128 in Gnedenko and Kolmogorov (1954) guaranteeing similar convergence in distribution to that in (B) for suitably chosen constants $b_n$. In the present context, we present a new, simpler and shorter proof of the first part of Lemma 4 under the additional assumption for V ∈ DAN therein that $E\,V^4 < \infty$. The latter assumption rhymes with our conditions $E\,\delta^4 < \infty$ and $E\,\varepsilon^4 < \infty$ in (A), while assumption U ∈ DAN of the first part of Lemma 4 coincides with our condition (B) for ξ. The converse part of Lemma 4 is stated below without a proof.
Proof. We prove here only the first part of Lemma 4, assuming additionally to V ∈ DAN that $E\,V^4 < \infty$.
If $E\,U^2 < \infty$, then, since U and V are independent and nondegenerate (both are in DAN), we have Var Suppose now that $E\,U^2 = \infty$. First, without loss of generality, we assume that $E\,V^2 = 1$ and prove the following key observation: Since U ∈ DAN, then U − E U ∈ DAN and, combining Lemma 3 and (3.7) of Giné, Götze and Mason (1997), one of the key results of that paper, we have For any ε > 0, on account of independence of U and V, and (ii), i.e., we have (i). Furthermore, without loss of generality, we can assume that E U = 0, since when E U ≠ 0, it is easy to see that Indeed, if (U − E U)V ∈ DAN, then Lemma 2, (i) and the fact that $E\,U^2 = \infty$ yield, as n → ∞, where the slowly varying function ℓ(n) ↗ ∞ is such that ( Hence, on account of (iv) and (v), with ℓ(n) from (iv), as n → ∞, Continuing the proof when E U = 0 and $E\,U^2 = \infty$, by Lemma 1, one needs to verify that or, on account of (i), that The latter, in turn, easily follows from Markov's inequality, independence of U and V and (ii), as follows: for every ε > 0,

Remark 10. Further to Lemma 4, from the Remark on p. 183 in Maller (1981) we learn that if U ∈ DAN, V ∈ DAN and $E\,V^2 < \infty$, then → N(0, 1), n → ∞, with the same slowly varying function at infinity ℓ(n), where, mutatis mutandis, ℓ(n) is as in Remark 1, i.e., accordingly featured for $\{U, U_i, i \ge 1\}$. A simple proof of this fact under the additional assumption that $E\,V^4 < \infty$ amounts to (iv), where $E\,V^2 = 1$.
The results of Lemma 4 will usually be coupled with those of the next one.
(b) Since E U 2 < ∞ and also E|U V | < ∞, then E|(U + V )(−U )| < ∞, and via applying the (a) part of Lemma 5 to U + V and −U , we conclude that V ∈ DAN .
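The content of Lemma 3 and of the first part of Lemma 4 can be explored with a small Monte Carlo experiment. The Python sketch below assumes U follows a Student t law with 2 degrees of freedom (infinite variance, in DAN, mean zero) and V is an independent normal variable with all moments finite; these choices, the sample sizes and the function name are hypothetical. It computes Studentized means of U and of the product UV and reports how often they fall within ±1.96, which should be roughly 0.95 if the normal approximation is reasonable at this sample size.

```python
import numpy as np

def studentized_means(samples, a=0.0):
    """Row-wise sqrt(n) * (sample mean - a) / sample standard deviation."""
    n = samples.shape[1]
    return np.sqrt(n) * (samples.mean(axis=1) - a) / samples.std(axis=1, ddof=1)

rng = np.random.default_rng(3)
reps, n = 2_000, 500

u = rng.standard_t(2, size=(reps, n))       # U: infinite variance, E U = 0
v = rng.normal(1.0, 1.0, size=(reps, n))    # V: independent of U, E V^4 finite

t_u = studentized_means(u)                  # Studentized sums of U (Lemma 3)
t_uv = studentized_means(u * v)             # product UV, with E(UV) = 0 (Lemma 4)

coverage_u = np.mean(np.abs(t_u) <= 1.96)
coverage_uv = np.mean(np.abs(t_uv) <= 1.96)
print(coverage_u, coverage_uv)
```

The experiment only illustrates the direction "DAN implies the Studentized CLT"; it says nothing about the converse parts of the lemmas, and the finite-sample coverage need not equal 0.95 exactly.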

Auxiliary results and proofs of main results
In the sequel, all vectors are row-vectors, and $\langle \cdot, \cdot \rangle$ stands for the Euclidean inner product of two vectors. If Z is a d-dimensional vector, then $Z^{(j)}$ is its jth component, while $Z^{(k,k+l)} = (Z^{(k)}, Z^{(k+1)}, \cdots, Z^{(k+l)})$ is a subvector of Z that has all the components of Z starting with $Z^{(k)}$ and ending with $Z^{(k+l)}$.

The proofs of the main results of Section 2 require some auxiliary results. First, in Lemmas 6 and 7, and Corollary 2, we will study Studentized and self-normalized partial sums that are based on i.i.d.r.v.'s $\{\langle \zeta_i, b \rangle, 1 \le i \le n\}$, where $b \in \mathbb{R}^7$ is a nonzero vector of constants and with constant c as in (6) and m := E ξ. Such sums are the respective prototypes of the Studentized and self-normalized β jn as in (a) and (b) of Theorem 1 in the no-intercept version of model (1), j = 1 and 2. In (1) with an unknown intercept, the Studentized and self-normalized partial sums that are based on $\{\langle \eta_i(n), d \rangle, 1 \le i \le n\}$ play the same role, where and a nonzero vector of constants $d \in \mathbb{R}^5$ is such that The results in Lemma 8 and Corollary 3 for such partial sums will also be applied to derive (11) of Theorem 2 and Theorem 3 for α jn, j = 1 and 2. Moreover, the use of all the auxiliary results in this section can go beyond the immediate needs of this paper (cf. Remark 11). At the end of this subsection, we also prove Corollary 1 and obtain the CI's of Theorem 4. Introduce vector

Lemma 6. Assume (A)-(C). When $b^{(1)} = b^{(2)} = 0$, assume additionally that Then, as n → ∞,
Suppose now that $|b^{(1)}| + |b^{(2)}| > 0$. Then, on account of the first part of Lemma 4 and the fact that ξ ∈ DAN if and only if ξ − c m ∈ DAN, where, due to the fact that Γ of (2) is positive definite, (27) implies (26). Otherwise, that are now both from DAN. Two conditions of (a) of Lemma 5 are left to be verified. First, from the finiteness of E ξ and the fourth error moments, and independence of ξ and (δ, ε), it is seen that If Var ξ = ∞, then the second assumption of the (a) part of Lemma 5, i.e., that $P(\langle \zeta_0, b \rangle = 0) \ne 1$, is automatically satisfied, since from $|b^{(1)}| + |b^{(2)}| > 0$, it follows that $\mathrm{Var}\,\langle \zeta_0, b \rangle = \infty$. It is left to be shown that $\mathrm{Var}\,\langle \zeta_0, b \rangle > 0$ under assuming Var ξ < ∞ and $|b^{(1)}| + |b^{(2)}| > 0$. Consider the covariance matrix and $A^T$ denote the transpose of matrix A. By performing a straightforward multiplication, it can be verified that (30) is the product of three 7 × 7 block diagonal matrices, namely, where O is a zero matrix of an appropriate size, $I_3$ is a 3 × 3 identity matrix and Γ is as in (2). Since Var ξ > 0 (ξ ∈ DAN), $b^{(1,2)} \ne 0$ ($|b^{(1)}| + |b^{(2)}| > 0$) and Γ is positive definite by (A), then, on using (32), we conclude that

Corollary 2. Assume (A)-(C), and, when $b^{(1)} = b^{(2)} = 0$, also (24). Then, as n → ∞, where $\ell_\xi(n)$ is a slowly varying function at infinity as in Remark 1.
Proof of Theorem 1. The proof is due to Lemma 7 and the following representations: When e (1) = e (2) = 0, assume additionally (24), with e in place of b. Then, as n → ∞, Proof. On account of (22), where vectors ζ i and e are as in (20) and (37), c is from (6) and term R i (n) is If intercept α is known to be zero, i.e., c = 0, then the Studentized and self-normalized partial sums in (25) with vector e in place of b, and respectively those in (38) coincide in view of (39). Therefore, the CLT's of Lemma 8 amount to those of Lemma 6.
Suppose now that α is unknown, i.e., c = 1. First, we will show that √ n R(n) = o P (1) , if Var ξ < ∞ and/or e (1) = e (2) = 0, with slowly varying function at infinity ℓ ξ (n) as in Remark 1. For the summands in from the CLT for δ, (B) and Remark 1, as n → ∞, we conclude √ n δ (ξ − m) and, similarly, (44) This proves (41) that, combined with (34) and (35) Now, in view of the first CLT in (25) of Lemma 6, (39) and (45), to complete the proof of the Studentized CLT in (38), it suffices to show that, as n → ∞, where and, as a consequence of (39), By (34), (35) and the Cauchy-Schwarz inequality, to show (46), it suffices to prove that On using the Cauchy-Schwarz inequality again, (49) follows from the following statements for the corresponding summands in (47) and (48): as n → ∞, Similarly, (50) holds true on account of having where we have applied WLLN, the CLT for δ and the Marcinkiewicz-Zygmund law of large numbers for (ξ i − m) 2 , where E|(ξ i − m) 2 | 1/2 < ∞. (51) is obtained in the same manner. All (52)-(54) are handled similarly, and easily result from the Cauchy-Schwarz inequality and the WLLN under (A). This completes the proof of (46), and hence also that of the first CLT in (38). The latter CLT implies that, as n → ∞, which combined with the Cauchy-Schwarz inequality proves that The Studentized CLT in (38) and (55) lead to the second, self-normalized CLT in (38).
Corollary 3. Let all the assumptions of Lemma 8 be satisfied. Then, as n → ∞, with slowly varying function at infinity ℓ ξ (n) as in Remark 1, and vector e of (37).
Remark 11. Lemmas 7, 8 and Corollaries 2, 3 are rather versatile and, apart from the needs of this paper, can also be applied to establish Studentized and self-normalized marginal CLT's for other estimators that are appropriately based on the vector (y, x, S yy , S xy , S xx ) in the context of the SEIVM (1) (cf., e.g., such CLT's for the weighted least squares estimators for β and α, and for methods of moments estimators for the error variances λθ and θ proved in Martsynyuk (2004)).
Proof of (a) and (b) of Theorem 1 under the conditions of Theorem 2. In view of Theorem 1, we only need to argue that (a) and (b) of Theorem 1 also hold true in the model (1) with an unknown intercept, provided (A)-(C) are assumed. This is a consequence of Lemma 8 and the representations in (36) for j = 1 and 2, where now u i (j, n) = η i (n), d j , with d 1 = (0, 0, 1, −β, 0) and d 2 = (0, 0, 0, 1, −β) that satisfy (22), and with vectors e 1 and e 2 corresponding to e as in (37) that are equal to b 1 and b 2 as specified in the proof of Theorem 1.
Proof of the (a) part of Theorem 3. Below we consider the case of α 2n only, as the respective CLT for α 1n can be proved in a similar way.
Proof of Corollary 1 for α jn . Results from the (a) part of Theorem 3, (64), (68) and (57) with η i (n) = v i (j, n) that are as in the proof of the (a) part of Theorem 3.
Proof of Theorem 4. Below we derive (16) for k = 1 only that corresponds to the Studentized CLT in the (a) part of Theorem 1 with j = 1 that, according to Theorem 2, also holds true in the SEIVM (1) with an unknown intercept. The proof for k = 2 is similar.
Consider the set On account of the CLT in the (a) part of Theorem 1 with j = 1, P (C 1n (β)) → 1 − γ, n → ∞.