The Annals of Statistics

Estimation in a Multivariate "Errors in Variables" Regression Model: Large Sample Results

Leon Jay Gleser



In a multivariate "errors in variables" regression model, the unknown mean vectors $\mathbf{u}_{1i}: p \times 1, \mathbf{u}_{2i}: r \times 1$ of the vector observations $\mathbf{x}_{1i}, \mathbf{x}_{2i}$, rather than the observations themselves, are assumed to follow the linear relation: $\mathbf{u}_{2i} = \alpha + B\mathbf{u}_{1i}, i = 1,2,\cdots, n$. It is further assumed that the random errors $\mathbf{e}_i = \mathbf{x}_i - \mathbf{u}_i, \mathbf{x}'_i = (\mathbf{x}'_{1i}, \mathbf{x}'_{2i}), \mathbf{u}'_i = (\mathbf{u}'_{1i}, \mathbf{u}'_{2i})$, are i.i.d. random vectors with common covariance matrix $\Sigma_e$. Such a model is a generalization of the univariate $(r = 1)$ "errors in variables" regression model which has been of interest to statisticians for over a century. In the present paper, it is shown that when $\Sigma_e = \sigma^2I_{p+r}$, a wide class of least squares approaches to estimation of the intercept vector $\alpha$ and slope matrix $B$ all lead to identical estimators $\hat{\alpha}$ and $\hat{B}$ of these respective parameters, and that $\hat{\alpha}$ and $\hat{B}$ are also the maximum likelihood estimators (MLE's) of $\alpha$ and $B$ under the assumption of normally distributed errors $\mathbf{e}_i$. Formulas for $\hat{\alpha}, \hat{B}$ and also the MLE's $\hat{U}_1$ and $\hat{\sigma}^2$ of the parameters $U_1 = (\mathbf{u}_{11}, \cdots, \mathbf{u}_{1n})$ and $\sigma^2$ are given. Under reasonable assumptions concerning the unknown sequence $\{\mathbf{u}_{1i}, i = 1,2,\cdots\}, \hat{\alpha}, \hat{B}$ and $r^{-1}(r + p)\hat{\sigma}^2$ are shown to be strongly (with probability one) consistent estimators of $\alpha, B$ and $\sigma^2$, respectively, as $n \rightarrow \infty$, regardless of the common distribution of the errors $\mathbf{e}_i$. When this common error distribution has finite fourth moments, $\hat{\alpha}, \hat{B}$ and $r^{-1}(r + p)\hat{\sigma}^2$ are also shown to be asymptotically normally distributed. 
Finally, large-sample approximate $100(1 - \nu)\%$ confidence regions for $\alpha, B$ and $\sigma^2$ are constructed.
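
The abstract states no formulas, so the following is only a minimal sketch of the simplest special case, $p = r = 1$ with $\Sigma_e = \sigma^2 I_2$, using the classical orthogonal regression (total least squares) solution. The function name `orthogonal_regression` and the returned tuple layout are illustrative assumptions, not notation from the paper. The slope is taken from the eigenvector of the centered cross-product matrix belonging to its largest eigenvalue, and the variance estimate from its smallest eigenvalue; the factor $r^{-1}(r + p) = 2$ mentioned in the abstract is applied to obtain the consistent variance estimate.

```python
import math


def orthogonal_regression(x1, x2):
    """Orthogonal (total least squares) fit of x2 = alpha + b * x1 for p = r = 1.

    Assumes equal error variance in both coordinates (Sigma_e = sigma^2 * I).
    Returns (alpha_hat, b_hat, sigma2_mle, sigma2_consistent).
    """
    n = len(x1)
    if n != len(x2) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")

    # Sample means and centered sums of squares and cross-products.
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    if s12 == 0:
        raise ValueError("slope is not determined when the cross-product sum is zero")

    # Eigen-structure of the 2 x 2 matrix [[s11, s12], [s12, s22]] in closed form.
    root = math.sqrt((s22 - s11) ** 2 + 4 * s12 ** 2)
    b_hat = (s22 - s11 + root) / (2 * s12)   # slope from the leading eigenvector
    alpha_hat = m2 - b_hat * m1              # fitted line passes through the means
    lam_min = ((s11 + s22) - root) / 2       # smallest eigenvalue

    sigma2_mle = lam_min / (2 * n)           # divisor n * (p + r) with p = r = 1
    sigma2_consistent = 2 * sigma2_mle       # rescaled by (r + p) / r = 2
    return alpha_hat, b_hat, sigma2_mle, sigma2_consistent
```

For example, `orthogonal_regression([0, 1, 2, 3], [1, 3, 5, 7])` recovers the exact line with intercept 1 and slope 2 and a zero variance estimate, since the points are collinear. The general multivariate case would require a full eigen-decomposition of the $(p + r) \times (p + r)$ cross-product matrix and is not attempted here.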

Article information

Ann. Statist., Volume 9, Number 1 (1981), 24-44.

First available in Project Euclid: 12 April 2007


Primary: 62H10: Distribution of statistics
Secondary: 62E20: Asymptotic distribution theory
62F10: Point estimation
62F25: Tolerance and confidence regions
62H25: Factor analysis and principal components; correspondence analysis
62P15: Applications to psychology
62P20: Applications to economics [See also 91Bxx]

Keywords: Errors in variables; linear functional equations; factor analysis; multivariate linear regression; least squares; generalized least squares; orthogonally invariant norm; maximum likelihood; strong consistency; asymptotic distributions; approximate confidence regions; principal components


Gleser, Leon Jay. Estimation in a Multivariate "Errors in Variables" Regression Model: Large Sample Results. Ann. Statist. 9 (1981), no. 1, 24--44. doi:10.1214/aos/1176345330.
