The Annals of Mathematical Statistics

Ante-dependence Analysis of an Ordered Set of Variables

K. R. Gabriel

Abstract

For a set of variables in a given order, $s$th ante-dependence will be said to obtain if each one of the variables, given at least $s$ immediate antecedent variables in the order, is independent of all further preceding variables. If the number of variables is $p$, ante-dependence is of some order between 0 and $p - 1$. 0th ante-dependence and $(p - 1)$st ante-dependence are equivalent to complete independence and to completely arbitrary patterns of dependence, respectively, and are defined irrespective of the ordering of the variables. 1st to $(p - 2)$nd ante-dependence are defined in terms of a specific order only. If $X_1, X_2, \cdots, X_p$ are multivariate normal, $s$th ante-dependence is equivalent to each $X_i$, given $X_{i - 1}, X_{i - 2}, \cdots, X_{i - s}, \cdots, X_{i - s - z}$, being uncorrelated with $X_{i - s - z - 1}, X_{i - s - z - 2}, \cdots, X_2, X_1$ for any non-negative $z$. In other words, the partial correlation of $X_i$ and $X_{i - s - z - k}$, given all the variables $X_{i - 1}, X_{i - 2}, \cdots, X_{i - s - z}$, is zero for all $i, k$ and $z$. The hypothesis that the covariance matrix is such that all the above partial correlations vanish will be denoted by $D_s (s = 0, 1, \cdots, p - 1)$, so that, for the multivariate normal distribution, $D_s$ denotes the hypothesis of $s$th ante-dependence. It is shown that for any set of ordered variables, normal or otherwise, $D_s$ is equivalent to the following correspondence between the regression equations of $X_i$ on all other variables, and on $X_{i - s}, X_{i - s + 1}, \cdots, X_{i - 1}, X_{i + 1}, \cdots, X_{i + s}$ only: the multiple correlations are equal, and the regression coefficients of $X_{i - s}, X_{i - s + 1}, \cdots, X_{i + s}$ are equal in both equations, all other coefficients in the former equation being zero. It is also equivalent to the $(p - s)(p - s - 1)/2$ elements in the upper right (and also lower left) corner of the inverse covariance matrix being zero.
Indeed, any null hypothesis on a set of elements of the inverse covariance matrix may be formulated, and tested, as a hypothesis $D_s$ if the variables can be so ordered as to put the zero elements in the upper right and lower left corners of the inverse. Maximum likelihood estimates are derived under $D_s$ for the normal case. Likelihood ratio tests of any one $D_s$ against any other follow immediately and may be expressed in terms of the sample partial correlations. Exact distributions are not investigated, but for large samples $\chi^2$ approximations are available. Thus a sequence of tests of $D_{p - 2}$ under $D_{p - 1}, D_{p - 3}$ under $D_{p - 2}, \cdots, D_0$ under $D_1$, is obtained which, in effect, forms a breakdown of the large sample test of independence, $D_0$, under the general alternative, $D_{p - 1}$. The assumptions of ante-dependence are clearly analogous to those of Markov processes and autoregressive schemes for time series; the motivation for the study and application of these models is also similar. The present model is more general in that it relaxes the usual autoregression assumptions of equal variances and, more crucial, of equal correlations between all pairs of equidistant variables (distance being meant in terms of the order of the set or time series). This greater generality requires analysis of a sample of observations for the study of ante-dependence, whereas for autoregressive schemes there are methods of analysis based on a single observation of the time series. The ante-dependence models can be generalized to the case of several variables at each stage of the ordering. This would be analogous to the study of multiple time-series. $s$-ante-dependent sets of variables may be generated by $s$ successive summations of independent variables. This may be relevant for some applications of such models. Ante-dependence models might be applicable to observations ordered in time or otherwise. 
Observations on growth of organisms up to each of several ages could be analyzed in such a manner. Where growth is recorded on several dimensions, e.g., height and weight, the analysis would proceed in terms of the multidimensional generalization of the model. Other possible fields of application include batteries of psychological tests increasing in complexity, and data on the successive location of travelling objects. A study of some such applications is now under way.
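
Two statements in the abstract can be checked numerically: that $s$ successive summations of independent variables generate an $s$-ante-dependent set, and that $D_s$ corresponds to $(p - s)(p - s - 1)/2$ zero elements in the upper right corner of the inverse covariance matrix. The following is a minimal sketch, not taken from the paper. It assumes NumPy, works with population covariance matrices rather than samples, and uses arbitrary illustrative variances; the function names `summed_covariance`, `corner_zero_count` and `bandwidth` are hypothetical.

```python
import numpy as np


def summed_covariance(variances, s):
    """Covariance matrix of s successive cumulative sums of independent
    variables with the given (possibly unequal) variances."""
    p = len(variances)
    L = np.tril(np.ones((p, p)))       # one cumulative summation
    A = np.linalg.matrix_power(L, s)   # s summations (identity when s == 0)
    return A @ np.diag(variances) @ A.T


def corner_zero_count(p, s):
    """Number of upper-right corner zeros stated in the abstract for D_s."""
    return (p - s) * (p - s - 1) // 2


def bandwidth(matrix, tol=1e-8):
    """Largest |i - j| at which the matrix has a non-negligible entry."""
    i, j = np.nonzero(np.abs(matrix) > tol)
    return int(np.max(np.abs(i - j)))


# Illustrative values only: p = 6 independent variables with unequal variances.
variances = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 0.8])
p = len(variances)

for s in range(3):
    precision = np.linalg.inv(summed_covariance(variances, s))
    # Entries more than s places above the diagonal form the upper-right corner.
    corner = precision[np.triu_indices(p, k=s + 1)]
    n_zero = int(np.sum(np.abs(corner) < 1e-8))
    print(f"s={s}: bandwidth={bandwidth(precision)}, "
          f"corner zeros={n_zero}, formula={corner_zero_count(p, s)}")
```

In this sketch the inverse covariance matrix has bandwidth exactly $s$, and the variances are deliberately unequal to reflect the abstract's point that the model does not require the equal-variance assumption of the usual autoregressive schemes.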

Article information

Source
Ann. Math. Statist., Volume 33, Number 1 (1962), 201-212.

Dates
First available in Project Euclid: 27 April 2007

https://projecteuclid.org/euclid.aoms/1177704724

Digital Object Identifier
doi:10.1214/aoms/1177704724

Mathematical Reviews number (MathSciNet)
MR145611

Zentralblatt MATH identifier
0111.15604