December, 1963 Efficient Utilization of Non-Numerical Information in Quantitative Analysis General Theory and the Case of Simple Order
Robert P. Abelson, John W. Tukey
Ann. Math. Statist. 34(4): 1347-1369 (December, 1963). DOI: 10.1214/aoms/1177703869

## Abstract

Suppose a single contrast $y = \sum c_j y_j$, where $\sum c_j = 0$, is to be tested as a basis for detecting differences among unknown parameters $\mu_j$, where $y_j = \mu_j + \epsilon_j$, and the $\epsilon_j$ are independent and normally distributed with mean zero and variance $\sigma^2$. Write $\mu_j = \alpha + \beta x_j$. Then the problem is to detect $\beta \neq 0$. If $\sum x_j = 0$, and $\sum x^2_j = 1$, the noncentrality of $y$, referred to its standard deviation, is $(\beta/\sigma)$ times the formal correlation coefficient $r$ between the $c_j$ and the $x_j$. If the $x_j$ are known, the $c_j$ can be chosen to make the correlation unity. If the $x_j$ are wholly unknown, no single contrast can guarantee power in detecting $\beta \neq 0$. Intermediate situations, where we know something but not everything about the $x_j$, occur frequently. If our knowledge can be placed in the form of linear inequalities restricting the $\mu_j$ (equivalently the $x_j$), the problem of choosing a contrast $\{c_j\}$ which will give relatively good power against the unknown (latent) configuration $\{x_j\}$ is a relatively manageable one. The problem is to obtain a large value of $r^2$ between $\{c_j\}$, which is at our choice, and $\{x_j\}$, which is only partially known. A conservative approach is to try to select the $\{c_j\}$ so that the minimum value of $r^2$ compatible with the restrictions on $\{x_j\}$ is maximized, or nearly so. The maximization of minimum $r^2$ when response patterns are constrained by linear homogeneous inequalities leads to the mathematical problem of finding the geometric direction whose maximum angle with a given set of directions is least. The solution to this problem is characterized and proven unique (Sections 8, 17-20). No useful algorithm which is absolutely certain to reach the solution in a few steps appears to exist.
However, procedures are discussed (Sections 10 and 11) which reach a solution relatively rapidly in the instances we have considered. The procedures are illustrated on selected examples (Sections 15-16). The general theory is applied (Sections 13-14) to the latent configuration defined by $x_1 \leqq x_2 \leqq x_3 \leqq \cdots \leqq x_n$, which we call simple rank order. A formula is found for the maximum contrast, which maximizes minimum $r^2$, and its coefficients are given for $n \leqq 20$. The "linear-2-4" contrast, constructed from the usual linear contrast by quadrupling $c_1$ and $c_n$, and doubling $c_2$ and $c_{n-1}$, is a reasonable approximation to the maximum contrast for small or medium $n$, and its minimum $r^2$ remains above 90% of the maximum possible for $n \leqq 50$ (Table 2). When only simple rank order is known for the $\mu_j$, good practice seems to indicate the use of "maximum" or "linear-2-4" contrasts in careful work. If more information or insight about the $x_j$ is available, some other contrast may be preferable.
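
The comparison of contrasts under simple rank order can be explored numerically. The sketch below is illustrative only and is not taken from the paper. All function names are hypothetical. It relies on two assumptions that the abstract does not state: first, that for a monotone contrast the minimum $r$ over all monotone configurations is attained at one of the $n-1$ two-level "step" configurations (the extreme rays of the cone of ordered configurations); second, that the closed form used in `maximin`, $c_j = \sqrt{(j-1)(1-(j-1)/n)} - \sqrt{j(1-j/n)}$, is a contrast with equal correlation to every step configuration. The linear and "linear-2-4" contrasts follow the description in the abstract.

```python
import numpy as np

def step_generators(n):
    """Centered, unit-length step configurations: the first k entries share
    one value and the remaining n - k share a larger one (k = 1..n-1)."""
    rows = []
    for k in range(1, n):
        x = np.concatenate([np.full(k, -(n - k)), np.full(n - k, k)]).astype(float)
        rows.append(x / np.linalg.norm(x))
    return np.array(rows)

def min_r2(c):
    """Smallest squared correlation between contrast c and the step
    configurations. Assumes c is monotone increasing, so every r is >= 0
    and squaring the smallest r gives the smallest r^2."""
    c = np.asarray(c, dtype=float)
    r = step_generators(len(c)) @ c / np.linalg.norm(c)
    return float(np.min(r) ** 2)

def linear(n):
    """Usual linear contrast: equally spaced, centered coefficients."""
    return np.arange(1, n + 1) - (n + 1) / 2

def linear_2_4(n):
    """Linear contrast with c_1, c_n quadrupled and c_2, c_{n-1} doubled.
    Intended for n >= 4, where these four positions are distinct."""
    c = linear(n)
    c[[0, -1]] *= 4
    c[[1, -2]] *= 2
    return c

def maximin(n):
    """Assumed closed form: c_j = f(j-1) - f(j) with f(j) = sqrt(j(1 - j/n))."""
    j = np.arange(0, n + 1)
    f = np.sqrt(j * (1 - j / n))
    return f[:-1] - f[1:]

for n in (4, 10, 20):
    print(n, min_r2(linear(n)), min_r2(linear_2_4(n)), min_r2(maximin(n)))
```

Because the correlation is scale-invariant, the contrasts are left unnormalized. Comparing the second and third printed columns gives a rough check on how closely the "linear-2-4" contrast tracks the assumed maximin contrast for a given $n$.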

## Citation

Robert P. Abelson, John W. Tukey. "Efficient Utilization of Non-Numerical Information in Quantitative Analysis General Theory and the Case of Simple Order." Ann. Math. Statist. 34 (4) 1347-1369, December, 1963. https://doi.org/10.1214/aoms/1177703869

## Information

Published: December, 1963
First available in Project Euclid: 27 April 2007

zbMATH: 0121.13907
MathSciNet: MR156411
Digital Object Identifier: 10.1214/aoms/1177703869