## The Annals of Mathematical Statistics

### Relationship of Generalized Polykays to Unrestricted Sums for Balanced Complete Finite Populations

Edward J. Carney

#### Abstract

The polykays of Tukey [8] and the bipolykays of Hooke [7] were generalized by Dayhoff [4] for arbitrary balanced complete finite population structures. The expected mean squares in the analyses of variance of such structures may be expressed as linear functions of variance components corresponding to the factors classifying the population. Since the variance components serve to measure the relative influence of the factors, it is often desired to estimate these quantities. Unbiased estimates may be obtained by substituting observed mean squares for population mean squares and solving the resulting linear equations for the variance components. Alternative expressions for the expected mean squares involve linear functions of quantities called cap sigmas (Wilk [10], Zyskind [11], White [9]), and the variance component estimates may be expressed as linear functions of sample cap sigmas. Dayhoff [3] shows that cap sigmas are generalized polykays of degree two and that the variances and covariances of variance components are linear functions of generalized polykays of degree four. Since generalized polykays have the property of inherence on the average (i.e., averages of sample generalized polykays over all random samples are the same generalized polykays of the population responses), the variances and covariances of the unbiased variance component estimates may be estimated unbiasedly by linear functions of generalized polykays of degree four. The generalized polykays are in turn linear functions of generalized symmetric means (Hooke [6], Dayhoff [4]) which may be computed directly from the observations.
However, such computations are very difficult to carry out by hand because the formulas may involve thousands of distinct generalized polykays and generalized symmetric means. Computation of the generalized symmetric means for moderately large numbers of levels of the factors is not only impossible to carry out by hand but even far too costly using the most advanced digital computers, because a single generalized symmetric mean may require hundreds of billions of operations in its evaluation [2]. In order to make the computation of the variance-covariance matrix for estimated variance components economically feasible, several requirements must be met. First, the variance-covariance formulas in terms of the generalized symmetric functions must be generated by computer algorithms. Secondly, a way of computing the generalized symmetric means must be developed which significantly reduces the number of additions and multiplications. Thirdly, the above algorithms must be made general enough for application to the many possible balanced complete response structures which may be encountered as the relationship of nesting among the factors varies. In meeting the above requirements it is necessary to determine a logical system of relationships which allows the development of a few relatively simple algorithms to perform the various tasks on the computer, and which may be applied generally to the many different structures possible. The present paper results from an attempt to find such logical relationships among the various quantities which may be used in programs for digital computation of the variance-covariance matrix of estimated variance components. Study of the patterns of subscript restrictions which specify the generalized symmetric means leads to the development of algebraic relationships between the generalized polykays and the generalized symmetric means, which may be formulated in terms of a lattice of ordered partitions.
Similar relationships exist between the numerators of the generalized symmetric means and quantities called unrestricted sums. These latter quantities may be computed much more efficiently than the generalized symmetric means themselves. (Hooke [6] gives an example of such a computation for a two-factor structure using quantities similar to the unrestricted sums.) The various relationships, in addition to their intrinsic theoretical interest, provide the necessary logical basis for the development of digital computer algorithms for performance of the algebraic and numerical computations for estimation of the variances and covariances of the variance component estimates.
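The relationships described above can be illustrated in the simplest possible setting. The following is a minimal sketch, assuming a single unstructured population (no crossed or nested factors) and degree two only; it is not the lattice-of-ordered-partitions algorithm of the paper. All function names are hypothetical. It shows that the numerator of the symmetric mean over distinct subscripts can be obtained from two unrestricted sums, that the ordinary polykay of degree two is a linear function of symmetric means, and that this polykay is inherited on the average over all samples drawn without replacement.

```python
from fractions import Fraction
from itertools import combinations, permutations


def restricted_sum_brute(xs):
    """Numerator of the symmetric mean <11>: the sum of x_i * x_j over
    all ordered pairs of subscripts with i != j (quadratic cost)."""
    return sum(xs[i] * xs[j] for i, j in permutations(range(len(xs)), 2))


def restricted_sum_fast(xs):
    """The same numerator from two unrestricted sums (linear cost):
    sum_{i != j} x_i x_j = (sum_i x_i)^2 - sum_i x_i^2."""
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    return s1 * s1 - s2


def k2(xs):
    """Polykay of degree two as a linear function of symmetric means:
    k2 = <2> - <11>. Assumes integer (or Fraction) data so that the
    arithmetic is exact, and at least two observations."""
    n = len(xs)
    mean_2 = Fraction(sum(x * x for x in xs), n)
    mean_11 = Fraction(restricted_sum_fast(xs), n * (n - 1))
    return mean_2 - mean_11


def average_k2_over_samples(population, n):
    """Average of the sample k2 over every sample of size n drawn
    without replacement from the population."""
    samples = list(combinations(population, n))
    return sum(k2(s) for s in samples) / len(samples)


# Hypothetical toy population, chosen only for illustration.
population = [2, 3, 5, 7, 11]

# The restricted sum agrees with its unrestricted-sum expression.
assert restricted_sum_brute(population) == restricted_sum_fast(population)

# Inherence on the average: the mean of the sample k2 values equals
# the population k2, for every admissible sample size.
for n in range(2, len(population) + 1):
    assert average_k2_over_samples(population, n) == k2(population)
```

In this one-factor case the saving is from quadratic to linear cost. For higher degrees and several factors the number of subscript restrictions grows quickly, which is where the general relationships developed in the paper are needed.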

#### Article information

Source
Ann. Math. Statist., Volume 39, Number 2 (1968), 643-656.

Dates
First available in Project Euclid: 27 April 2007

https://projecteuclid.org/euclid.aoms/1177698423

Digital Object Identifier
doi:10.1214/aoms/1177698423

Mathematical Reviews number (MathSciNet)
MR225459

Zentralblatt MATH identifier
0167.47302
