Some Remarks on Systematic Sampling

Werner Gautschi

doi:10.1214/aoms/1177706966

Ann. Math. Statist. 28(2): 385-394 (June, 1957). DOI: 10.1214/aoms/1177706966

Abstract

Consider a finite population consisting of $N$ elements $y_1, y_2, \cdots, y_N$. Throughout the paper we will assume that $N = nk$. A systematic sample of $n$ elements is drawn by choosing one element at random from the first $k$ elements $y_1, \cdots, y_k$, and then selecting every $k$th element thereafter. Let $y_{ij} = y_{i + (j - 1)k}$ $(i = 1, \cdots, k;\ j = 1, \cdots, n)$; obviously systematic sampling is equivalent to selecting one of the $k$ "clusters" $$C_i = \{y_{ij};\ j = 1, \cdots, n\}$$ at random. From this it follows that the sample mean $\bar y_i = \frac{1}{n} \sum^n_{j = 1} y_{ij}$ is an unbiased estimate for the population mean $\bar y = \frac{1}{N} \sum^k_{i = 1} \sum^n_{j = 1} y_{ij}$ and that $\operatorname{Var} \bar y_i = \frac{1}{k} \sum^k_{i = 1} (\bar y_i - \bar y)^2$. We will denote this variance by $V^{(1)}_{sy}$, indicating by the superscript that only one cluster is selected at random. $V^{(1)}_{sy}$ can be written as \begin{equation*}\tag{1}V^{(1)}_{sy} = S^2 - \frac{1}{k} \sum^k_{i = 1} S^2_i,\end{equation*} where \begin{equation*}S^2 = \frac{1}{N} \sum^k_{i = 1} \sum^n_{j = 1} (y_{ij} - \bar y)^2, \qquad S^2_i = \frac{1}{n} \sum^n_{j = 1} (y_{ij} - \bar y_i)^2.\end{equation*} It is natural to compare systematic sampling with stratified random sampling, where one element is chosen independently in each of the $n$ strata $\{y_1, \cdots, y_k\}, \{y_{k + 1}, \cdots, y_{2k}\}, \cdots$, and with simple random sampling using sample size $n$. The corresponding variances of the sample mean will be denoted by $V^{(1)}_{st}$ and $V^{(n)}_{ran}$, respectively. We consider now the following generalization of systematic sampling which appears to have been suggested by J. Tukey (see [3], p. 96, [4], [5]). Instead of choosing at first only one element at random we select a simple random sample of size $s$ (without replacement) from the first $k$ elements and then every $k$th element following those selected. 
In this way we obtain a sample of $ns$ elements and, if $i_1, i_2, \cdots, i_s$ are the serial numbers of the elements first chosen, the sample mean $\frac{1}{s}(\bar y_{i_1} + \cdots + \bar y_{i_s})$ can be used as an estimate for the population mean. This sampling procedure is clearly equivalent to drawing a simple random sample of size $s$ from the $k$ clusters $C_i$ $(i = 1, \cdots, k)$. It therefore follows (see, for example, [2], Chapter 2.3 to 2.4) that the sample mean is an unbiased estimate for the population mean and that its variance, which we denote by $V^{(s)}_{sy}$, is given by \begin{equation*}\tag{2}V^{(s)}_{sy} = \frac{k - s}{ks} \frac{1}{k - 1} \sum^k_{i = 1} (\bar y_i - \bar y)^2 = \frac{1}{s} \frac{k - s}{k - 1} V^{(1)}_{sy}.\end{equation*} Again, it is natural to compare this sampling procedure with stratified random sampling, where a simple random sample of size $s$ is drawn independently in each of the $n$ strata $\{y_1, \cdots, y_k\}, \{y_{k + 1}, \cdots, y_{2k}\}, \cdots$, or with simple random sampling employing sample size $ns$. We denote the corresponding variances of the sample mean (which in both cases is an unbiased estimate for the population mean) by $V^{(s)}_{st}$ and $V^{(ns)}_{ran}$, respectively. From well-known variance formulae (see, for example, [2], Chapters 2.4 and 5.3) it follows that \begin{equation*}\tag{3}V^{(s)}_{st} = \frac{1}{s} \frac{k - s}{k - 1} V^{(1)}_{st}, \qquad V^{(ns)}_{ran} = \frac{N - ns}{s(N - n)} V^{(n)}_{ran} = \frac{1}{s} \frac{k - s}{k - 1} V^{(n)}_{ran}.\end{equation*} Thus the relative magnitudes of the three variances $V^{(s)}_{sy}, V^{(s)}_{st}, V^{(ns)}_{ran}$ are the same as for $V^{(1)}_{sy}, V^{(1)}_{st}, V^{(n)}_{ran}$, of which comparisons were made for several types of populations by W. G. Madow and L. H. Madow [6] and W. G. Cochran [1]. Some of the results will be reviewed in Section 3. 
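
As an illustration only (not taken from the paper), the cluster view of systematic sampling can be checked numerically on a small population. The sketch below assumes a made-up population `y` with $N = 12$ and $k = 4$; the helper names `clusters`, `v_sy` and `v_sy_identity` are hypothetical. It computes the variance of the sample mean by enumerating every possible set of $s$ random starts, and compares the result with identity (1) and with the factor $\frac{1}{s}\frac{k - s}{k - 1}$ of formula (2).

```python
from itertools import combinations
from statistics import fmean


def clusters(y, k):
    """Split y (length N = n*k) into the k clusters C_i = {y_i, y_{i+k}, ...}."""
    if len(y) % k != 0:
        raise ValueError("N must be an integral multiple of k")
    return [y[i::k] for i in range(k)]


def v_sy(y, k, s=1):
    """Variance of the sample mean with s random starts, by full enumeration.

    Every choice of s clusters out of k is equally likely; since all clusters
    have the same size, the sample mean is the average of the cluster means.
    """
    means = [fmean(c) for c in clusters(y, k)]
    ybar = fmean(y)
    estimates = [fmean(combo) for combo in combinations(means, s)]
    return fmean([(e - ybar) ** 2 for e in estimates])


def v_sy_identity(y, k):
    """Right-hand side of identity (1): S^2 minus the mean within-cluster S_i^2."""
    ybar = fmean(y)
    s2 = fmean([(v - ybar) ** 2 for v in y])
    within = []
    for c in clusters(y, k):
        m = fmean(c)
        within.append(fmean([(v - m) ** 2 for v in c]))
    return s2 - fmean(within)


# Hypothetical population, chosen only for illustration: N = 12, k = 4, n = 3.
y = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
k, s = 4, 2

v1 = v_sy(y, k)                 # one random start
vs = v_sy(y, k, s)              # s random starts
print(v1, v_sy_identity(y, k))  # the two values should agree, as in (1)
print(vs, (k - s) / (s * (k - 1)) * v1)  # the two values should agree, as in (2)
```

Enumeration is feasible here only because $k$ is tiny; for realistic $k$ and $s$ one would use the closed form (2) directly.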
The object of this note is to compare systematic sampling with $s$ random starts, as described above, with systematic sampling employing only one random start but using a sample of the same size $ns$. To make this comparison we obviously have to assume that $k$ is an integral multiple of $s$, say $k = ls$. The latter procedure then consists in choosing one element at random from the first $l$ elements $\{y_1, \cdots, y_l\}$ and selecting every $l$th consecutive element. We denote the variances of the sample mean of the two procedures by $V^{(s)}_k$ and $V^{(1)}_l$, respectively, indicating by the subscript the size of the initial "counting interval." (In our notation $V^{(s)}_{sy} \equiv V^{(s)}_k$.) We shall show in Section 4 that $V^{(1)}_l = V^{(s)}_k$ in the case of a population "in random order," but $V^{(1)}_l < V^{(s)}_k$ for a population with a linear trend or with a positive correlation between the elements which is a decreasing convex function of their distance apart. Some numerical results on the relative precision of the two procedures will be given in Section 5 for the case of a large population with an exponential correlogram.
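
The linear-trend case can be illustrated with a small numerical sketch. This is an assumption-laden toy example rather than the paper's Section 4 argument: the population $y_t = t$ with $N = 24$, $k = 6$, $s = 2$ (hence $l = 3$, $n = 4$) and the helper name `var_sample_mean` are chosen here purely for illustration. Both variances are obtained by enumerating all equally likely sets of random starts.

```python
from itertools import combinations
from statistics import fmean


def var_sample_mean(y, interval, starts=1):
    """Variance of the sample mean for systematic sampling with the given
    counting interval and number of random starts (full enumeration)."""
    if len(y) % interval != 0:
        raise ValueError("N must be an integral multiple of the interval")
    # Means of the systematic clusters y_i, y_{i+interval}, ...
    means = [fmean(y[i::interval]) for i in range(interval)]
    ybar = fmean(y)
    # Each set of distinct starts is equally likely; clusters have equal size,
    # so the sample mean is the average of the selected cluster means.
    estimates = [fmean(c) for c in combinations(means, starts)]
    return fmean([(e - ybar) ** 2 for e in estimates])


# Hypothetical population with a pure linear trend: y_t = t, t = 1, ..., 24.
N, k, s = 24, 6, 2
l = k // s                       # k = l*s, so both designs use n*s = 8 elements
y = list(range(1, N + 1))

v_s_k = var_sample_mean(y, k, s)  # s random starts, counting interval k
v_1_l = var_sample_mean(y, l, 1)  # one random start, counting interval l
print("V^(s)_k =", v_s_k)
print("V^(1)_l =", v_1_l)
print("single start more precise:", v_1_l < v_s_k)
```

For this toy population the single-start design has the smaller variance, in line with the inequality stated for a linear trend; other populations would need to be checked separately.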