## Abstract

Let $\{X_1, X_2, \cdots, X_N\}$ be an observed sequence from a stochastic process, where $X_i$ can take any one of $s$ values $1, 2, \cdots, s$. Let $f_\mathfrak{u}$ be the frequency of the $m$-tuple $\mathfrak{u} = (u_1, u_2, \cdots, u_m)$ in the sequence. Let $H'_n$ be the composite hypothesis that the process is a Markov chain of order $n$. Let $H_n$ be any simple hypothesis belonging to $H'_n$. Let $H^{\ast}_n$ be the maximum likelihood $H_n$. Let the expected value of $f_\mathfrak{u}$ in a new sequence of length $N$ given $H_n$ be $f_{\mathfrak{u},n}$, and given $H^{\ast}_n$ be $f^{\ast}_{\mathfrak{u},n}$. Let $$\Psi^2_{m,n} = \sum_\mathfrak{u} (f_\mathfrak{u} - f_{\mathfrak{u},n})^2/f_{\mathfrak{u},n},$$ $$\Psi^{\ast 2}_{m,n} = \sum_\mathfrak{u} (f_\mathfrak{u} - f^{\ast}_{\mathfrak{u},n})^2/f^{\ast}_{\mathfrak{u},n},$$ $$\Psi^{\ast 2}_{n + 1,n} = 0.$$ Good had proposed in [7] the following two conjectures: (a) that the asymptotic distribution $(N \rightarrow \infty)$ of $\Psi^{\ast 2}_{m,n}$, when $H'_n$ is true, is $$\ast^{m - n - 1}_{\lambda = 1} K_{g(\lambda)} (x/\lambda),$$ where $\ast$ denotes convolution, $g(\lambda) = (s - 1)^2s^{m - 1 - \lambda}$, and $K_i(x)$ is the $\chi^2$-distribution with $i$ degrees of freedom; (b) that the asymptotic distribution of $\Psi^2_{m,n}$, when $H_n$ is true, is $$\ast^{m - 1}_{\lambda = 1} K_{g(\lambda)}(x/\lambda)\ast K_{s - 1}(x/m),$$ mathematically independent of $n$. Conjectures (a) and (b) were proved by Billingsley [2] for the special case $n = 0$. For the special case $n = -1$ (by convention, $H'_{-1}$ is the hypothesis of equiprobable or perfect randomness (see [7])), Conjecture (b) was proved by Good [5] when $s$ is prime. In the present paper, Conjecture (a) will be proved for the general case $n \geqq -1$; conjecture (b) will be shown to be incorrect for $n > 0$, although a modified version of (b) will be proved for $n \geqq -1$. A third conjecture by Good [6] will also be proved here. 
It was assumed in these earlier papers, and it will be assumed here, that all transition probabilities in the Markov chain are positive; the results can be modified accordingly when some of these probabilities are zero (see [1] and [10]). Let $M_{m,n} = -2 \log\lambda_{n,m - 1}$, where $\lambda_{n,m - 1}$ is the ratio of the maximum likelihood given $H'_n$ to that given $H'_{m - 1}$ (see [6]). For $m = n + 2$, the statistic $\Psi^{\ast 2}_{m,n}$ is asymptotically equivalent, when $H'_n$ is true, to the likelihood ratio statistic $M_{m,n}$. For $m > n + 2$, $\Psi^{\ast 2}_{m,n}$ is asymptotically equivalent, when $H'_n$ is true, to $\sum^{m - n - 1}_{\lambda = 1}\lambda M_{m + 1 - \lambda, m - 1 - \lambda}$, while $M_{m,n}$ is asymptotically equivalent to $$\sum^{m - n - 1}_{\lambda = 1} M_{m + 1 - \lambda, m - 1 - \lambda}$$ (see [6], [10]). Thus, $\Psi^{\ast 2}_{m,n}$ corresponds asymptotically to a weighted sum of the likelihood ratio statistics $M_{n + 2,n}, M_{n + 3, n + 1}, \cdots, M_{m,m - 2}$, with the weights $m - n - 1, m - n - 2, \cdots, 1$, respectively, while $M_{m, n}$ weights these statistics equally (see [13] and reference to [13] in Section 4 herein). Let $L_{m,n} = -2 \log \mu_{n,m - 1}$, where $\mu_{n,m - 1}$ is the ratio of the likelihood given $H_n$ to the maximum likelihood given $H'_{m - 1}$. For $m - 1 = n = 0$, the statistic $\Psi^2_{m,n}$ is asymptotically equivalent, when $H_n$ is true, to $L_{m,n}$. For $m - 1 > n = 0$, $\Psi^2_{m,n}$ is asymptotically equivalent, when $H_n$ is true, to $$\sum^{m - 1}_{\lambda = 1} \lambda M_{m + 1 - \lambda, m - 1 - \lambda} + mL_{n + 1,n},$$ while $L_{m,n}$ is asymptotically equivalent to $\sum^{m - 1}_{\lambda = 1} M_{m + 1 - \lambda, m - 1 - \lambda} + L_{n + 1, n}$. For $n > 0$, the relation between $\Psi^2_{m,n}$ and the likelihood ratio statistics $L_{m,n}$ and $M_{m,n}$ is not so straightforward.
However, a modification $\Psi^{'2}_{m,n}$ of $\Psi^2_{m,n}$ (see Section 6 herein) is asymptotically equivalent, when $H_n$ is true, to $L_{m,n}$ for $m = n + 1$, and to $\sum^{m - n - 1}_{\lambda = 1} \lambda M_{m + 1 - \lambda, m - 1 - \lambda} + (m - n)L_{n + 1, n}$ for $m > n + 1$; while the likelihood ratio statistic $L_{m,n}$ is asymptotically equivalent to $$\sum^{m - n - 1}_{\lambda = 1} M_{m + 1 - \lambda, m - 1- \lambda} + L_{n + 1,n}.$$ In [10], the $m$-tuple $\mathfrak{u}$ was "split" into an $(m - n - 1)$-tuple, an $n$-tuple, and a 1-tuple; thus obtaining $s^n$ "contingency tables" $(n \geqq 0)$ each $s^{m - n - 1} \times s$ (see [10]). The statistic $M_{m,n}$ can be seen to be asymptotically equivalent to the sum of the "likelihood ratio statistics" (for testing "independence" in each table) for the $s^n$ tables, and the asymptotic distribution, when $H'_n$ is true, of $M_{m,n}$ will be $\chi^2$ with $s^n(s^{m - n - 1} - 1)(s - 1) = s^m - s^{m - 1} - s^{n + 1} + s^n$ degrees of freedom. It is also possible to "split" the $m$-tuple $\mathfrak{u}$ into an $(m - n - 1 - r)$-tuple, an $n$-tuple, and a $(1 + r)$-tuple $(0 \leqq r \leqq m - n - 2)$; thus obtaining $s^n$ "contingency tables," each $s^{m - n - 1 - r} \times s^{1 + r}$ (see [10]). The sum $_rM_{m,n}$ of the likelihood ratio (or any equivalent goodness of fit) statistics for the $s^n$ tables will have an asymptotic mean value, when $H'_n$ is true, of $$s^n(s^{m - n - 1 - r} - 1)(s^{1 + r} - 1) = s^m - s^{m - r - 1} - s^{n + 1 + r} + s^n,$$ but the asymptotic distribution will not be $\chi^2$ unless $r = 0$ or $m - n - 2$.
It can be seen, using the methods developed in the present paper, that the statistic $_rM_{m,n}$ will be asymptotically equivalent, when $H'_n$ is true, to $$\sum^{m - n - 1}_{\lambda = 1} h(\lambda)M_{m + 1 - \lambda, m - 1 - \lambda},$$ where \begin{equation*}h(\lambda) = \begin{cases}\lambda & \text{for } 0 < \lambda \leqq v,\\v & \text{for } v \leqq \lambda \leqq m - n - v,\\m - n - \lambda & \text{for } m - n - v \leqq \lambda \leqq m - n - 1,\end{cases}\end{equation*} and $v = \min \lbrack r + 1, m - n - r - 1\rbrack$. Thus, the asymptotic distribution $(N \rightarrow \infty)$ of $_rM_{m,n}$ (or the corresponding asymptotically equivalent goodness of fit statistics), when $H'_n$ is true, is $$\ast^{m - n - 1}_{\lambda = 1} K_{g(\lambda)}\lbrack x/(h(\lambda))\rbrack.$$ This result generalizes the earlier published results concerning the asymptotic distribution of the likelihood ratio statistic $M_{m,n}$ (or the corresponding asymptotically equivalent goodness of fit statistics) for testing the null hypothesis $H'_n$ within $H'_{m - 1}$, since $_rM_{m,n}$ for $r = 0$ or $m - n - 2$ is asymptotically equivalent to $M_{m,n}$ (see [6], [10]). A proof of this result will not be given since the method of proof is quite similar to that presented here for the asymptotic distribution of $\Psi^{\ast 2}_{m,n}$.
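
As an informal consistency check on the formulas above (not part of the paper), the weights $h(\lambda)$ and degrees of freedom $g(\lambda)$ can be tabulated numerically. Since a $\chi^2$ variable with $i$ degrees of freedom has mean $i$, the limiting distribution $\ast_\lambda K_{g(\lambda)}\lbrack x/h(\lambda)\rbrack$ has mean $\sum_\lambda h(\lambda)g(\lambda)$, which should agree with the stated asymptotic mean $s^n(s^{m - n - 1 - r} - 1)(s^{1 + r} - 1)$. The Python sketch below assumes $n \geqq 0$ and $m \geqq n + 2$; the function names `g`, `h`, `weighted_mean`, and `table_mean` are hypothetical helpers introduced only for this illustration.

```python
def g(lam, s, m):
    """Degrees of freedom g(lambda) = (s - 1)^2 * s^(m - 1 - lambda)."""
    return (s - 1) ** 2 * s ** (m - 1 - lam)


def h(lam, m, n, r):
    """Weight h(lambda) from the piecewise definition.

    With v = min(r + 1, m - n - r - 1), the three cases collapse to
    min(lambda, v, m - n - lambda) because v <= (m - n) / 2.
    """
    v = min(r + 1, m - n - r - 1)
    return min(lam, v, m - n - lam)


def weighted_mean(s, m, n, r):
    """Mean of the limiting law: sum of h(lambda) * g(lambda), lambda = 1..m-n-1."""
    return sum(h(lam, m, n, r) * g(lam, s, m) for lam in range(1, m - n))


def table_mean(s, m, n, r):
    """Stated asymptotic mean s^n (s^(m-n-1-r) - 1)(s^(1+r) - 1)."""
    return s ** n * (s ** (m - n - 1 - r) - 1) * (s ** (1 + r) - 1)


if __name__ == "__main__":
    # Compare the two expressions over a small grid of (s, n, m, r).
    for s in range(2, 5):
        for n in range(0, 3):
            for m in range(n + 2, n + 7):
                for r in range(0, m - n - 1):
                    assert weighted_mean(s, m, n, r) == table_mean(s, m, n, r)
    print([h(lam, 6, 0, 2) for lam in range(1, 6)])
```

For example, with $s = 2$, $m = 4$, $n = 0$, $r = 1$ the weights are $1, 2, 1$, the degrees of freedom are $4, 2, 1$, and both expressions give $9$. For $r = 0$ or $r = m - n - 2$ every weight equals $1$, which matches the statement that the limit is then an ordinary $\chi^2$ distribution.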

## Citation

Leo A. Goodman. "Asymptotic Distributions of "Psi-Squared" Goodness of Fit Criteria for $m$-th Order Markov Chains." Ann. Math. Statist. 29 (4) 1123 - 1133, December, 1958. https://doi.org/10.1214/aoms/1177706445

## Information