## The Annals of Mathematical Statistics

### Limiting Distributions for Some Random Walks Arising in Learning Models

M. Frank Norman

#### Abstract

Associated with certain of the learning models introduced by Bush and Mosteller [1] are random walks $p_1, p_2, p_3, \cdots$ on the closed unit interval with transition probabilities of the form \begin{equation*}\tag{1}P\lbrack p_{n + 1} = p_n + \theta_1(1 - p_n) \mid p_n\rbrack = \varphi(p_n)\end{equation*} and \begin{equation*}\tag{2}P\lbrack p_{n + 1} = p_n - \theta_2p_n \mid p_n\rbrack = 1 - \varphi(p_n)\end{equation*} where $0 < \theta_1, \theta_2 < 1$ and $\varphi$ is a mapping of the closed unit interval into itself. In the experiments to which these models are applied, response alternatives $A_1$ and $A_2$ are available to a subject on each of a sequence of trials, and $p_n$ is the probability that the subject will make response $A_1$ on trial $n$. Depending on which response is actually made, one of two events $E_1$ or $E_2$ ensues. These events are associated, respectively, with the increment $p_n \rightarrow p_n + \theta_1(1 - p_n)$ and the decrement $p_n \rightarrow p_n - \theta_2p_n$ in $A_1$ response probability. The conditional probabilities $\pi_{ij}$ of event $E_j$ given response $A_i$ do not depend on the trial number $n$. Thus (1) and (2) are obtained with $\varphi(p) = \pi_{11}p + \pi_{21}(1 - p)$. 
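
The walk defined by (1) and (2) is easy to simulate, which gives a concrete picture of the process whose limiting distribution is studied here. The following is a minimal sketch, not part of the paper: the function name `simulate_walk`, the starting value, the seed, and the numerical parameter values are all illustrative assumptions, and $\varphi$ is taken in the linear form $\varphi(p) = \pi_{11}p + \pi_{21}(1 - p)$ given above.

```python
import random


def simulate_walk(theta1, theta2, pi11, pi21, p1=0.5, n_trials=10_000, seed=0):
    """Simulate p_1, ..., p_n under transition rules (1) and (2).

    phi(p) = pi11 * p + pi21 * (1 - p) is the probability of the increment.
    All argument names and defaults are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    p = p1
    path = [p]
    for _ in range(n_trials - 1):
        phi = pi11 * p + pi21 * (1.0 - p)
        if rng.random() < phi:
            p = p + theta1 * (1.0 - p)  # rule (1): increment toward 1
        else:
            p = p - theta2 * p          # rule (2): decrement toward 0
        path.append(p)
    return path


# Illustrative parameter values (assumed, not taken from the paper).
path = simulate_walk(theta1=0.05, theta2=0.05, pi11=0.7, pi21=0.4)
tail = path[len(path) // 2:]  # discard the first half as burn-in
print("long-run average of p_n:", sum(tail) / len(tail))
```

With these assumed values $\varphi$ stays between 0.4 and 0.7, and every step keeps $p_n$ inside the closed unit interval.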
Since the linearity of the functions $\varphi$ which arise in this way is of no consequence for the work presented in this paper, we will assume instead simply that \begin{equation*}\tag{3}\varphi \in C^2(\lbrack 0, 1\rbrack).\end{equation*} We impose one further restriction on $\varphi$ which excludes some cases of interest in learning theory: \begin{equation*}\tag{4}\epsilon_1 = \min_{0 \leqq p \leqq 1} \varphi(p) > 0 \quad\text{and}\quad \epsilon_2 = \max_{0 \leqq p \leqq 1} \varphi(p) < 1.\end{equation*}

It follows from a theorem of Karlin ([5], Theorem 37) that under (1)-(4) the distribution function $F^{(n)}_{\theta_1,\theta_2,\varphi}$ of $p_n$ (which depends, of course, on the distribution $F$ of $p_1$) converges as $n$ approaches infinity to a distribution $F_{\theta_1,\theta_2,\varphi}$ which does not depend on $F$. It is with the distributions $F_{\theta_1,\theta_2,\varphi}$ that the present paper is concerned. Very little is known about distributions of this family, though some results may be found in Karlin [5], Bush and Mosteller [1], Kemeny and Snell [6], Estes and Suppes [3], and McGregor and Hui [8]. The only theorem in the literature directly relevant to the present work is one of McGregor and Zidek [9], as a consequence of which, in the case $\theta_1 = \theta_2 = \theta$, $\varphi(p) \equiv \frac{1}{2}$, $\lim_{\theta \rightarrow 0} \lim_{n \rightarrow \infty} P\lbrack\theta^{-\frac{1}{2}}(p_n - \frac{1}{2}) \leqq x\rbrack = \Phi(8^{\frac{1}{2}}x)$ where $\Phi$ denotes the standard normal distribution function; that is, the distribution $F_{\theta,\theta,\frac{1}{2}}(\theta^{\frac{1}{2}}x + \frac{1}{2})$ converges to a normal distribution as the "learning rate" parameter $\theta$ tends to 0. We will prove, by means of another method, that this phenomenon is of much greater generality.
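
The constant $8^{\frac{1}{2}}$ in the result of McGregor and Zidek can be checked by a short second-moment computation. This is offered only as a consistency check on the scaling, not as the argument of [9], and it says nothing about normality itself. With $\theta_1 = \theta_2 = \theta$ and $\varphi \equiv \frac{1}{2}$, one step of the walk can be written $p_{n+1} = (1 - \theta)p_n + \theta X_n$, where $X_n$ (an auxiliary symbol introduced for this sketch) equals 1 or 0 with probability $\frac{1}{2}$ each, independently of $p_n$. Writing $v$ for the variance of the limiting distribution, stationarity gives \begin{align*} v &= (1 - \theta)^2 v + \tfrac{1}{4}\theta^2, \\ v &= \frac{\theta^2}{4\lbrack 1 - (1 - \theta)^2\rbrack} = \frac{\theta}{4(2 - \theta)}, \\ \lim_{\theta \rightarrow 0} \theta^{-1} v &= \tfrac{1}{8}. \end{align*} Thus $\theta^{-\frac{1}{2}}(p_n - \frac{1}{2})$ has limiting variance $\frac{1}{8}$, and a normal distribution with mean 0 and variance $\frac{1}{8}$ has distribution function $\Phi(8^{\frac{1}{2}}x)$, in agreement with the statement above.
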
Theorem 1 below shows that, for any positive constant $\zeta$ and any $\varphi$ with $\max_{0 \leqq p \leqq 1}\varphi'(p) < \min (1, \zeta)/\max (1, \zeta)$, there is a constant $\rho$ such that $F_{\theta,\zeta\theta,\varphi}(\theta^{\frac{1}{2}}x + \rho)$ converges to a normal distribution as $\theta \rightarrow 0$. A nonnormal limit is obtained if $\theta_1$ approaches 0 while $\theta_2$ remains fixed, as is shown in Theorem 2. In this case $F_{\theta_1,\theta_2,\varphi}(\theta_1x)$ converges to an infinite convolution of geometric distributions.

If $f(p,\theta) = p + \theta(1 - p)$ then (1) and (2) can be written in the form \begin{equation*}\tag{5}P\lbrack p_{n + 1} = f(p_n, \theta_1) \mid p_n\rbrack = \varphi(p_n)\end{equation*} and \begin{equation*}\tag{6}P\lbrack p_{n + 1} = 1 - f(1 - p_n, \theta_2) \mid p_n\rbrack = 1 - \varphi(p_n).\end{equation*} In Section 4 it is shown that the linearity of $f(p,\theta)$ in $p$ and $\theta$ is not essential to the phenomena discussed above. Theorems 3 and 4 present generalizations of Theorems 2 and 1, respectively, to "learning functions" $f(p,\theta)$ subject only to certain fairly weak axioms.

A somewhat different learning model, Estes' [2] $N$-element pattern model, leads to a finite Markov chain $p_1, p_2, p_3, \cdots$ with state space $S_N = \{jN^{-1}: j = 0, 1, \cdots, N\}$ and transition probabilities \begin{align*}\tag{7}P\lbrack p_{n + 1} &= p_n + N^{-1} \mid p_n\rbrack = \varphi(p_n), \\ \tag{8}P\lbrack p_{n + 1} &= p_n - N^{-1} \mid p_n\rbrack = \psi(p_n), \end{align*} and \begin{equation*}\tag{9}P\lbrack p_{n + 1} = p_n \mid p_n\rbrack = 1 - \varphi(p_n) - \psi(p_n)\end{equation*} where $\varphi(p) = c\pi_{21}(1 - p)$, $\psi(p) = c\pi_{12}p$, $0 < c \leqq 1$, and for the sake of this discussion we suppose that $0 < \pi_{12}, \pi_{21} < 1$.
In this case a limiting distribution $F_{N, \varphi,\psi}$ of $p_n$ as $n \rightarrow \infty$ exists and is independent of the distribution of $p_1$ by a standard theorem on Markov chains. Estes [2] showed that the limit is binomial over $S_N$ with mean $r = \pi_{21}/(\pi_{12} + \pi_{21})$. It then follows from the central limit theorem that $\lim_{N \rightarrow \infty}\lim_{n \rightarrow \infty} P\lbrack N^{\frac{1}{2}}(p_n - r) \leqq x\rbrack = \Phi\lbrack x/(r(1 - r))^{\frac{1}{2}}\rbrack.$ In Section 5 it is shown that our method permits an extension of this result to much more general $\varphi$ and $\psi$.
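
The binomial form of the limit for the pattern model can be verified numerically for any particular $N$. The sketch below is illustrative only: the helper name `transition_matrix` and the values of $N$, $c$, $\pi_{12}$, and $\pi_{21}$ are assumptions chosen for the example. It builds the transition matrix from (7)-(9) with the linear $\varphi$ and $\psi$ given above, then checks that the binomial weights with parameter $r$ are left unchanged by one step of the chain and that $N$ times their variance equals $r(1 - r)$.

```python
from math import comb


def transition_matrix(N, c, pi12, pi21):
    """Transition matrix on S_N = {j/N} from (7)-(9), with
    phi(p) = c*pi21*(1 - p) and psi(p) = c*pi12*p.
    The function name is a hypothetical choice for this sketch."""
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    for j in range(N + 1):
        p = j / N
        up = c * pi21 * (1.0 - p)   # (7): move to (j + 1)/N; zero when j == N
        down = c * pi12 * p         # (8): move to (j - 1)/N; zero when j == 0
        if j < N:
            P[j][j + 1] = up
        if j > 0:
            P[j][j - 1] = down
        P[j][j] = 1.0 - up - down   # (9): stay put
    return P


# Illustrative values (assumed, not taken from the paper).
N, c, pi12, pi21 = 20, 0.8, 0.3, 0.6
r = pi21 / (pi12 + pi21)

P = transition_matrix(N, c, pi12, pi21)
w = [comb(N, j) * r**j * (1.0 - r) ** (N - j) for j in range(N + 1)]

# One step of the chain applied to the binomial weights.
w_next = [sum(w[i] * P[i][j] for i in range(N + 1)) for j in range(N + 1)]
max_err = max(abs(a - b) for a, b in zip(w, w_next))

mean = sum((j / N) * w[j] for j in range(N + 1))
var = sum((j / N - mean) ** 2 * w[j] for j in range(N + 1))

print("max |wP - w|:", max_err)
print("mean vs r:", mean, r)
print("N * variance vs r(1 - r):", N * var, r * (1.0 - r))
```

The variance identity is what produces the scale factor $(r(1 - r))^{\frac{1}{2}}$ in the normal limit stated above.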

#### Article information

Source
Ann. Math. Statist., Volume 37, Number 2 (1966), 393-405.

Dates
First available in Project Euclid: 27 April 2007

https://projecteuclid.org/euclid.aoms/1177699521

Digital Object Identifier
doi:10.1214/aoms/1177699521

Mathematical Reviews number (MathSciNet)
MR192535

Zentralblatt MATH identifier
0139.34802
