## Abstract

For each $n$ let $t_n$ be an estimate (based on $n$ independent and identically distributed observations) of a real-valued parameter $\theta$. Suppose that, for each $\theta$, $n^{\frac{1}{2}}(t_n - \theta)$ is asymptotically normally distributed with mean zero and variance $v(\theta)$. According to R. A. Fisher we then have \begin{equation*}\tag{1}v(\theta) \geqq I^{-1}(\theta),\end{equation*} where $I$ is the information contained in a single observation. It is known, however, that in the absence of regularity conditions on the sequence $\{t_n\}$, (1) does not necessarily hold for each $\theta$. On the other hand, according to LeCam (1952, 1953, 1958) the set of points $\theta$ for which (1) fails is always of Lebesgue measure zero. This note gives a simple proof of the stated result of LeCam, along the following lines. First a sufficient condition for the validity of (1) at a given value of $\theta$, say $\theta^0$, is obtained. This is a little weaker than the condition that $t_n$ be asymptotically median-unbiased (i.e., $P(t_n < \theta \mid \theta) \rightarrow \frac{1}{2}$ as $n \rightarrow \infty$) uniformly for $\theta$ in some neighborhood of $\theta^0$. It is then shown that the sufficient condition is automatically satisfied at almost all $\theta^0$.

The main propositions are stated in the following paragraphs of this section, and the proofs are given in Section 2. The proofs depend on the Neyman-Pearson lemma concerning the optimality of the likelihood ratio test of a simple hypothesis against a simple alternative. This lemma is made available in the present context by means of the well-known considerations that an estimate of $\theta$ can provide a test of the value of $\theta$, and that the quality of the resulting test is heavily dependent on the quality of the estimate. A similar application of the Neyman-Pearson lemma to estimation theory is made in Bahadur (1960).
It is shown there that if, instead of asymptotic variances, one considers quantities called asymptotic effective variances, Fisher's bound becomes valid for all $\theta$.

Now let $X = \{x\}$ be a sample space of points $x$, $\mathscr{B}$ a $\sigma$-field of sets of $X$, and $\{P(\cdot \mid \theta): \theta \in \Theta\}$ a set of probability measures $P(\cdot \mid \theta)$ on $\mathscr{B}$, where $\theta$ is a real parameter and $\Theta$ is an open interval on the real line. It is assumed that the following conditions (i)-(iv) are satisfied.

(i) There exists a $\sigma$-finite measure on $\mathscr{B}$, say $\mu$, such that, for each $\theta$, $P(\cdot \mid \theta)$ admits a probability density with respect to $\mu$, $f(x \mid \theta)$ say, i.e., \begin{equation*}\tag{2}P(B \mid \theta) = \int_B f(x \mid \theta)\, d\mu \quad \text{for all } B \in \mathscr{B},\ \theta \in \Theta.\end{equation*}

(ii) For each $x \in X$, \begin{equation*}\tag{3}L(\theta \mid x) = \log f(x \mid \theta)\end{equation*} is a twice-differentiable function of $\theta$ and the second derivative is continuous in $\theta$.

(iii) With dashes on $L$ denoting partial differentiation with respect to $\theta$, \begin{equation*}\tag{4}0 < E(\{L'(\theta \mid x)\}^2 \mid \theta) \equiv I(\theta) < \infty,\quad E(L'(\theta \mid x) \mid \theta) = 0,\end{equation*} and \begin{equation*}\tag{5}E(L''(\theta \mid x) \mid \theta) = -I(\theta)\end{equation*} for every $\theta$.

(iv) For any given $\theta^0$ in $\Theta$, there exists a $\delta > 0$ and a $\mathscr{B}$-measurable function $M(x)$ such that $|L''(\theta \mid x)| \leqq M(x)$ for all $x \in X$ and all $\theta \in (\theta^0 - \delta, \theta^0 + \delta)$, and such that $E(M(x) \mid \theta^0) < \infty$; $\delta$ and $M$ are, of course, allowed to depend on the given $\theta^0$.
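
As a quick check of how conditions (i)-(iv) read in a concrete case, consider the normal location model. This model is an illustrative assumption and is not taken from the text: $\mu$ is Lebesgue measure on the real line, $\Theta$ is the whole real line, and $f(x \mid \theta) = (2\pi)^{-\frac{1}{2}} \exp\{-\frac{1}{2}(x - \theta)^2\}$. Then \begin{equation*}L(\theta \mid x) = -\tfrac{1}{2}\log(2\pi) - \tfrac{1}{2}(x - \theta)^2,\quad L'(\theta \mid x) = x - \theta,\quad L''(\theta \mid x) = -1,\end{equation*} so that \begin{equation*}E(\{L'(\theta \mid x)\}^2 \mid \theta) = 1 = I(\theta),\quad E(L'(\theta \mid x) \mid \theta) = 0,\quad E(L''(\theta \mid x) \mid \theta) = -1 = -I(\theta).\end{equation*} Condition (iv) holds with $M(x) \equiv 1$ and any $\delta > 0$. In this model Fisher's bound (1) reads $v(\theta) \geqq 1$, and the sample mean attains it, since $n^{\frac{1}{2}}(\bar{x}_n - \theta)$ is exactly normal with mean 0 and variance 1.
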
The above conditions are a simplification, along the lines of LeCam (1953), of the conditions formulated by Cramér (1946) for his analysis of the likelihood equation. It may be added that the present regularity conditions are in a sense weaker than those of LeCam (1953, 1958), since the method of proof of the latter papers requires local conditions such as (ii)-(iv) and also the existence and consistency of maximum likelihood estimates based on independent observations on $x$.

Let $x_1, x_2, \cdots$ denote a sequence of independent and identically distributed observations on $x$. For each $n = 1, 2, \cdots$ write $x^{(n)} = (x_1, \cdots, x_n)$. Let $X^{(n)}$ denote the sample space of $x^{(n)}$, and $\mathscr{B}^{(n)}$ the $\sigma$-field of sets of $X^{(n)}$ which is determined by the given $\mathscr{B}$ in the usual way. For any measure $Q$ on $\mathscr{B}$, the corresponding product measure on $\mathscr{B}^{(n)}$ will be denoted by $Q^{(n)}$. For simplicity, $Q^{(n)}$ is abbreviated to $Q$ in cases where the domain of $Q^{(n)}$ is plain from the context.

Now let there be given a sequence $\{t_n\}$ such that $t_n$ is a $\mathscr{B}^{(n)}$-measurable function on $X^{(n)}$ into $\Theta$ $(n = 1, 2, \cdots)$. It is assumed that for each $\theta$ in $\Theta$ there exists a positive constant $v(\theta)$ such that, as $n \rightarrow \infty$, $n^{\frac{1}{2}}(t_n - \theta)$ is asymptotically normally distributed with mean 0 and variance $v(\theta)$ when $\theta$ obtains. (For a treatment of the case when the present assumption $0 < v < \infty$ is weakened to $0 \leqq v < \infty$, cf. the last paragraph of Section 3.) The given sequence $\{t_n\}$ will remain fixed throughout. We note that \begin{equation*}\tag{6}\lim_{n \rightarrow \infty} P(t_n < \theta \mid \theta) = \frac{1}{2}\end{equation*} for each $\theta$ in $\Theta$.

PROPOSITION 1.
If $\theta^0$ is a point in $\Theta$, and if \begin{equation*}\tag{7}\liminf_{n \rightarrow \infty} P(t_n < \theta^0 + n^{-\frac{1}{2}} \mid \theta^0 + n^{-\frac{1}{2}}) \leqq \frac{1}{2},\end{equation*} then (1) holds for $\theta = \theta^0$.

It follows from Proposition 1 by symmetry that if \begin{equation*}\tag{8}\liminf_{n \rightarrow \infty} P(t_n > \theta^0 - n^{-\frac{1}{2}} \mid \theta^0 - n^{-\frac{1}{2}}) \leqq \frac{1}{2},\end{equation*} then (1) also holds for $\theta = \theta^0$. Another consequence of Proposition 1 is that if (6) holds uniformly for $\theta$ in some open interval of $\Theta$ then (1) holds for each $\theta$ in that interval. A somewhat weaker conclusion concerning the sufficiency of uniform convergence for (1) has been obtained independently by Rao (1963).

The sequence $\{t_n\}$ is said to be superefficient if $v(\theta) \leqq I^{-1}(\theta)$ for all $\theta$ and the inequality is strict for at least one $\theta$. Examples of superefficient estimates were discovered by J. L. Hodges, Jr. (cf. LeCam (1953)). General studies bearing on superefficiency, using methods different from the present ones, were carried out by LeCam (1953, 1958). An informal discussion along lines similar to those of LeCam was given independently by Wolfowitz (1953). It is shown in LeCam (1953) that if $\{t_n\}$ is superefficient then $v(\theta) = I^{-1}(\theta)$ for almost all $\theta$ in $\Theta$; the following more general conclusion is given in LeCam (1958):

PROPOSITION 2. The set of all $\theta$ in $\Theta$ for which (1) does not hold is of Lebesgue measure zero.

It was observed by Chernoff (1956) that the asymptotic variance of an estimate is always a lower bound to the asymptotic expected squared error; in view of Proposition 2, this observation yields:

PROPOSITION 3. $\liminf_{n \rightarrow \infty} \{ nE[(t_n - \theta)^2 \mid \theta]\} \geqq I^{-1}(\theta)$ for almost all $\theta$ in $\Theta$.
The conclusions stated in this section can be extended to the case when $\theta$ is a $p$-dimensional parameter; a brief account of these extensions is given in Section 3. An extension to sampling frameworks more general than the present one of independent and identically distributed observations is described in Section 4.
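
To make the role of condition (7) concrete, the following sketch works with a standard textbook form of Hodges' superefficient estimate. That form is an assumption here; the text above only cites the example. The observations are taken to be $N(\theta, 1)$, so that $I(\theta) = 1$, and $t_n = \bar{x}_n$ if $|\bar{x}_n| \geqq n^{-\frac{1}{4}}$, $t_n = a\bar{x}_n$ otherwise, with $0 < a < 1$. For this estimate $v(0) = a^2 < 1$, while $v(\theta) = 1$ for $\theta \neq 0$, so (1) fails only at $\theta = 0$. The function name `hodges_prob_below` and the value $a = 0.5$ are hypothetical choices. The code evaluates $P(t_n < \theta \mid \theta)$ exactly from the normal distribution of $\bar{x}_n$, without simulation.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hodges_prob_below(theta, n, a=0.5):
    """Exact P(t_n < theta | theta) for the assumed Hodges-type estimate.

    Model (an assumption, not from the source text): x_i ~ N(theta, 1),
    t_n = xbar if |xbar| >= n**(-1/4), and t_n = a * xbar otherwise.
    """
    c = n ** -0.25            # truncation threshold n^(-1/4)
    sd = n ** -0.5            # standard deviation of xbar

    def cdf_xbar(y):
        return normal_cdf((y - theta) / sd)

    # Region |xbar| >= c, where t_n = xbar: need xbar < theta.
    outside = cdf_xbar(min(-c, theta)) + max(0.0, cdf_xbar(theta) - cdf_xbar(c))
    # Region |xbar| < c, where t_n = a * xbar: need xbar < theta / a.
    upper = min(c, theta / a)
    inside = max(0.0, cdf_xbar(upper) - cdf_xbar(-c))
    return outside + inside


if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        fixed = hodges_prob_below(0.0, n)          # theta held at 0, as in (6)
        moving = hodges_prob_below(n ** -0.5, n)   # theta = n^(-1/2), as in (7)
        print(n, fixed, moving)
```

With $\theta$ held fixed at 0 the probability equals $\frac{1}{2}$ for every $n$, in agreement with (6). Along $\theta = n^{-\frac{1}{2}}$ it tends to $\Phi(a^{-1} - 1)$, which exceeds $\frac{1}{2}$ (about 0.84 for $a = 0.5$), so (7) fails at $\theta^0 = 0$. This is consistent with Proposition 1, which implies that (7) cannot hold at a point where (1) is violated.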

## Citation

R. R. Bahadur. "On Fisher's Bound for Asymptotic Variances." Ann. Math. Statist. 35 (4) 1545 - 1552, December, 1964. https://doi.org/10.1214/aoms/1177700378

## Information