Abstract
This paper is concerned with the problem of estimating a probability density which is known to be measurable with respect to a $\sigma$-lattice of subsets of the space on which it is defined. Our solution is represented as a conditional expectation. (Generally in the literature "conditional expectation" refers to "conditional expectation given a $\sigma$-field." However, since we shall be concerned exclusively with "conditional expectation given a $\sigma$-lattice," we shall use this abbreviated terminology for the latter, more general concept.) Brunk [2] discusses conditional expectations and many of the extremum problems for which they provide solutions. We shall consider the case where the measure space $(\Omega, \mathscr{A}, \mu)$ on which the density is defined is totally finite. Let $\mathscr{L}$ denote a $\sigma$-lattice of subsets of $\Omega$ ($\mathscr{L} \subset \mathscr{A}$). A $\sigma$-lattice, by definition, is closed under countable unions and intersections and contains both $\Omega$ and the null set $\varnothing$. Let $\omega_1, \omega_2, \cdots, \omega_n$ be a sample of independent observations chosen in $\Omega$ according to the unknown, $\mathscr{L}$-measurable density $f$. We say that a point is chosen in $\Omega$ according to $f$ if the probability that it will lie in any set $A$ in $\mathscr{A}$ is given by $\int_A f \, d\mu$. The function $f$ is $\mathscr{L}$-measurable if the set $\lbrack f > a\rbrack$ is in $\mathscr{L}$ for each real number $a$. We shall use the maximum likelihood criterion for choosing an estimate. In other words, we wish to find an $\mathscr{L}$-measurable density $\hat f$ such that the product of the values of $\hat f$ at the observed points is at least as large as the product of the values of any other $\mathscr{L}$-measurable density at those points. Such a function will be called a maximizing function.
Clearly the $\sigma$-lattice $\mathscr{L}$ must satisfy some restrictions in order for the problem to be of any interest at all. For example, if $(\Omega, \mathscr{A}, \mu)$ is a finite subinterval of the real line together with Borel subsets and Lebesgue measure, and if $\mathscr{L} = \mathscr{A}$, then there are many obvious solutions if the density is bounded, and none at all if it is not bounded. The second section of this paper is devoted to the restrictions that we impose and to showing that these restrictions are satisfied in some problems which are of interest. The fourth section of this paper is devoted to some results on the asymptotic properties of our estimates in three special cases. The methods used are similar to those used by Marshall and Proschan [4]. The final section contains some observations on the problem of estimating a density on a non-totally finite measure space.
Let $L_2$ denote the set of square integrable random variables and $L_2(\mathscr{L})$ the collection of all those members of $L_2$ which are $\mathscr{L}$-measurable. Let $R(\mathscr{L})$ denote the collection of all $\mathscr{L}$-measurable random variables. Let $\mathscr{B}$ denote the collection of Borel subsets of the real line. We shall adopt the following definition for the conditional expectation, $E_\mu(f \mid \mathscr{L})$, of a random variable given a $\sigma$-lattice.
DEFINITION 1.1. If $f \epsilon L_2$ then $g \epsilon L_2(\mathscr{L})$ is equal to $E_\mu(f \mid \mathscr{L})$ if and only if $g$ has both of the following properties:
\begin{equation*}\tag{1.1}\int (f - g)h \, d\mu \leqq 0 \quad\text{for all}\quad h \epsilon L_2(\mathscr{L})\end{equation*}
and
\begin{equation*}\tag{1.2}\int_B (f - g) \, d\mu = 0 \quad\text{for all}\quad B \epsilon g^{-1}(\mathscr{B}).\end{equation*}
(Brunk [1] shows that there is such a random variable $g$ associated with each $f \epsilon L_2$ and that $g$ is unique in the sense that if $g'$ is any other member of $L_2(\mathscr{L})$ having these properties then $g = g'\lbrack\mu\rbrack$.)
In order to motivate the consideration of a problem such as the one we take up in this paper, we introduce the following examples. The first example is discussed in [5].
EXAMPLE 1.1. Suppose $\Omega$ is a finite set and we denote its elements by $1, 2, \cdots, k$. Let $\mathscr{A}$ be the collection of all subsets of $\Omega$ and suppose $\mu$ assigns positive mass to each point in $\Omega$. Suppose $\mathscr{L}$ is an arbitrary $\sigma$-lattice of subsets of $\Omega$. Let $n_i$ denote the number of times the point $i$ is observed. We wish to find an $\mathscr{L}$-measurable density $\hat f$ on $\Omega$ such that $\prod^k_{i = 1} \hat f(i)^{n_i} \geqq \prod^k_{i = 1} h(i)^{n_i}$ for every other $\mathscr{L}$-measurable density $h$. It is shown in [5] that a solution is given by $\hat f = E_\mu(g \mid \mathscr{L})$ where $g(i) = n_i\lbrack n \cdot \mu(i)\rbrack^{-1}$.
The problem posed in the next example was solved by Pyke (personal communication with Professor Brunk) and for a monotone density by Grenander [3].
EXAMPLE 1.2. Suppose $\Omega$ is a closed subinterval of the real line ($\Omega = \lbrack c, d\rbrack$), $\mathscr{A}$ is the collection of Borel subsets of $\Omega$, and $\mu$ is Lebesgue measure. We wish to estimate the density $f$ which is known to be unimodal at some unknown point in $\Omega$. Suppose that our observations are ordered: $\omega_1 < \omega_2 < \cdots < \omega_n$. If $h$ is any unimodal density with mode at $a$ and $\omega_j < a < \omega_{j + 1}$ then define the function $g$ by:
\begin{equation*}g(x) = \begin{cases} h(\omega_j) & (\omega_j \leqq x < a) \\ h(\omega_{j + 1}) & (a \leqq x \leqq \omega_{j + 1}) \\ h(x) & \text{otherwise.} \end{cases}\end{equation*}
It is easily seen that the density $\hat f = \lbrack \int g \, d\mu\rbrack^{-1} \cdot g$ is unimodal with mode at $\omega_j$ or $\omega_{j+1}$ and that the product of the values of $\hat f$ at the observed points is at least as large as the product of the corresponding values of $h$. Hence our problem reduces to finding an estimate which has mode at one of the observed points. Similarly we can show that any maximizing estimate must be constant on every open interval joining two consecutive observed values. The next remark is easy to verify.
REMARK 1.1. Let $\mathscr{L}$ be the $\sigma$-lattice of subsets of $\Omega$ consisting of all those intervals containing the point $a$. A function $f$ on $\Omega$ is unimodal at $a$ if and only if $f$ is $\mathscr{L}$-measurable.
If we can estimate the density subject to the restriction that it is unimodal at a fixed point, then we can select an estimate by comparing the ones we get by assuming the mode is at particular observations. We see thus that the problem of estimating a unimodal density reduces to estimating a density which is measurable with respect to a $\sigma$-lattice and constant on intervals joining consecutive observations. We shall see that these results for a unimodal density are typical of a larger class of problems.
Now consider the problem of estimating a unimodal density on the reals together with Lebesgue measure. Clearly any estimate must be zero outside of the smallest closed interval containing all the observed points. Hence this problem reduces to the one discussed above.
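As a hedged numerical illustration of Example 1.1, the sketch below assumes one particular $\sigma$-lattice that the paper does not single out: the initial segments $\{1, \cdots, j\}$ together with $\varnothing$, for which the $\mathscr{L}$-measurable functions are exactly the nonincreasing ones. Under that assumption $E_\mu(g \mid \mathscr{L})$ is the $\mu$-weighted least-squares projection of $g$ onto nonincreasing functions, which the pool-adjacent-violators algorithm computes. The function names `antitonic_regression` and `mle_nonincreasing_density` and the sample counts are hypothetical and chosen only for this sketch.

```python
def antitonic_regression(values, weights):
    """Weighted least-squares projection onto nonincreasing sequences
    (pool-adjacent-violators)."""
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Pool while a later block exceeds the one before it.
        while len(blocks) > 1 and blocks[-1][0] > blocks[-2][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w12 = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w12, w12, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def mle_nonincreasing_density(counts, mu):
    """Estimate of Example 1.1 for the (assumed) lattice of initial segments.

    counts[i] is n_i and mu[i] is the positive mass of point i.
    """
    n = sum(counts)
    # g(i) = n_i / (n * mu(i)), the raw density of the empirical distribution.
    g = [c / (n * m) for c, m in zip(counts, mu)]
    return antitonic_regression(g, mu)


# Hypothetical sample: k = 4 points of unit mass, n = 10 observations.
counts = [3, 5, 1, 1]
mu = [1.0, 1.0, 1.0, 1.0]
f_hat = mle_nonincreasing_density(counts, mu)
total_mass = sum(f * m for f, m in zip(f_hat, mu))
print(f_hat, total_mass)
```

Here the raw values $g = (0.3, 0.5, 0.1, 0.1)$ violate monotonicity at the first two points, so those two are pooled to their weighted average. By (1.2) with $B = \Omega$ the fitted values integrate to one against $\mu$, so no renormalization is needed.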
Citation
Tim Robertson. "On Estimating a Density which is Measurable with Respect to a $\sigma$-Lattice." Ann. Math. Statist. 38 (2) 482 - 493, April, 1967. https://doi.org/10.1214/aoms/1177698964