The Annals of Mathematical Statistics

Tables for the Distribution of the Number of Exceedances

Benjamin Epstein


Abstract

Consider a random sample of size $n$ taken from a continuous distribution $f(x)$. Let another random sample, independent of the first and also of size $n$, be drawn from the same population. Let $U^n_r$ be the number of values in the second sample which exceed the $r$th smallest value in the first sample. Similarly, let $V^n_s$ be the number of values in the second sample which exceed the $s$th largest value in the first sample. Because the $r$th smallest value in a sample of size $n$ is at the same time the $s$th largest value in the sample with $s = n - r + 1$, it follows that \begin{equation*}\tag{1} \mathrm{Pr} (U^n_r = x) \equiv \mathrm{Pr} (V^n_s = x),\end{equation*} $s = n - r + 1; \quad r = 1,2, \cdots, n;\quad x = 0,1,2, \cdots, n.$ The probability distribution of $U^n_r$ (and hence of $V^n_s$) is given by: \begin{equation*}\tag{2} \mathrm{Pr} (U^n_r = x) = \binom{n-x+r-1}{r-1}\binom{n-r+x}{x}/\binom{2n}{n} = \frac{1}{2}P_{n-x+r-1,r-1} P_{n-r+x,x}/P_{2n,n},\end{equation*} $x = 0, 1, 2, \cdots, n.$ Formula (2) can be proved by combinatorial methods; details are omitted. An alternative formula, derived in another way [3], is \begin{equation*}\tag{2a} \mathrm{Pr}(U^n_r = x) = \frac{1}{2}\binom{n-1}{r-1}\binom{n}{x}/\binom{2n-1}{n-r+x} = \frac{1}{2} P_{n-1,r-1}P_{n,x}/P_{2n-1,n-r+x}.\end{equation*} In formulae (2) and (2a), $P_{n,x} = (\frac{1}{2})^n\binom{n}{x}$. Formulae in terms of $P_{n,x}$ are particularly convenient for hand computation, since one can use the extensive tables of the binomial probability distribution published by the National Bureau of Standards.
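As an illustration of formulae (2) and (2a), the following minimal Python sketch evaluates both forms in exact rational arithmetic, together with the $P_{n,x}$ form of (2). The function names (`pmf_formula_2`, `pmf_formula_2a`, `binomial_half`, `pmf_via_binomial_tables`) are hypothetical and do not come from the paper; the sketch only assumes the formulae as stated above.

```python
from fractions import Fraction
from math import comb


def pmf_formula_2(n, r, x):
    """Pr(U^n_r = x) from formula (2), as an exact fraction."""
    numerator = comb(n - x + r - 1, r - 1) * comb(n - r + x, x)
    return Fraction(numerator, comb(2 * n, n))


def pmf_formula_2a(n, r, x):
    """Pr(U^n_r = x) from the alternative formula (2a)."""
    numerator = comb(n - 1, r - 1) * comb(n, x)
    return Fraction(numerator, 2 * comb(2 * n - 1, n - r + x))


def binomial_half(n, x):
    """P_{n,x} = (1/2)^n * C(n, x), the symmetric binomial probability."""
    return Fraction(comb(n, x), 2 ** n)


def pmf_via_binomial_tables(n, r, x):
    """Second form of (2): (1/2) * P_{n-x+r-1,r-1} * P_{n-r+x,x} / P_{2n,n}."""
    return (Fraction(1, 2)
            * binomial_half(n - x + r - 1, r - 1)
            * binomial_half(n - r + x, x)
            / binomial_half(2 * n, n))


if __name__ == "__main__":
    n, r = 5, 2  # illustrative values only
    for x in range(n + 1):
        print(x, pmf_formula_2(n, r, x), pmf_formula_2a(n, r, x))
```

Exact fractions are used here so that agreement between the two forms can be checked by equality rather than to within rounding error.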
If the values of $\mathrm{Pr} (U^n_r \leqq x)$, for $x = 0, 1, 2, \cdots, n - 1$ and $r = 1,2, \cdots, n$, are written (for fixed $n$) in matrix form, one notes certain useful symmetries, which can be expressed by the identities \begin{align*}\tag{3} \mathrm{Pr} (U^n_r \leqq x) &= \mathrm{Pr} (U^n_{x+1} \leqq r - 1), \\ \tag{4} \mathrm{Pr} (U^n_r \leqq x) &+ \mathrm{Pr} (U^n_{n-r+1} \leqq n - x - 1) = 1. \end{align*} If one takes $x = n - r$ in (4) and uses the relation (3), it is readily verified that \begin{equation*}\tag{5} \mathrm{Pr} (U^n_r \leqq n - r) = \frac{1}{2}.\end{equation*} Proofs of (3), (4), and (5) can be obtained by using the results of pages 257-258 of [3]. Because of these symmetries, the complete matrix (for any fixed $n$) can be constructed if one knows only the quantities $\mathrm{Pr} (U^n_r \leqq x), r = 1(1)\lbrack n/2\rbrack, x = r - 1, r, r + 1, \cdots, n - r - 1.$ In Table 1 these values are given for $n = 2(1)15(5)20.$ To see how the complete matrix is obtained from Table 1, it is instructive to verify, using (3), (4), and (5), that the complete matrix, in the special case $n = 5,$ is given by Table 2. A somewhat different, but related, exceedance problem is to take two random samples of size $n$ from a continuous distribution $f(x).$ Let us for convenience attach the letter $x$ to one of the samples and the letter $y$ to the other sample. Further, let $x_{r,n}$ and $y_{r,n}$ be respectively the $r$th smallest observations in each of the samples.
Let us define $Z_{r,n} = \max (x_{r,n}, y_{r,n}).$ If $Z_{r,n} = x_{r,n}$, count the number of $y$'s which are $\geqq x_{r,n};$ if $Z_{r,n} = y_{r,n}$, count the number of $x$'s which are $\geqq y_{r,n}.$ Denoting the number of exceedances by $W^n_r,$ one readily sees from (1) that the probability distribution of $W^n_r$ is given by \begin{equation*}\tag{6} \mathrm{Pr}(W^n_r = x) = 2\binom{n-x+r-1}{r-1}\binom{n-r+x}{x}/\binom{2n}{n},\quad x = 0,1,2,\cdots, n - r.\end{equation*} It is evident from the definition that \begin{equation*}\tag{7} \mathrm{Pr}(W^n_r \leqq x) = 1,\quad x \geqq n - r.\end{equation*} Clearly one can find the values of $\mathrm{Pr}(W^n_r \leqq x)$ by using Table 1. Thus, for example, in the special case $n = 5$ one obtains Table 3.
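The cumulative quantities discussed above can also be computed directly rather than read from tables. The following Python sketch, under the assumption that formulae (2), (6), and (7) hold as stated, builds the matrix of $\mathrm{Pr}(U^n_r \leqq x)$ for a fixed $n$ and the cumulative distribution of $W^n_r$. The names `pmf_u`, `cdf_u`, `cdf_w`, and `cdf_matrix` are hypothetical, and the printed values are not claimed to reproduce the layout of the paper's Tables 2 and 3.

```python
from fractions import Fraction
from math import comb


def pmf_u(n, r, x):
    """Pr(U^n_r = x) from formula (2), as an exact fraction."""
    numerator = comb(n - x + r - 1, r - 1) * comb(n - r + x, x)
    return Fraction(numerator, comb(2 * n, n))


def cdf_u(n, r, x):
    """Pr(U^n_r <= x), obtained by summing formula (2) over 0..x."""
    return sum((pmf_u(n, r, k) for k in range(x + 1)), Fraction(0))


def cdf_w(n, r, x):
    """Pr(W^n_r <= x): twice the U cdf below n - r by (6), and 1 from n - r on by (7)."""
    if x >= n - r:
        return Fraction(1)
    return 2 * cdf_u(n, r, x)


def cdf_matrix(n):
    """Rows r = 1..n, columns x = 0..n-1, entries Pr(U^n_r <= x)."""
    return [[cdf_u(n, r, x) for x in range(n)] for r in range(1, n + 1)]


if __name__ == "__main__":
    n = 5  # the special case used for illustration in the text
    for row in cdf_matrix(n):
        print("  ".join(f"{float(p):.4f}" for p in row))
```

With such a matrix in hand, identities (3), (4), and (5) can be checked entry by entry for any chosen $n$, which is a convenient sanity check on a hand-constructed table.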

Article information

Source
Ann. Math. Statist., Volume 25, Number 4 (1954), 762-768.

Dates
First available in Project Euclid: 28 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aoms/1177728662

Digital Object Identifier
doi:10.1214/aoms/1177728662

Mathematical Reviews number (MathSciNet)
MR65074

Zentralblatt MATH identifier
0056.37503

JSTOR
links.jstor.org

Citation

Epstein, Benjamin. Tables for the Distribution of the Number of Exceedances. Ann. Math. Statist. 25 (1954), no. 4, 762--768. doi:10.1214/aoms/1177728662. https://projecteuclid.org/euclid.aoms/1177728662

