The Annals of Mathematical Statistics

A New Proof of the Bahadur Representation of Quantiles and an Application

J. K. Ghosh

Abstract

Let $\{X_i\}$ be a sequence of independent random variables with the same distribution function $F(x) = \Pr\{X_i \leqq x\}$. Let $F(M_p) = p, 0 < p < 1$. Suppose $F$ has two derivatives in a neighborhood of $M_p$, $F''(x)$ is bounded there and $F'(M_p)$ is positive. Let $Y_{p,n}$ be a sample $p$-quantile based on $X_1, \cdots, X_n$. Let $nG_n(x)$ be the number of $X_i$'s among $(X_1, \cdots, X_n)$ which are $> x$. Bahadur (1966) has proved \begin{equation*}\tag{1}Y_{p,n} = M_p + \lbrack G_n(M_p) - (1 - p)\rbrack/F'(M_p) + R_n\end{equation*} where the remainder term $R_n$ becomes negligible as $n \rightarrow \infty$. More precisely, he has shown $R_n = O(n^{-\frac{3}{4}} \log n)$ a.s. as $n \rightarrow \infty$. The best result of this type is due to Kiefer (1967), who has calculated the exact order of $R_n$. Sen (1968) has extended Bahadur's result to random variables which are neither independent nor identically distributed. We shall give a new and much simpler proof of a weaker version of Bahadur's result which suffices for many statistical applications. Our proof involves fewer assumptions than Bahadur's. For arbitrary $p_n$ let $M_{p_n}$ be defined as $M_p + (p_n - p)/F'(M_p)$. Consider \begin{equation*}\tag{2}Y_{p_n,n} = M_{p_n} + \lbrack G_n(M_p) - (1 - p)\rbrack/F'(M_p) + R_n\end{equation*} where $Y_{p_n,n}$ is a sample $p_n$-quantile. In Section 2 we have proved the following result about $R_n$. THEOREM 1. Suppose $F'(M_p)$ exists and is strictly positive and $p_n - p = O(1/n^{\frac{1}{2}})$. Then $R_n$ as defined in (2) (and, a fortiori, $R_n$ as defined in (1)) satisfies \begin{equation*}\tag{3}n^{\frac{1}{2}} R_n \rightarrow 0 \text{ in probability}.\end{equation*} (After writing this paper the author discovered that the result for $p_n = p$ is stated without proof in Chernoff et al. (1967).) It is easy to extend this result as in Sen (1968). An outline is sketched in one of the remarks. Once again it is possible to achieve some economy in assumptions.
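The representation (1) is easy to check numerically. The sketch below (not part of the paper) simulates i.i.d. draws from a distribution of my choosing, the standard exponential with $F(x) = 1 - e^{-x}$, so that $M_p = -\log(1-p)$ and $F'(M_p) = 1 - p$ are available in closed form, and compares the sample $p$-quantile $Y_{p,n}$ with the linear term $M_p + [G_n(M_p) - (1-p)]/F'(M_p)$; the scaled remainder $n^{1/2}R_n$ should tend toward zero as $n$ grows, as Theorem 1 asserts.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5
M_p = -np.log(1 - p)      # population p-quantile of Exp(1)
f_Mp = 1 - p              # F'(M_p) = e^{-M_p} = 1 - p for Exp(1)

results = {}
for n in [100, 10_000, 1_000_000]:
    x = rng.exponential(size=n)
    y_pn = np.quantile(x, p)                  # sample p-quantile Y_{p,n}
    g_n = np.mean(x > M_p)                    # G_n(M_p): fraction of X_i above M_p
    bahadur = M_p + (g_n - (1 - p)) / f_Mp    # linear approximation from (1)
    r_n = y_pn - bahadur                      # remainder R_n
    results[n] = np.sqrt(n) * r_n
    print(f"n = {n:>9}: sqrt(n) * R_n = {results[n]:+.5f}")
```

Since the convergence in (3) is only in probability, a single simulated path need not shrink monotonically, but for large $n$ the scaled remainder is typically small (Bahadur's a.s. bound gives $n^{1/2}R_n = O(n^{-1/4}\log n)$).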
The representation (1) is not new. Its use in deriving the asymptotic moments of $Y_{p,n}$ goes back to Karl Pearson. See, for example, (1) in Hojo (1931). But the formulation therein is very imprecise and lacks a rigorous justification. We next consider an application of Theorem 1. Let $\bar{X}_n = (\sum^n_1 X_i)/n$ and $P_n =$ proportion of $X_i$'s above $\bar{X}_n$. David (1962) proved the asymptotic normality of $P_n$ when $F$ is a normal distribution function. Using the same elegant trick, Mustafi (1968) has proved a similar result for bivariate normal distributions. We shall extend these results considerably by providing alternative proofs based on Theorem 1, which dispense with the normality assumption on $F$. Moreover, in our proof we may consider--though we shall not do so for purposes of simplicity--instead of the sample mean $\bar{X}_n$ a $U$-statistic to which the central limit theorem of Hoeffding (1948) applies.
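The application can also be illustrated by simulation. The sketch below (again not from the paper, and again using an Exp(1) distribution as an assumed example of a non-normal $F$) draws repeated samples, computes $P_n$, the proportion of observations above the sample mean $\bar{X}_n$, and checks that $P_n$ concentrates around its population analogue $1 - F(\mu) = e^{-1}$, consistent with the asymptotic normality of $P_n$ without any normality assumption on $F$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 2_000, 500
# For Exp(1): mean mu = 1, and 1 - F(mu) = exp(-1) ≈ 0.368
target = np.exp(-1.0)

p_n = np.empty(reps)
for i in range(reps):
    x = rng.exponential(size=n)
    p_n[i] = np.mean(x > x.mean())   # P_n: proportion of X_i above the sample mean

print(f"mean of P_n over {reps} replications: {p_n.mean():.4f} (target {target:.4f})")
print(f"sd of sqrt(n) * (P_n - target):       {np.sqrt(n) * p_n.std():.4f}")
```

The centered histogram of $\sqrt{n}(P_n - e^{-1})$ is approximately Gaussian; the paper's result identifies the limiting variance via the representation (1).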

Article information

Source
Ann. Math. Statist., Volume 42, Number 6 (1971), 1957-1961.

Dates
First available in Project Euclid: 27 April 2007

https://projecteuclid.org/euclid.aoms/1177693063

Digital Object Identifier
doi:10.1214/aoms/1177693063

Mathematical Reviews number (MathSciNet)
MR297071

Zentralblatt MATH identifier
0235.62006
