On Some Robust Estimates of Location

Peter J. Bickel

doi:10.1214/aoms/1177700058

Ann. Math. Statist. 36(3): 847-858 (June, 1965). DOI: 10.1214/aoms/1177700058

Abstract

During the past 15 years various approaches have been proposed to deal with the lack of robustness of the sample mean as an estimate of the population mean when the distribution sampled is contaminated by gross errors, i.e., has heavier tails than the normal distribution. First, Tukey and the Statistical Research Group at Princeton in [9] suggested and investigated the properties of "trimmed" and "Winsorized" means. More recently, Hodges and Lehmann [6] proposed estimates related to the well-known robust Wilcoxon and normal scores tests, among others. Finally, Huber in [7] considered essentially the class of maximum likelihood estimates and found those members of this class which minimize the maximum variance over various classes of contaminated distributions. For a review of work in these directions in testing as well as estimation, the reader is referred to Elashoff [3]. In Theorems 3.1 and 3.2 we state the main results of the asymptotic theory of the Winsorized and trimmed means and outline the proof. An alternative method of trimming and Winsorizing (not equivalent to that of Tukey) which encompasses the efficient estimates proposed by Huber and which generalizes to higher dimensions is discussed in Section 2. The fourth section (Theorem 4.1) gives the minimum efficiency of the Winsorized and trimmed means relative to the mean over the families of all symmetric and all symmetric unimodal distributions. The lower bounds found for the trimmed means (for small trimming proportions) in the unimodal case compare well with the bound found by Hodges and Lehmann in [5] for the median of averages of pairs, the Hodges-Lehmann estimate. However, the Winsorized mean (for unimodal distributions) has minimum efficiency $\frac{1}{3}$ with respect to the mean, whatever trimming proportion is used. For all distributions, the minimum efficiency is 0.
Also in the fourth section (Theorem 4.2), we compare the trimmed mean to the $H-L$ estimate and find that while the latter can be infinitely more efficient than the former, the $H-L$ estimate, for small trimming proportions, $\alpha = .05$, is at least 90 per cent (approximately) as efficient. This would suggest that unless the computations involved are prohibitive, the $H-L$ estimate is to be preferred in any situation where the degree of contamination and type of distribution are not known with great precision. The same remarks apply to the Winsorized mean with only somewhat less force, since the lower bounds involved are .74 for all symmetric distributions and .79 for symmetric unimodal distributions. Finally, we compare the principal estimate proposed by Huber in [7] (Proposal 2) to the mean and the Hodges-Lehmann estimate, both for all symmetric densities and for the symmetric unimodal family. Results similar to those already mentioned in connection with the trimmed mean are obtained in Theorems 5.1 and 5.2.
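
For readers who want a concrete picture of the three estimates compared above, the following is a minimal Python sketch, not taken from the paper. The function names are hypothetical. It assumes that a proportion $\alpha$ of the sample is trimmed or Winsorized in each tail, with the count rounded down, and that the "median of averages of pairs" includes each observation paired with itself; conventions differ on both points, and the paper's own definitions may not match.

```python
from itertools import combinations_with_replacement
from statistics import mean, median


def _tail_count(n, alpha):
    # Number of observations affected in EACH tail (assumed: floor of alpha * n).
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    return int(alpha * n)


def trimmed_mean(xs, alpha):
    """Mean of the sample after discarding the k smallest and k largest values."""
    s = sorted(xs)
    k = _tail_count(len(s), alpha)
    return mean(s[k:len(s) - k])


def winsorized_mean(xs, alpha):
    """Mean after replacing the k extreme values in each tail by the nearest kept value."""
    s = sorted(xs)
    k = _tail_count(len(s), alpha)
    core = s[k:len(s) - k]
    return mean([core[0]] * k + core + [core[-1]] * k)


def hodges_lehmann(xs):
    """Median of the pairwise averages (x_i + x_j) / 2 over i <= j (assumed convention)."""
    return median((a + b) / 2 for a, b in combinations_with_replacement(xs, 2))
```

On a sample with one gross error, such as `[1, 2, 3, 4, 100]`, all three functions stay near the bulk of the data while the ordinary mean is pulled toward the outlier. The pairwise-average computation grows quadratically with the sample size, which is the kind of computational cost the abstract alludes to.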