The Annals of Statistics

Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences

Iain M. Johnstone and Bernard W. Silverman

Source: Ann. Statist. Volume 32, Number 4 (2004), 1594-1649.

Abstract

An empirical Bayes approach to the estimation of possibly sparse sequences observed in Gaussian white noise is set out and investigated. The prior considered is a mixture of an atom of probability at zero and a heavy-tailed density γ, with the mixing weight chosen by marginal maximum likelihood, in the hope of adapting between sparse and dense sequences. If estimation is then carried out using the posterior median, this is a random thresholding procedure. Other thresholding rules employing the same threshold can also be used. Probability bounds on the threshold chosen by the marginal maximum likelihood approach lead to overall risk bounds over classes of signal sequences of length n, allowing for sparsity of various kinds and degrees. The signal classes considered are “nearly black” sequences where only a proportion η is allowed to be nonzero, and sequences with normalized ℓp norm bounded by η, for η>0 and 0<p≤2. Estimation error is measured by mean qth power loss, for 0<q≤2. For all the classes considered, and for all q in (0,2], the method achieves the optimal estimation rate as n→∞ and η→0 at various rates, and in this sense adapts automatically to the sparseness or otherwise of the underlying signal. In addition the risk is uniformly bounded over all signals. If the posterior mean is used as the estimator, the results still hold for q>1. Simulations show excellent performance. For appropriately chosen functions γ, the method is computationally tractable and software is available. The extension to a modified thresholding method relevant to the estimation of very sparse sequences is also considered.
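The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' software. It assumes unit noise variance and takes the heavy-tailed density γ to be a Laplace density with a fixed scale a = 0.5, which is one possible choice. The function names (`estimate_weight`, `posterior_median`, `eb_threshold`) are hypothetical. The mixing weight is maximized over the whole interval [0, 1], ignoring any restriction on its range, and the formulas are only numerically adequate for moderate |x|.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf and inverse cdf


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def laplace_parts(x, a):
    """Split g = gamma * phi (convolution) into its theta > 0 and theta < 0 parts.

    Assumes gamma(u) = (a / 2) * exp(-a * |u|); g(x) is the sum of the two parts.
    """
    c = 0.5 * a * math.exp(0.5 * a * a)
    pos = c * math.exp(-a * x) * _STD.cdf(x - a)
    neg = c * math.exp(a * x) * _STD.cdf(-x - a)
    return pos, neg


def estimate_weight(xs, a=0.5, tol=1e-10):
    """Marginal maximum likelihood estimate of the mixing weight w in [0, 1].

    The log likelihood sum(log((1 - w) * phi(x) + w * g(x))) is concave in w,
    so its derivative (the score) is decreasing and can be solved by bisection.
    """
    betas = [sum(laplace_parts(x, a)) / phi(x) - 1.0 for x in xs]

    def score(w):
        return sum(b / (1.0 + w * b) for b in betas)

    if score(0.0) <= 0.0:
        return 0.0
    if score(1.0) >= 0.0:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior_median(x, w, a=0.5):
    """Posterior median of theta given x under the prior (1 - w) * delta_0 + w * gamma."""
    s = abs(x)
    pos, neg = laplace_parts(s, a)
    marginal = (1.0 - w) * phi(s) + w * (pos + neg)
    # The median is exactly zero unless P(theta > 0 | s) exceeds 1/2.
    if w * pos <= 0.5 * marginal:
        return 0.0
    # Otherwise solve P(theta > mu | s) = 1/2 for mu in closed form.
    z = marginal / (w * a * math.exp(0.5 * a * a - a * s))
    mu = s - a - _STD.inv_cdf(z)
    return math.copysign(max(mu, 0.0), x)


def eb_threshold(xs, a=0.5):
    """Estimate w from the data, then apply the posterior median to each x."""
    w = estimate_weight(xs, a)
    return w, [posterior_median(x, w, a) for x in xs]


if __name__ == "__main__":
    data = [0.0] * 90 + [6.0] * 10  # toy input: mostly zero, a few large values
    w_hat, estimates = eb_threshold(data)
    print(w_hat, estimates[0], estimates[-1])
```

In this sketch the posterior median sets small observations exactly to zero and shrinks large ones, so it acts as a thresholding rule whose threshold depends on the data through the estimated weight.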

Primary Subjects: 62C12
Secondary Subjects: 62G08, 62G05
Keywords: Adaptivity; empirical Bayes; sequence estimation; sparsity; thresholding

Full-text: Open access

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1091626180
Digital Object Identifier: doi:10.1214/009053604000000030
Mathematical Reviews number (MathSciNet): MR2089135
Zentralblatt MATH identifier: 1047.62008


2009 © Institute of Mathematical Statistics