The Annals of Statistics

Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data

David Donoho and Jiashun Jin

Full-text: Open access

Abstract

We apply FDR thresholding to a non-Gaussian vector whose coordinates $X_i$, $i=1,\ldots,n$, are independent exponential with individual means $\mu_i$. The vector $\mu=(\mu_i)$ is thought to be sparse, with most coordinates 1 but a small fraction significantly larger than 1; roughly, most coordinates are simply ‘noise,’ but a small fraction contain ‘signal.’ We measure risk by per-coordinate mean-squared error in recovering $\log(\mu_i)$, and study minimax estimation over parameter spaces defined by constraints on the per-coordinate p-norm of $\log(\mu_i)$, $\frac{1}{n}\sum_{i=1}^{n}\,\log^{p}(\mu_{i})\leq \eta^{p}$.
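
In symbols, the setup described above can be sketched as follows. The names $M_{n,p}(\eta)$ and $R_n^{*}(\eta)$ are notation introduced here for illustration, and the constraint $\mu_i \ge 1$ is an assumption read off from the description of the noise and signal coordinates; neither is stated in this form in the abstract.

```latex
\[
X_i \stackrel{\text{ind}}{\sim} \mathrm{Exp}(\text{mean } \mu_i), \quad \mu_i \ge 1, \qquad
M_{n,p}(\eta) = \Bigl\{ \mu : \tfrac{1}{n}\sum_{i=1}^{n} \log^{p}(\mu_i) \le \eta^{p} \Bigr\},
\]
\[
R_n^{*}(\eta) = \inf_{\hat\mu}\ \sup_{\mu \in M_{n,p}(\eta)}\
\frac{1}{n}\sum_{i=1}^{n} \mathbb{E}\bigl(\log\hat\mu_i - \log\mu_i\bigr)^{2}.
\]
```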

We show for large n and small η that FDR thresholding can be nearly minimax. The FDR control parameter 0<q<1 plays an important role: when q≤1/2, the FDR estimator is nearly minimax, while choosing a fixed q>1/2 prevents near minimaxity.
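
To make the procedure concrete, here is a minimal sketch of FDR thresholding for exponential data. It assumes the standard Benjamini–Hochberg step-up rule applied to the null p-values $e^{-X_i}$ (the null being $\mu_i=1$), followed by a hard-threshold estimate of $\log(\mu_i)$. The function names and the choice of `log(x)` as the estimate for retained coordinates are illustrative assumptions, not necessarily the exact estimator analyzed in the paper.

```python
import math


def fdr_threshold_exponential(x, q):
    """Data-scale threshold from the Benjamini-Hochberg step-up rule.

    Under the null (mean 1), P(X >= x) = exp(-x), so the p-value of x_i
    is exp(-x_i). Returns math.inf when nothing is rejected.
    """
    n = len(x)
    # Largest observations have the smallest p-values.
    ordered = sorted(x, reverse=True)
    k_hat = 0
    for k, xk in enumerate(ordered, start=1):
        # Step-up rule: keep the largest k with p_(k) <= q * k / n.
        if math.exp(-xk) <= q * k / n:
            k_hat = k
    if k_hat == 0:
        return math.inf
    # exp(-t) = q * k_hat / n  <=>  t = log(n / (q * k_hat))
    return math.log(n / (q * k_hat))


def estimate_log_mu(x, q):
    """Hard thresholding (illustrative assumption): estimate log(mu_i) by
    log(x_i) when x_i clears the FDR threshold, and by 0 otherwise."""
    t = fdr_threshold_exponential(x, q)
    return [math.log(xi) if xi >= t else 0.0 for xi in x]


if __name__ == "__main__":
    # Mostly noise-level values, with two large coordinates acting as signal.
    data = [0.5, 1.2, 0.3, 12.0, 0.9, 15.0, 0.1, 0.7]
    print(fdr_threshold_exponential(data, q=0.25))
    print(estimate_log_mu(data, q=0.25))
```

In this toy example only the two large coordinates are retained; every other coordinate is estimated as $\log(\mu_i)=0$, that is, as pure noise. A larger q lowers the threshold and admits more coordinates.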

These conclusions mirror those found in the Gaussian case in Abramovich et al. [Ann. Statist. 34 (2006) 584–653]. The techniques developed here seem applicable to a wide range of other distributional assumptions, other loss measures and non-i.i.d. dependency structures.

Article information

Source
Ann. Statist. Volume 34, Number 6 (2006), 2980-3018.

Dates
First available in Project Euclid: 23 May 2007

Permanent link to this document
http://projecteuclid.org/euclid.aos/1179935072

Digital Object Identifier
doi:10.1214/009053606000000920

Mathematical Reviews number (MathSciNet)
MR2329475

Zentralblatt MATH identifier
1114.62010

Subjects
Primary: 62H12 (Estimation), 62C20 (Minimax procedures)
Secondary: 62G20 (Asymptotic properties), 62C10 (Bayesian problems; characterization of Bayes procedures), 62C12 (Empirical decision procedures; empirical Bayes procedures)

Keywords
Minimax decision theory; minimax Bayes estimation; mixtures of exponential model; sparsity; false discovery rate (FDR); multiple comparisons; threshold rules

Citation

Donoho, David; Jin, Jiashun. Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data. Ann. Statist. 34 (2006), no. 6, 2980--3018. doi:10.1214/009053606000000920. http://projecteuclid.org/euclid.aos/1179935072.


References

  • Abramovich, F. and Benjamini, Y. (1995). Thresholding of wavelet coefficients as multiple hypotheses testing procedure. In Wavelets and Statistics. Lecture Notes in Statist. 103 5--14. Springer, New York.
  • Abramovich, F. and Benjamini, Y. (1996). Adaptive thresholding of wavelet coefficients. Comput. Statist. Data Anal. 22 351--361.
  • Abramovich, F., Benjamini, Y., Donoho, D. and Johnstone, I. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34 584--653. MR2281879
  • Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289--300.
  • Bretagnolle, J. (1980). Statistique de Kolmogorov--Smirnov pour un échantillon non équiréparti. In Statistical and Physical Aspects of Gaussian Processes (Saint-Flour, 1980). Colloq. Internat. CNRS 307 39--44.
  • Donoho, D. and Jin, J. (2006). Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data. Technical report, Dept. Statistics, Stanford Univ. Available at arxiv.org/abs/math/0602311.
  • Donoho, D. and Johnstone, I. (1994). Minimax risk over $\ell_p$-balls for $\ell_q$-error. Probab. Theory Related Fields 99 277--303.
  • Donoho, D. and Johnstone, I. (1998). Minimax estimation via wavelet shrinkage. Ann. Statist. 26 879--921.
  • Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Statist. 27 642--669.
  • Genovese, C. and Wasserman, L. (2002). Operating characteristics and extensions of the false discovery rate procedure. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 499--517.
  • Jin, J. (2003). Detecting and estimating sparse mixtures. Ph.D. dissertation, Dept. Statistics, Stanford Univ.
  • Jin, J. (2004). False discovery rate thresholding for sparse data from a location mixture. Preprint.
  • Lehmann, E. (1953). The power of rank tests. Ann. Math. Statist. 24 23--43.
  • Lehmann, E. (1986). Testing Statistical Hypotheses, 2nd ed. Wiley, New York.
  • Massart, P. (1990). The tight constant in the Dvoretzky--Kiefer--Wolfowitz inequality. Ann. Probab. 18 1269--1283.
  • Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics. Wiley, New York.
  • Simes, R. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73 751--754.