Institute of Mathematical Statistics Collections

Posterior consistency of Dirichlet mixtures of beta densities in estimating positive false discovery rates

Subhashis Ghosal, Anindya Roy, Yongqiang Tang

Abstract

In recent years, multiple hypothesis testing has come to the forefront of statistical research, driven largely by applications in genomics and other emerging fields. The false discovery rate (FDR) and its variants provide important notions of error in this context, playing a role comparable to that of error probabilities in classical testing problems. Accurate estimation of the positive FDR (pFDR), a variant of the FDR, is essential in assessing and controlling this measure. In a recent paper, the authors proposed a model-based nonparametric Bayesian method for estimating the pFDR function. In particular, the density of p-values was modeled as a mixture of decreasing beta densities, and an appropriate Dirichlet process was used as a prior on the mixing measure. The resulting procedure was shown to work well in simulations. In this paper, we provide theoretical results in support of the beta mixture model for the density of p-values and show that, under appropriate conditions, the resulting posterior is consistent as the number of hypotheses grows to infinity.
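
As a rough illustration of the quantities named in the abstract, the sketch below evaluates a pFDR curve for a fixed, finite mixture of decreasing beta densities. It is not the authors' procedure. It assumes a finite mixture in place of a Dirichlet process mixture, restricts components to Beta(1, b) with b >= 1 so that each CDF has the closed form 1 - (1 - x)^b, treats the weight on the uniform component Beta(1, 1) as the null proportion, and uses the relation pFDR(gamma) = pi0 * gamma / F(gamma) for the rejection region [0, gamma]. The function names and the example weights are hypothetical.

```python
def mixture_cdf(gamma, weights, bs):
    """CDF at gamma of the mixture sum_k w_k * Beta(1, b_k).

    Each Beta(1, b) component has CDF 1 - (1 - gamma)**b on [0, 1].
    """
    return sum(w * (1.0 - (1.0 - gamma) ** b) for w, b in zip(weights, bs))


def pfdr(gamma, weights, bs):
    """pFDR for the rejection region [0, gamma] under the mixture model.

    Assumption: the null proportion pi0 is the total weight on the
    uniform component, i.e. the components with b == 1.
    """
    pi0 = sum(w for w, b in zip(weights, bs) if b == 1.0)
    return pi0 * gamma / mixture_cdf(gamma, weights, bs)


# Hypothetical example: 80% true nulls (uniform p-values) and 20%
# alternatives whose p-values follow the decreasing density Beta(1, 20).
weights = [0.8, 0.2]
bs = [1.0, 20.0]

for gamma in (0.01, 0.05, 0.10):
    print(f"gamma={gamma:.2f}  pFDR={pfdr(gamma, weights, bs):.4f}")
```

In a Bayesian analysis of the kind the abstract describes, the weights and shape parameters would be random with a posterior distribution, and a curve like this would be computed per posterior draw and then summarized.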

Primary Subjects: 62G05, 62G20
Secondary Subjects: 62G10
Keywords: Dirichlet process; Dirichlet mixture; multiple testing; positive false discovery rate; posterior consistency
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.imsc/1207058267
Digital Object Identifier: doi:10.1214/193940307000000077

References

[1] Bayarri, M. J. and Berger, J. O. (2000). p-values for composite null models. J. Amer. Statist. Assoc. 95 1127–1142.
Mathematical Reviews (MathSciNet): MR1804239
Zentralblatt MATH: 1004.62022
Digital Object Identifier: doi:10.2307/2669749
[2] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289–300.
Mathematical Reviews (MathSciNet): MR1325392
[3] Efron, B. and Tibshirani, R. (2002). Empirical Bayes methods and false discovery rates for microarrays. Genetic Epidemiology 23 70–86.
[4] Feller, W. (1971). An Introduction to Probability Theory and Its Applications. II. Wiley, New York.
Mathematical Reviews (MathSciNet): MR270403
[5] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1 209–230.
Mathematical Reviews (MathSciNet): MR350949
Zentralblatt MATH: 0255.62037
Digital Object Identifier: doi:10.1214/aos/1176342360
Project Euclid: euclid.aos/1176342360
[6] Ghosal, S. and van der Vaart, A. W. (2001). Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Statist. 29 1233–1263.
Mathematical Reviews (MathSciNet): MR1873329
Zentralblatt MATH: 1043.62025
Digital Object Identifier: doi:10.1214/aos/1013203453
Project Euclid: euclid.aos/1013203452
[7] Ghosal, S. and van der Vaart, A. W. (2007). Posterior convergence rates of Dirichlet mixtures at smooth densities. Ann. Statist. 35 697–723.
Mathematical Reviews (MathSciNet): MR2336864
Zentralblatt MATH: 1117.62046
Digital Object Identifier: doi:10.1214/009053606000001271
Project Euclid: euclid.aos/1183667289
[8] Ghosh, J. K. and Ramamoorthi, R. V. (2003). Bayesian Nonparametrics. Springer, New York.
Mathematical Reviews (MathSciNet): MR1992245
[9] Robins, J. M., van der Vaart, A. W. and Ventura, V. (2000). Asymptotic distribution of p-values in composite null models. J. Amer. Statist. Assoc. 95 1143–1167.
Mathematical Reviews (MathSciNet): MR1804240
Zentralblatt MATH: 1072.62522
Digital Object Identifier: doi:10.2307/2669750
[10] Sarkar, S. K. (2002). Some results on false discovery rate in multiple testing procedures. Ann. Statist. 30 239–257.
Mathematical Reviews (MathSciNet): MR1892663
Zentralblatt MATH: 1101.62349
Digital Object Identifier: doi:10.1214/aos/1015362192
Project Euclid: euclid.aos/1015362192
[11] Storey, J. D. (2002). A direct approach to false discovery rates. J. Roy. Statist. Soc. Ser. B 64 479–498.
Mathematical Reviews (MathSciNet): MR1924302
Zentralblatt MATH: 1090.62073
Digital Object Identifier: doi:10.1111/1467-9868.00346
[12] Storey, J. D. (2003). The positive false discovery rate: A Bayesian interpretation and the q-value. Ann. Statist. 31 2013–2035.
Mathematical Reviews (MathSciNet): MR2036398
Zentralblatt MATH: 1042.62026
Digital Object Identifier: doi:10.1214/aos/1074290335
Project Euclid: euclid.aos/1074290335
[13] Tang, Y., Ghosal, S. and Roy, A. (2007). Nonparametric Bayesian estimation of positive false discovery rates. Biometrics 63 1126–1134.
Mathematical Reviews (MathSciNet): MR2414590
[14] Tsai, C., Hsueh, H. and Chen, J. (2003). Estimation of false discovery rates in multiple testing: Application to gene microarray data. Biometrics 59 1071–1081.
[15] Wong, W. H. and Shen, X. (1995). Probability inequalities for likelihood ratios and convergence rates of sieved MLEs. Ann. Statist. 23 339–362.
Mathematical Reviews (MathSciNet): MR1332570
Zentralblatt MATH: 0829.62002
Digital Object Identifier: doi:10.1214/aos/1176324524
Project Euclid: euclid.aos/1176324524

2012 © Institute of Mathematical Statistics
