Electronic Journal of Statistics

The control of the false discovery rate in fixed sequence multiple testing

Gavin Lynch, Wenge Guo, Sanat K. Sarkar, and Helmut Finner

Full-text: Open access

Abstract

Controlling the false discovery rate (FDR) is a powerful approach to multiple testing. In many applications, the tested hypotheses have an inherent hierarchical structure. In this paper, we focus on the fixed sequence structure, in which the testing order of the hypotheses is strictly specified in advance. We are motivated to study this structure because it is the most basic hierarchical structure, yet it arises often in real applications such as statistical process control and streaming data analysis. We first consider a conventional fixed sequence method that stops testing once an acceptance occurs, and develop such a method that controls the FDR under both arbitrary and negative dependence. The method under arbitrary dependence is shown to be unimprovable without losing control of the FDR and, unlike existing FDR methods, it cannot be improved even by restricting to the usual positive regression dependence on subset (PRDS) condition. To account for potential mistakes in the ordering of the tests, we extend the conventional fixed sequence method to one that allows more than one, but at most a given number of, acceptances. Simulation studies show that the proposed procedures can be powerful alternatives to existing FDR controlling procedures. The proposed procedures are illustrated with a real data set from a microarray experiment.
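
The stopping rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not state the critical values, so the function takes them as an input, and the names fixed_sequence_test, critical_values and max_acceptances are hypothetical. The sketch also assumes one fixed critical value per position, whereas the critical values in the paper may depend on the testing history.

```python
from typing import List, Sequence


def fixed_sequence_test(
    p_values: Sequence[float],
    critical_values: Sequence[float],
    max_acceptances: int = 1,
) -> List[bool]:
    """Test hypotheses in their pre-specified order.

    Hypothesis i is rejected when p_values[i] <= critical_values[i].
    Testing stops once `max_acceptances` hypotheses have been accepted;
    hypotheses that are never reached are not rejected.
    The critical values are supplied by the caller (an assumption of
    this sketch; the paper derives its own FDR-controlling values).
    """
    if len(p_values) != len(critical_values):
        raise ValueError("p_values and critical_values must have equal length")
    if max_acceptances < 1:
        raise ValueError("max_acceptances must be at least 1")

    rejected = [False] * len(p_values)
    acceptances = 0
    for i, (p, c) in enumerate(zip(p_values, critical_values)):
        if p <= c:
            rejected[i] = True
        else:
            acceptances += 1
            if acceptances >= max_acceptances:
                break  # stop testing; later hypotheses stay unrejected
    return rejected


# Illustrative inputs only: a constant threshold of 0.05 is a placeholder
# and is not claimed to control the FDR.
p = [0.001, 0.02, 0.30, 0.004, 0.6]
crit = [0.05] * len(p)

conventional = fixed_sequence_test(p, crit)                   # stop at first acceptance
extended = fixed_sequence_test(p, crit, max_acceptances=2)    # tolerate one acceptance
```

With max_acceptances=1 the small fourth p-value is never tested, because testing stops at the third hypothesis. Allowing a second acceptance lets the procedure reach it, which is the kind of robustness to ordering mistakes that the abstract describes.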

Article information

Source
Electron. J. Statist., Volume 11, Number 2 (2017), 4649-4673.

Dates
Received: November 2016
First available in Project Euclid: 18 November 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1510974129

Digital Object Identifier
doi:10.1214/17-EJS1359

Mathematical Reviews number (MathSciNet)
MR3724971

Zentralblatt MATH identifier
06816628

Subjects
Primary: 62J15: Paired and multiple comparisons

Keywords
Arbitrary dependence; false discovery rate; fixed sequence multiple testing; negative association; PRDS property; $p$-values

Rights
Creative Commons Attribution 4.0 International License.

Citation

Lynch, Gavin; Guo, Wenge; Sarkar, Sanat K.; Finner, Helmut. The control of the false discovery rate in fixed sequence multiple testing. Electron. J. Statist. 11 (2017), no. 2, 4649–4673. doi:10.1214/17-EJS1359. https://projecteuclid.org/euclid.ejs/1510974129


