The Annals of Statistics

Cramér-type moderate deviations for Studentized two-sample $U$-statistics with applications

Jinyuan Chang, Qi-Man Shao, and Wen-Xin Zhou

Abstract

Two-sample $U$-statistics are widely used in a broad range of applications, including those in the fields of biostatistics and econometrics. In this paper, we establish sharp Cramér-type moderate deviation theorems for Studentized two-sample $U$-statistics in a general framework, including the two-sample $t$-statistic and Studentized Mann–Whitney test statistic as prototypical examples. In particular, a refined moderate deviation theorem with second-order accuracy is established for the two-sample $t$-statistic. These results extend the applicability of the existing statistical methodologies from the one-sample $t$-statistic to more general nonlinear statistics. Applications to two-sample large-scale multiple testing problems with false discovery rate control and the regularized bootstrap method are also discussed.
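As a rough illustration of the multiple-testing application mentioned in the abstract, the sketch below computes a two-sample $t$-statistic, converts it to a two-sided $p$-value by normal calibration, and applies the Benjamini–Hochberg step-up rule. It is a minimal sketch under stated assumptions, not the authors' procedure: the unpooled variance form of the statistic, the normal calibration, and the function names `two_sample_t`, `normal_p_value` and `benjamini_hochberg` are choices made here for illustration, since the full text is not reproduced on this page.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_t(x, y):
    """Two-sample t-statistic with an unpooled variance estimate (assumed form)."""
    n1, n2 = len(x), len(y)
    # Standard error of the difference in sample means.
    se = math.sqrt(variance(x) / n1 + variance(y) / n2)
    return (mean(x) - mean(y)) / se


def normal_p_value(t):
    """Two-sided p-value from the standard normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Track the largest rank whose p-value falls under its threshold.
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])
```

In a large-scale setting one would compute one statistic per feature, map each to a $p$-value, and pass the whole list to `benjamini_hochberg`. Moderate deviation results of the kind described in the abstract concern how accurate the normal calibration remains in the tails as the number of tests grows.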

Article information

Source
Ann. Statist., Volume 44, Number 5 (2016), 1931-1956.

Dates
Received: June 2015
First available in Project Euclid: 12 September 2016

Permanent link to this document
https://projecteuclid.org/euclid.aos/1473685264

Digital Object Identifier
doi:10.1214/15-AOS1375

Mathematical Reviews number (MathSciNet)
MR3546439

Zentralblatt MATH identifier
1272.68116

Subjects
Primary: 60F10: Large deviations; 62E17: Approximations to distributions (nonasymptotic)
Secondary: 62E20: Asymptotic distribution theory; 62F40: Bootstrap, jackknife and other resampling methods; 62H15: Hypothesis testing

Keywords
Bootstrap; false discovery rate; Mann–Whitney $U$ test; multiple hypothesis testing; self-normalized moderate deviation; Studentized statistics; two-sample $t$-statistic; two-sample $U$-statistics

Citation

Chang, Jinyuan; Shao, Qi-Man; Zhou, Wen-Xin. Cramér-type moderate deviations for Studentized two-sample $U$-statistics with applications. Ann. Statist. 44 (2016), no. 5, 1931–1956. doi:10.1214/15-AOS1375. https://projecteuclid.org/euclid.aos/1473685264


Supplemental materials

  • Supplement to “Cramér-type moderate deviations for Studentized two-sample $U$-statistics with applications”. This supplemental material contains proofs for all the theoretical results in the main text, including Theorems 2.2, 2.4, 3.1 and 3.4, and additional numerical results.