Electronic Journal of Statistics

Rank-based score tests for high-dimensional regression coefficients

Long Feng, Changliang Zou, Zhaojun Wang, and Bin Chen

Full-text: Open access

Abstract

This article is concerned with simultaneous tests on linear regression coefficients in high-dimensional settings. When the dimensionality is larger than the sample size, the classic $F$-test is not applicable since the sample covariance matrix is not invertible. Recently, [5] and [17] proposed testing procedures that exclude the inverse term from the $F$-statistic. However, the efficiency of such $F$-statistic-based methods is adversely affected by outlying observations and heavy-tailed distributions. To overcome this issue, we propose a robust score test based on rank regression. The asymptotic distributions of the proposed test statistic under the high-dimensional null and alternative hypotheses are established. Its asymptotic relative efficiency with respect to the test of [17] is closely related to that of the Wilcoxon test in comparison with the $t$-test. Simulation studies are conducted to compare the proposed procedure with other existing testing procedures and show that our procedure is generally more robust in terms of both size and power.
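
As a rough illustration of why rank-based scores resist outliers, the sketch below computes standard Wilcoxon scores $a(i) = \sqrt{12}\,(R_i/(n+1) - 1/2)$ for a small set of made-up residuals. It is not the test statistic proposed in the article: the function name `wilcoxon_scores` and the data are assumptions introduced only for this example, and the values are assumed to have no ties.

```python
import math


def wilcoxon_scores(values):
    """Return Wilcoxon scores sqrt(12) * (R_i / (n + 1) - 1/2).

    Hypothetical helper for illustration; assumes all values are distinct.
    """
    n = len(values)
    # Indices of the observations sorted from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return [math.sqrt(12) * (r / (n + 1) - 0.5) for r in ranks]


# Made-up residuals; the second list replaces the largest value by a gross outlier.
clean = [-1.2, 0.3, 0.8, -0.4, 2.1]
contaminated = [-1.2, 0.3, 0.8, -0.4, 2100.0]

# The ordering of the observations is unchanged, so the scores are identical.
print(wilcoxon_scores(clean) == wilcoxon_scores(contaminated))
```

Because the scores depend on the data only through the ranks, inflating an extreme observation leaves them unchanged, whereas a statistic built from the raw residuals would be dominated by the outlier.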

Article information

Source
Electron. J. Statist., Volume 7 (2013), 2131-2149.

Dates
First available in Project Euclid: 23 August 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1377268991

Digital Object Identifier
doi:10.1214/13-EJS839

Mathematical Reviews number (MathSciNet)
MR3104951

Zentralblatt MATH identifier
1349.62218

Subjects
Primary: 62H15: Hypothesis testing
Secondary: 62G20, 62J05

Keywords
Asymptotic normality; high-dimensional data; large $p$, small $n$; rank regression; Wilcoxon test

Citation

Feng, Long; Zou, Changliang; Wang, Zhaojun; Chen, Bin. Rank-based score tests for high-dimensional regression coefficients. Electron. J. Statist. 7 (2013), 2131–2149. doi:10.1214/13-EJS839. https://projecteuclid.org/euclid.ejs/1377268991


References

  • [1] Bai, Z. and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6, 311–329.
  • [2] Efron, B. and Tibshirani, R. (2007). On testing the significance of sets of genes. The Annals of Applied Statistics, 1, 107–129.
  • [3] Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society, Series B, 70, 849–911.
  • [4] Goeman, J., van Houwelingen, J. C., and Finos, L. (2011). Testing against a high dimensional alternative in the generalized linear model: asymptotic type I error control. Biometrika, 98, 381–390.
  • [5] Goeman, J., van de Geer, S. A., and van Houwelingen, H. C. (2006). Testing against a high dimensional alternative. Journal of the Royal Statistical Society, Series B, 68, 477–493.
  • [6] Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and Its Application. New York: Academic Press.
  • [7] Hettmansperger, T. P. and McKean, J. W. (1998). Robust Nonparametric Statistical Methods. London: Arnold.
  • [8] Li, G. R., Peng, H., Zhang, J., and Zhu, L. X. (2012). Robust rank correlation based screening. Annals of Statistics, 40, 1846–1877.
  • [9] McKean, J. W. and Hettmansperger, T. P. (1976). Tests of hypotheses of the general linear models based on ranks. Communications in Statistics-Theory and Methods, 5, 693–709.
  • [10] Meinshausen, N. (2008). Hierarchical testing of variable importance. Biometrika, 95, 265–278.
  • [11] Newton, M., Quintana, F., Den Boon, J., Sengupta, S., and Ahlquist, P. (2007). Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis. The Annals of Applied Statistics, 1, 85–106.
  • [12] Redfern, C. H., Coward, P., Degtyarev, M. Y., Lee, E. K., Kwa, A. T., Hennighausen, L., Bujard, H., Fishman, G. I., and Conklin, B. R. (1999). Conditional expression and signaling of a specifically designed Gi-coupled receptor in transgenic mice. Nature Biotechnology, 17, 165–169.
  • [13] Srivastava, M. S. (2009). A test for the mean vector with fewer observations than the dimension under non-normality. Journal of Multivariate Analysis, 100, 518–532.
  • [14] Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S., and Mesirov, J. P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences, 102, 15545–15550.
  • [15] Wang, H. (2009). Forward regression for ultra-high dimensional variable screening. Journal of the American Statistical Association, 104, 1512–1524.
  • [16] Wang, L. (2009). Wilcoxon-type generalized Bayesian information criterion. Biometrika, 96, 163–173.
  • [17] Zhong, P.-S. and Chen, S. X. (2011). Tests for high dimensional regression coefficients with factorial designs. Journal of the American Statistical Association, 106, 260–274.