## Electronic Journal of Statistics

### Rank-based score tests for high-dimensional regression coefficients

#### Abstract

This article is concerned with simultaneous tests on linear regression coefficients in high-dimensional settings. When the dimensionality is larger than the sample size, the classic $F$-test is not applicable since the sample covariance matrix is not invertible. Recently, [5] and [17] proposed testing procedures that exclude the inverse term from the $F$-statistic. However, the efficiency of such $F$-statistic-based methods is adversely affected by outlying observations and heavy-tailed distributions. To overcome this issue, we propose a robust score test based on rank regression. The asymptotic distributions of the proposed test statistic under the high-dimensional null and alternative hypotheses are established. Its asymptotic relative efficiency with respect to the test of [17] is closely related to that of the Wilcoxon test in comparison with the $t$-test. Simulation studies are conducted to compare the proposed procedure with other existing testing procedures and show that our procedure is generally more robust in both size and power.
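The non-invertibility noted in the abstract can be checked numerically. The following is a minimal sketch and is not taken from the article: it assumes a simulated Gaussian design matrix, and the helper name `sample_cov_rank` and the chosen values of `n` and `p` are purely illustrative.

```python
import numpy as np


def sample_cov_rank(n, p, seed=0):
    """Return the numerical rank of the p x p sample covariance matrix
    computed from a simulated n x p Gaussian design (illustrative only)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    Xc = X - X.mean(axis=0)        # center each column
    S = Xc.T @ Xc / (n - 1)        # p x p sample covariance matrix
    return int(np.linalg.matrix_rank(S))


n = 50
p_low, p_high = 10, 200

rank_low = sample_cov_rank(n, p_low)    # p < n: full rank is expected
rank_high = sample_cov_rank(n, p_high)  # p > n: rank cannot exceed n - 1

print("p =", p_low, "rank =", rank_low)
print("p =", p_high, "rank =", rank_high)
```

Because the centered design has at most $n - 1$ linearly independent rows, the covariance matrix in the second case has rank below $p$ and cannot be inverted, which is why the classic $F$-statistic is undefined when the dimensionality exceeds the sample size.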

#### Article information

Source
Electron. J. Statist., Volume 7 (2013), 2131–2149.

Dates
First available in Project Euclid: 23 August 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1377268991

Digital Object Identifier
doi:10.1214/13-EJS839

Mathematical Reviews number (MathSciNet)
MR3104951

Zentralblatt MATH identifier
1349.62218

Subjects
Primary: 62H15: Hypothesis testing
Secondary: 62G20, 62J05

#### Citation

Feng, Long; Zou, Changliang; Wang, Zhaojun; Chen, Bin. Rank-based score tests for high-dimensional regression coefficients. Electron. J. Statist. 7 (2013), 2131–2149. doi:10.1214/13-EJS839. https://projecteuclid.org/euclid.ejs/1377268991

#### References

• [1] Bai, Z. and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6, 311–329.
• [2] Efron, B. and Tibshirani, R. (2007). On testing the significance of sets of genes. The Annals of Applied Statistics, 1, 107–129.
• [3] Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society, Series B, 70, 849–911.
• [4] Goeman, J., van Houwelingen, J. C., and Finos, L. (2011). Testing against a high dimensional alternative in the generalized linear model: asymptotic type I error control. Biometrika, 98, 381–390.
• [5] Goeman, J., van de Geer, S. A., and van Houwelingen, H. C. (2006). Testing against a high dimensional alternative. Journal of the Royal Statistical Society, Series B, 68, 477–493.
• [6] Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and Its Application. New York: Academic Press.
• [7] Hettmansperger, T. P. and McKean, J. W. (1998). Robust Nonparametric Statistical Methods. London: Arnold.
• [8] Li, G. R., Peng, H., Zhang, J., and Zhu, L. X. (2012). Robust rank correlation based screening. Annals of Statistics, 40, 1846–1877.
• [9] McKean, J. W. and Hettmansperger, T. P. (1976). Tests of hypotheses of the general linear models based on ranks. Communications in Statistics-Theory and Methods, 5, 693–709.
• [10] Meinshausen, N. (2008). Hierarchical testing of variable importance. Biometrika, 95, 265–278.
• [11] Newton, M., Quintana, F., Den Boon, J., Sengupta, S., and Ahlquist, P. (2007). Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis. The Annals of Applied Statistics, 1, 85–106.
• [12] Redfern, C. H., Coward, P., Degtyarev, M. Y., Lee, E. K., Kwa, A. T., Hennighausen, L., Bujard, H., Fishman, G. I., and Conklin, B. R. (1999). Conditional expression and signaling of a specifically designed Gi-coupled receptor in transgenic mice. Nature Biotechnology, 17, 165–169.
• [13] Srivastava, M. S. (2009). A test for the mean vector with fewer observations than the dimension under non-normality. Journal of Multivariate Analysis, 100, 518–532.
• [14] Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S., and Mesirov, J. P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences, 102, 15545–15550.
• [15] Wang, H. (2009). Forward regression for ultra-high dimensional variable screening. Journal of the American Statistical Association, 104, 1512–1524.
• [16] Wang, L. (2009). Wilcoxon-type generalized Bayesian information criterion. Biometrika, 96, 163–173.
• [17] Zhong, P.-S. and Chen, S. X. (2011). Tests for high dimensional regression coefficients with factorial designs. Journal of the American Statistical Association, 106, 260–274.