## The Annals of Statistics

### On marginal sliced inverse regression for ultrahigh dimensional model-free feature selection

#### Abstract

Model-free variable selection has been implemented under the sufficient dimension reduction framework since the seminal paper of Cook [Ann. Statist. 32 (2004) 1062–1092]. In this paper, we extend the marginal coordinate test for sliced inverse regression (SIR) in Cook (2004) and propose a novel marginal SIR utility for the purpose of ultrahigh dimensional feature selection. Two distinct procedures, Dantzig selector and sparse precision matrix estimation, are incorporated to get two versions of sample level marginal SIR utilities. Both procedures lead to model-free variable selection consistency with predictor dimensionality $p$ diverging at an exponential rate of the sample size $n$. As a special case of marginal SIR, we ignore the correlation among the predictors and propose marginal independence SIR. Marginal independence SIR is closely related to many existing independence screening procedures in the literature, and achieves model-free screening consistency in the ultrahigh dimensional setting. The finite sample performances of the proposed procedures are studied through synthetic examples and an application to the small round blue cell tumors data.

#### Article information

Source
Ann. Statist., Volume 44, Number 6 (2016), 2594–2623.

Dates
Revised: December 2015
First available in Project Euclid: 23 November 2016

https://projecteuclid.org/euclid.aos/1479891629

Digital Object Identifier
doi:10.1214/15-AOS1424

Mathematical Reviews number (MathSciNet)
MR3576555

Zentralblatt MATH identifier
1359.62218

#### Citation

Yu, Zhou; Dong, Yuexiao; Shao, Jun. On marginal sliced inverse regression for ultrahigh dimensional model-free feature selection. Ann. Statist. 44 (2016), no. 6, 2594–2623. doi:10.1214/15-AOS1424. https://projecteuclid.org/euclid.aos/1479891629

#### References

• Bickel, P. J. and Levina, E. (2008). Regularized estimation of large covariance matrices. Ann. Statist. 36 199–227.
• Bondell, H. D. and Li, L. (2009). Shrinkage inverse regression estimation for model-free variable selection. J. R. Stat. Soc. Ser. B. Stat. Methodol. 71 287–299.
• Cai, T. and Liu, W. (2011). A direct estimation approach to sparse linear discriminant analysis. J. Amer. Statist. Assoc. 106 1566–1577.
• Cai, T., Liu, W. and Luo, X. (2011). A constrained $\ell_{1}$ minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106 594–607.
• Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. Ann. Statist. 35 2313–2351.
• Chen, X., Zou, C. and Cook, R. D. (2010). Coordinate-independent sparse sufficient dimension reduction and variable selection. Ann. Statist. 38 3696–3723.
• Cook, R. D. (1998). Regression Graphics: Ideas for Studying Regressions Through Graphics. Wiley, New York.
• Cook, R. D. (2004). Testing predictor contributions in sufficient dimension reduction. Ann. Statist. 32 1062–1092.
• Cook, R. D. and Weisberg, S. (1991). Discussion of “Sliced inverse regression for dimension reduction,” by K. Li. J. Amer. Statist. Assoc. 86 328–332.
• Cook, R. D. and Yin, X. (2001). Dimension reduction and visualization in discriminant analysis. Aust. N. Z. J. Stat. 43 147–199.
• Cook, R. D. and Zhang, X. (2014). Fused estimators of the central subspace in sufficient dimension reduction. J. Amer. Statist. Assoc. 109 815–827.
• Cui, H., Li, R. and Zhong, W. (2015). Model-free feature screening for ultrahigh dimensional discriminant analysis. J. Amer. Statist. Assoc. 110 630–641.
• Diaconis, P. and Freedman, D. (1984). Asymptotics of graphical projection pursuit. Ann. Statist. 12 793–815.
• Fan, J., Feng, Y. and Wu, Y. (2009). Network exploration via the adaptive lasso and SCAD penalties. Ann. Appl. Stat. 3 521–541.
• Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
• Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B. Stat. Methodol. 70 849–911.
• Fan, J., Samworth, R. and Wu, Y. (2009). Ultrahigh dimensional variable selection: Beyond the linear model. J. Mach. Learn. Res. 10 1829–1853.
• Fan, J. and Song, R. (2010). Sure independence screening in generalized linear models with NP-dimensionality. Ann. Statist. 38 3567–3604.
• Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
• Hall, P. and Li, K. (1993). On almost linearity of low-dimensional projections from high-dimensional data. Ann. Statist. 21 867–889.
• Huang, Q. and Zhu, Y. (2014). Model-free sure screening via maximum correlation. Available at arXiv:1403.0048.
• Jiang, B. and Liu, J. S. (2014). Variable selection for general index models via sliced inverse regression. Ann. Statist. 42 1751–1786.
• Khan, J., Wei, J. S., Ringnér, M., Saal, L. H., Ladanyi, M., Westermann, F., Berthold, F., Schwab, M., Antonescu, C. R., Peterson, C. and Meltzer, P. S. (2001). Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat. Med. 7 673–679.
• Li, K. (1991). Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86 316–342.
• Li, L. (2007). Sparse sufficient dimension reduction. Biometrika 94 603–613.
• Li, L. and Nachtsheim, C. J. (2006). Sparse sliced inverse regression. Technometrics 48 503–510.
• Li, Q. and Shao, J. (2015). Regularizing LASSO: A consistent variable selection method. Statist. Sinica 25 975–992.
• Li, B. and Wang, S. (2007). On directional regression for dimension reduction. J. Amer. Statist. Assoc. 102 997–1008.
• Li, L. and Yin, X. (2008). Sliced inverse regression with regularizations. Biometrics 64 124–131, 323.
• Li, R., Zhong, W. and Zhu, L. (2012). Feature screening via distance correlation learning. J. Amer. Statist. Assoc. 107 1129–1139.
• Li, G., Peng, H., Zhang, J. and Zhu, L. (2012). Robust rank correlation based screening. Ann. Statist. 40 1846–1877.
• Mai, Q. and Zou, H. (2013). The Kolmogorov filter for variable screening in high-dimensional binary classification. Biometrika 100 229–234.
• Mai, Q. and Zou, H. (2015). The fused Kolmogorov filter: A nonparametric model-free screening method. Ann. Statist. 43 1471–1497.
• Ni, L., Cook, R. D. and Tsai, C. (2005). A note on shrinkage sliced inverse regression. Biometrika 92 242–247.
• Pan, R., Wang, H. and Li, R. (2015). Ultrahigh dimensional multi-class linear discriminant analysis by pairwise sure independence screening. J. Amer. Statist. Assoc. 110 630–641.
• Shao, Y., Cook, R. D. and Weisberg, S. (2007). Marginal tests with sliced average variance estimation. Biometrika 94 285–296.
• Székely, G. J., Rizzo, M. L. and Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. Ann. Statist. 35 2769–2794.
• Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B. Stat. Methodol. 58 267–288.
• van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer, New York.
• Wang, H. (2009). Forward regression for ultra-high dimensional variable screening. J. Amer. Statist. Assoc. 104 1512–1524.
• Wu, Y. and Li, L. (2011). Asymptotic properties of sufficient dimension reduction with a diverging number of predictors. Statist. Sinica 21 707–730.
• Yin, X. and Hilafu, H. (2015). Sequential sufficient dimension reduction for large $p$, small $n$ problems. J. R. Stat. Soc. Ser. B. Stat. Methodol. 77 879–892.
• Yu, Z. and Dong, Y. (2016). Model-free coordinate test and variable selection via directional regression. Statist. Sinica 26 1159–1174.
• Yu, Z., Dong, Y. and Zhu, L.-X. (2016). Trace pursuit: A general framework for model-free variable selection. J. Amer. Statist. Assoc. 111 813–821.
• Yu, Z., Zhu, L., Peng, H. and Zhu, L. (2013). Dimension reduction and predictor selection in semiparametric models. Biometrika 100 641–654.
• Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B. Stat. Methodol. 68 49–67.
• Zhou, J. and He, X. (2008). Dimension reduction based on constrained canonical correlation and variable filtering. Ann. Statist. 36 1649–1668.
• Zhu, L., Li, L., Li, R. and Zhu, L. (2011). Model-free feature screening for ultrahigh-dimensional data. J. Amer. Statist. Assoc. 106 1464–1475.
• Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418–1429.