## Electronic Journal of Statistics

### A dimension reduction based approach for estimation and variable selection in partially linear single-index models with high-dimensional covariates

#### Abstract

In this paper, we formulate partially linear single-index models as bi-index dimension reduction models in order to identify significant covariates in both the linear part and the single-index part through a single combined index within a dimension reduction approach. This differs from existing dimension reduction methods in the literature, which in general identify two basis directions in the subspace spanned by the parameter vectors of interest, rather than the two parameter vectors themselves. This approach makes identification, and the subsequent estimation and variable selection, easier than with existing methods for multi-index models. When the number of parameters diverges with the sample size, we adopt a coordinate-independent sparse estimation procedure to select significant covariates and estimate the corresponding parameters. The resulting sparse dimension reduction estimators are shown to be consistent and asymptotically normal with the oracle property. Simulations are conducted to evaluate the performance of the proposed method, and a real data set is analysed for illustration.

#### Article information

Source
Electron. J. Statist., Volume 6 (2012), 2235-2273.

Dates
First available in Project Euclid: 30 November 2012

https://projecteuclid.org/euclid.ejs/1354284419

Digital Object Identifier
doi:10.1214/12-EJS744

Mathematical Reviews number (MathSciNet)
MR3020262

Zentralblatt MATH identifier
1295.62046

#### Citation

Zhang, Jun; Wang, Tao; Zhu, Lixing; Liang, Hua. A dimension reduction based approach for estimation and variable selection in partially linear single-index models with high-dimensional covariates. Electron. J. Statist. 6 (2012), 2235–2273. doi:10.1214/12-EJS744. https://projecteuclid.org/euclid.ejs/1354284419
