Electronic Journal of Statistics

Prediction in abundant high-dimensional linear regression

R. Dennis Cook, Liliana Forzani, and Adam J. Rothman

Full-text: Open access

Abstract

An abundant regression is one in which most of the predictors contribute information about the response, which is contrary to the common notion of a sparse regression where few of the predictors are relevant. We discuss asymptotic characteristics of methodology for prediction in abundant linear regressions as the sample size and number of predictors increase in various alignments. We show that some of the estimators can perform well for the purpose of prediction in abundant high-dimensional regressions.
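The keywords below mention least squares via the Moore-Penrose inverse, which remains well defined even when the number of predictors p exceeds the sample size n. The following sketch is a minimal illustration of the abundant setting, not the authors' estimator: every coefficient is nonzero, the signal is spread thinly across all p predictors (the normalization beta_j = 1/sqrt(p) is an assumption made only for this demo), and the out-of-sample prediction error of the minimum-norm least-squares fit is tracked as p grows past n.

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate_abundant(n, p, signal=1.0, noise=1.0, rng=rng):
        """Abundant regression: every predictor contributes a small effect.

        beta_j = signal / sqrt(p), so total signal strength stays bounded
        as p grows (a hypothetical normalization chosen for this demo).
        """
        beta = np.full(p, signal / np.sqrt(p))
        X = rng.standard_normal((n, p))
        y = X @ beta + noise * rng.standard_normal(n)
        return X, y, beta

    def pinv_ls_predict(X_train, y_train, X_test):
        """Least squares via the Moore-Penrose pseudoinverse.

        When p > n this is the minimum-norm least-squares fit, which is
        well defined even though X'X is singular.
        """
        beta_hat = np.linalg.pinv(X_train) @ y_train
        return X_test @ beta_hat

    n, n_test = 100, 1000
    for p in (10, 50, 200, 1000):
        X, y, beta = simulate_abundant(n, p)
        X_test = rng.standard_normal((n_test, p))
        y_test_mean = X_test @ beta                   # noiseless test targets
        pred = pinv_ls_predict(X, y, X_test)
        mse = np.mean((pred - y_test_mean) ** 2)      # excess prediction error
        print(f"p = {p:5d}: excess prediction MSE = {mse:.3f}")

In this toy setup the excess error of the minimum-norm fit stays bounded rather than exploding as p passes n, a small-scale analogue of the question the paper studies: when do estimators perform well for prediction as n and p grow together?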

Article information

Source
Electron. J. Statist. Volume 7 (2013), 3059–3088.

Dates
First available in Project Euclid: 16 December 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1387207935

Digital Object Identifier
doi:10.1214/13-EJS872

Mathematical Reviews number (MathSciNet)
MR3151762

Zentralblatt MATH identifier
1279.62140

Subjects
Primary: 62J05: Linear regression
Secondary: 62H12: Estimation

Keywords
Inverse regression; least squares; Moore-Penrose inverse; sparse covariance estimation

Citation

Cook, R. Dennis; Forzani, Liliana; Rothman, Adam J. Prediction in abundant high-dimensional linear regression. Electron. J. Statist. 7 (2013), 3059–3088. doi:10.1214/13-EJS872. https://projecteuclid.org/euclid.ejs/1387207935

