The Annals of Statistics

On asymptotically optimal confidence regions and tests for high-dimensional models

Sara van de Geer, Peter Bühlmann, Ya’acov Ritov, and Ruben Dezeure

Full-text: Open access

Abstract

We propose a general method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in a high-dimensional model. It can be easily adjusted for multiplicity taking dependence among tests into account. For linear models, our method is essentially the same as in Zhang and Zhang [J. R. Stat. Soc. Ser. B Stat. Methodol. 76 (2014) 217–242]: we analyze its asymptotic properties and establish its asymptotic optimality in terms of semiparametric efficiency. Our method naturally extends to generalized linear models with convex loss functions. We develop the corresponding theory which includes a careful analysis for Gaussian, sub-Gaussian and bounded correlated designs.

Article information

Source
Ann. Statist. Volume 42, Number 3 (2014), 1166-1202.

Dates
First available in Project Euclid: 20 June 2014

Permanent link to this document
https://projecteuclid.org/euclid.aos/1403276911

Digital Object Identifier
doi:10.1214/14-AOS1221

Mathematical Reviews number (MathSciNet)
MR3224285

Zentralblatt MATH identifier
1305.62259

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62J12: Generalized linear models; 62F25: Tolerance and confidence regions

Keywords
Central limit theorem; generalized linear model; lasso; linear model; multiple testing; semiparametric efficiency; sparsity

Citation

van de Geer, Sara; Bühlmann, Peter; Ritov, Ya’acov; Dezeure, Ruben. On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 (2014), no. 3, 1166–1202. doi:10.1214/14-AOS1221. https://projecteuclid.org/euclid.aos/1403276911.



References

  • [1] Belloni, A., Chernozhukov, V. and Hansen, C. (2014). Inference on treatment effects after selection amongst high-dimensional controls. Rev. Econ. Stud. 81 608–650.
  • [2] Belloni, A., Chernozhukov, V. and Kato, K. (2013). Uniform postselection inference for LAD regression models. Available at arXiv:1306.0282.
  • [3] Belloni, A., Chernozhukov, V. and Wang, L. (2011). Square-root lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98 791–806.
  • [4] Belloni, A., Chernozhukov, V. and Wei, Y. (2013). Honest confidence regions for logistic regression with a large number of controls. Available at arXiv:1306.3969.
  • [5] Berk, R., Brown, L., Buja, A., Zhang, K. and Zhao, L. (2013). Valid post-selection inference. Ann. Statist. 41 802–837.
  • [6] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
  • [7] Bühlmann, P. (2006). Boosting for high-dimensional linear models. Ann. Statist. 34 559–583.
  • [8] Bühlmann, P. (2013). Statistical significance in high-dimensional linear models. Bernoulli 19 1212–1242.
  • [9] Bühlmann, P., Kalisch, M. and Meier, L. (2014). High-dimensional statistics with a view toward applications in biology. Annual Review of Statistics and Its Application 1 255–278.
  • [10] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, Heidelberg.
  • [11] Bunea, F., Tsybakov, A. and Wegkamp, M. (2007). Sparsity oracle inequalities for the Lasso. Electron. J. Stat. 1 169–194.
  • [12] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. Ann. Statist. 35 2313–2351.
  • [13] Chatterjee, A. and Lahiri, S. N. (2011). Bootstrapping lasso estimators. J. Amer. Statist. Assoc. 106 608–625.
  • [14] Chatterjee, A. and Lahiri, S. N. (2013). Rates of convergence of the adaptive LASSO estimators to the oracle distribution and higher order refinements by the bootstrap. Ann. Statist. 41 1232–1259.
  • [15] Cramér, H. (1946). Mathematical Methods of Statistics. Princeton Mathematical Series 9. Princeton Univ. Press, Princeton, NJ.
  • [16] Dümbgen, L., van de Geer, S. A., Veraar, M. C. and Wellner, J. A. (2010). Nemirovski’s inequalities revisited. Amer. Math. Monthly 117 138–160.
  • [17] Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 849–911.
  • [18] Fan, J. and Lv, J. (2010). A selective overview of variable selection in high dimensional feature space. Statist. Sinica 20 101–148.
  • [19] Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
  • [20] Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularized paths for generalized linear models via coordinate descent. Journal of Statistical Software 33 1–22.
  • [21] Greenshtein, E. and Ritov, Y. (2004). Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli 10 971–988.
  • [22] Javanmard, A. and Montanari, A. (2013). Confidence intervals and hypothesis testing for high-dimensional regression. Available at arXiv:1306.3171.
  • [23] Javanmard, A. and Montanari, A. (2013). Hypothesis testing in high-dimensional regression under the Gaussian random design model: Asymptotic theory. Available at arXiv:1301.4240v1.
  • [24] Juditsky, A., Kilinç Karzan, F., Nemirovski, A. and Polyak, B. (2012). Accuracy guaranties for $\ell_1$ recovery of block-sparse signals. Ann. Statist. 40 3077–3107.
  • [25] Knight, K. and Fu, W. (2000). Asymptotics for lasso-type estimators. Ann. Statist. 28 1356–1378.
  • [26] Lederer, J. and van de Geer, S. (2014). New concentration inequalities for suprema of empirical processes. Bernoulli. To appear. Available at arXiv:1111.3486.
  • [27] Li, K.-C. (1989). Honest confidence regions for nonparametric regression. Ann. Statist. 17 1001–1008.
  • [28] Meier, L., van de Geer, S. and Bühlmann, P. (2008). The group Lasso for logistic regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 53–71.
  • [29] Meinshausen, N. (2013). Assumption-free confidence intervals for groups of variables in sparse high-dimensional regression. Available at arXiv:1309.3489.
  • [30] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436–1462.
  • [31] Meinshausen, N. and Bühlmann, P. (2010). Stability selection. J. R. Stat. Soc. Ser. B Stat. Methodol. 72 417–473.
  • [32] Meinshausen, N., Meier, L. and Bühlmann, P. (2009). $p$-values for high-dimensional regression. J. Amer. Statist. Assoc. 104 1671–1681.
  • [33] Meinshausen, N. and Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 246–270.
  • [34] Negahban, S. N., Ravikumar, P., Wainwright, M. J. and Yu, B. (2012). A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers. Statist. Sci. 27 538–557.
  • [35] Nickl, R. and van de Geer, S. (2013). Confidence sets in sparse regression. Ann. Statist. 41 2852–2876.
  • [36] Portnoy, S. (1987). A central limit theorem applicable to robust regression estimators. J. Multivariate Anal. 22 24–50.
  • [37] Pötscher, B. M. (2009). Confidence sets based on sparse estimators are necessarily large. Sankhyā 71 1–18.
  • [38] Pötscher, B. M. and Leeb, H. (2009). On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding. J. Multivariate Anal. 100 2065–2082.
  • [39] Raskutti, G., Wainwright, M. J. and Yu, B. (2010). Restricted eigenvalue properties for correlated Gaussian designs. J. Mach. Learn. Res. 11 2241–2259.
  • [40] Robinson, P. M. (1988). Root-$N$-consistent semiparametric regression. Econometrica 56 931–954.
  • [41] Shah, R. D. and Samworth, R. J. (2013). Variable selection with error control: Another look at stability selection. J. R. Stat. Soc. Ser. B Stat. Methodol. 75 55–80.
  • [42] Sun, T. and Zhang, C.-H. (2012). Scaled sparse linear regression. Biometrika 99 879–898.
  • [43] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 58 267–288.
  • [44] van de Geer, S. (2007). The deterministic Lasso. In JSM Proceedings, 2007, 140. Am. Statist. Assoc., Alexandria, VA.
  • [45] van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). Supplement to “On asymptotically optimal confidence regions and tests for high-dimensional models.” DOI:10.1214/14-AOS1221SUPP.
  • [46] van de Geer, S. and Müller, P. (2012). Quasi-likelihood and/or robust estimation in high dimensions. Statist. Sci. 27 469–480.
  • [47] van de Geer, S. A. (2008). High-dimensional generalized linear models and the lasso. Ann. Statist. 36 614–645.
  • [48] van de Geer, S. A. and Bühlmann, P. (2009). On the conditions used to prove oracle results for the Lasso. Electron. J. Stat. 3 1360–1392.
  • [49] Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using $\ell_1$-constrained quadratic programming (Lasso). IEEE Trans. Inform. Theory 55 2183–2202.
  • [50] Wasserman, L. and Roeder, K. (2009). High-dimensional variable selection. Ann. Statist. 37 2178–2201.
  • [51] Zhang, C.-H. and Huang, J. (2008). The sparsity and bias of the LASSO selection in high-dimensional linear regression. Ann. Statist. 36 1567–1594.
  • [52] Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. J. R. Stat. Soc. Ser. B Stat. Methodol. 76 217–242.
  • [53] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. J. Mach. Learn. Res. 7 2541–2563.

Supplemental materials

  • Supplementary material: Supplement to “On asymptotically optimal confidence regions and tests for high-dimensional models”. The supplemental article contains additional empirical results, as well as the proofs of Theorems 2.3 and 3.2, Lemmas 2.1 and 3.1.