## The Annals of Statistics

### Accuracy assessment for high-dimensional linear regression

#### Abstract

This paper considers point and interval estimation of the $\ell_{q}$ loss of an estimator in high-dimensional linear regression with random design. We establish the minimax rate for estimating the $\ell_{q}$ loss and the minimax expected length of confidence intervals for the $\ell_{q}$ loss of rate-optimal estimators of the regression vector, including commonly used estimators such as the Lasso, scaled Lasso, square-root Lasso and Dantzig Selector. Adaptivity of confidence intervals for the $\ell_{q}$ loss is also studied. Two settings are considered: known identity design covariance matrix with known noise level, and unknown design covariance matrix with unknown noise level. The results reveal interesting and significant differences between estimating the $\ell_{2}$ loss and the $\ell_{q}$ loss with $1\le q<2$, as well as between the two settings.

New technical tools are developed to establish rate-sharp lower bounds for the minimax estimation error and for the expected length of minimax and adaptive confidence intervals for the $\ell_{q}$ loss. A significant difference between loss estimation and traditional parameter estimation is that for loss estimation the constraint is on the performance of the estimator of the regression vector, while the lower bounds are on the difficulty of estimating its $\ell_{q}$ loss. The technical tools developed in this paper may also be of independent interest.
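As an illustration of the quantity studied in the paper, the following sketch (not from the paper; the tuning constant, the ISTA solver and all variable names are choices made here for illustration) simulates a sparse regression with identity design covariance, computes a Lasso estimate, and evaluates its $\ell_{q}$ loss $\|\hat{\beta}-\beta\|_{q}$ for $q=1,2$:

```python
# Illustrative sketch: the lq loss of a Lasso estimate in sparse
# high-dimensional regression with random Gaussian design.
import numpy as np

rng = np.random.default_rng(0)
n, p, k, sigma = 200, 500, 5, 1.0    # sample size, dimension, sparsity, noise level

beta = np.zeros(p)
beta[:k] = 2.0                        # k-sparse regression vector
X = rng.standard_normal((n, p))       # identity design covariance
y = X @ beta + sigma * rng.standard_normal(n)

# Tuning of the usual sqrt(log p / n) order; the constant 2 is a choice here.
lam = 2 * sigma * np.sqrt(np.log(p) / n)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - Xb||_2^2 + lam*||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the smooth part
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return b

beta_hat = lasso_ista(X, y, lam)

def lq_loss(est, truth, q):
    """The lq loss ||est - truth||_q whose estimation the paper studies."""
    return np.sum(np.abs(est - truth) ** q) ** (1.0 / q)

for q in (1, 2):
    print(f"l{q} loss: {lq_loss(beta_hat, beta, q):.3f}")
```

The paper's point is that these losses are random and unobserved (they depend on the unknown $\beta$), and it characterizes how well they can themselves be estimated.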

#### Article information

Source
Ann. Statist., Volume 46, Number 4 (2018), 1807–1836.

Dates
Revised: March 2017
First available in Project Euclid: 27 June 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1530086434

Digital Object Identifier
doi:10.1214/17-AOS1604

Mathematical Reviews number (MathSciNet)
MR3819118

Zentralblatt MATH identifier
06936479

Subjects
Primary: 62G15: Tolerance and confidence regions
Secondary: 62C20: Minimax procedures; 62H35: Image analysis

#### Citation

Cai, T. Tony; Guo, Zijian. Accuracy assessment for high-dimensional linear regression. Ann. Statist. 46 (2018), no. 4, 1807–1836. doi:10.1214/17-AOS1604. https://projecteuclid.org/euclid.aos/1530086434

#### References

• [1] Arias-Castro, E., Candès, E. J. and Plan, Y. (2011). Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism. Ann. Statist. 39 2533–2556.
• [2] Bayati, M. and Montanari, A. (2012). The LASSO risk for Gaussian matrices. IEEE Trans. Inform. Theory 58 1997–2017.
• [3] Belloni, A., Chernozhukov, V. and Wang, L. (2011). Square-root lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98 791–806.
• [4] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
• [5] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, Heidelberg.
• [6] Cai, T. T. and Guo, Z. (2018). Supplement to “Accuracy assessment for high-dimensional linear regression.” DOI:10.1214/17-AOS1604SUPP.
• [7] Cai, T. T. and Guo, Z. (2017). Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity. Ann. Statist. 45 615–646.
• [8] Cai, T. T., Low, M. and Ma, Z. (2014). Adaptive confidence bands for nonparametric regression functions. J. Amer. Statist. Assoc. 109 1054–1070.
• [9] Cai, T. T. and Low, M. G. (2004). An adaptation theory for nonparametric confidence intervals. Ann. Statist. 32 1805–1840.
• [10] Cai, T. T. and Low, M. G. (2006). Adaptive confidence balls. Ann. Statist. 34 202–228.
• [11] Cai, T. T. and Zhou, H. H. (2009). A data-driven block thresholding approach to wavelet estimation. Ann. Statist. 37 569–595.
• [12] Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. Ann. Statist. 35 2313–2351.
• [13] Chernozhukov, V., Hansen, C. and Spindler, M. (2015). Post-selection and post-regularization inference in linear models with many controls and instruments. Preprint. Available at arXiv:1501.03185.
• [14] Chernozhukov, V., Hansen, C. and Spindler, M. (2015). Valid post-selection and post-regularization inference: An elementary, general approach. Preprint. Available at arXiv:1501.03430.
• [15] Donoho, D. L. and Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Statist. Assoc. 90 1200–1224.
• [16] Donoho, D. L., Maleki, A. and Montanari, A. (2011). The noise-sensitivity phase transition in compressed sensing. IEEE Trans. Inform. Theory 57 6920–6941.
• [17] Guo, Z., Wang, W., Cai, T. T. and Li, H. (2016). Optimal estimation of co-heritability in high-dimensional linear models. Preprint. Available at arXiv:1605.07244.
• [18] Hoffmann, M. and Nickl, R. (2011). On adaptive inference and confidence bands. Ann. Statist. 39 2383–2409.
• [19] Ingster, Y. I., Tsybakov, A. B. and Verzelen, N. (2010). Detection boundary in sparse regression. Electron. J. Stat. 4 1476–1526.
• [20] Janson, L., Barber, R. F. and Candès, E. (2015). Eigenprism: Inference for high-dimensional signal-to-noise ratios. Preprint. Available at arXiv:1505.02097.
• [21] Li, K.-C. (1985). From Stein’s unbiased risk estimates to the method of generalized cross validation. Ann. Statist. 13 1352–1377.
• [22] Nickl, R. and van de Geer, S. (2013). Confidence sets in sparse regression. Ann. Statist. 41 2852–2876.
• [23] Raskutti, G., Wainwright, M. J. and Yu, B. (2011). Minimax rates of estimation for high-dimensional linear regression over $\ell_{q}$-balls. IEEE Trans. Inform. Theory 57 6976–6994.
• [24] Robins, J. and van der Vaart, A. (2006). Adaptive nonparametric confidence sets. Ann. Statist. 34 229–253.
• [25] Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9 1135–1151.
• [26] Sun, T. and Zhang, C.-H. (2012). Scaled sparse linear regression. Biometrika 99 879–898.
• [27] Thrampoulidis, C., Panahi, A. and Hassibi, B. (2015). Asymptotically exact error analysis for the generalized $\ell_{2}^{2}$-lasso. Preprint. Available at arXiv:1502.06287.
• [28] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
• [29] van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 1166–1202.
• [30] Verzelen, N. (2012). Minimax risks for sparse regressions: Ultra-high dimensional phenomenons. Electron. J. Stat. 6 38–90.
• [31] Ye, F. and Zhang, C.-H. (2010). Rate minimaxity of the Lasso and Dantzig selector for the $\ell_{q}$ loss in $\ell_{r}$ balls. J. Mach. Learn. Res. 11 3519–3540.
• [32] Yi, F. and Zou, H. (2013). SURE-tuned tapering estimation of large covariance matrices. Comput. Statist. Data Anal. 58 339–351.

#### Supplemental materials

• Supplement to “Accuracy assessment for high-dimensional linear regression”. We provide the remaining proofs of the theorems in the main paper. In addition, we discuss the differences between the two parameter spaces $\Theta(k)$ and $\Theta_{0}(k)$ and present the minimaxity and adaptivity lower bounds for confidence intervals over the parameter space $\Theta_{\sigma_{0}}(k,s)$.