To perform statistical inference on a group of parameters with high-dimensional data, we develop the method of estimator augmentation for the block lasso, which is defined via block norm regularization. By augmenting a block lasso estimator $\hat{\beta }$ with the subgradient $S$ of the block norm evaluated at $\hat{\beta }$, we derive a closed-form density for the joint distribution of $(\hat{\beta },S)$ under a high-dimensional setting. This allows us to draw from an estimated sampling distribution of $\hat{\beta }$, or more generally any function of $(\hat{\beta },S)$, by Monte Carlo algorithms. We demonstrate the application of estimator augmentation in group inference with the group lasso and a de-biased group lasso constructed as a function of $(\hat{\beta },S)$. Our numerical results show that importance sampling via estimator augmentation can be orders of magnitude more efficient than parametric bootstrap in estimating tail probabilities for significance tests. This work also brings new insights into the geometry of the sample space and the solution uniqueness of the block lasso. To broaden its application, we generalize our method to a scaled block lasso, which estimates the error variance simultaneously.
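The pair $(\hat{\beta },S)$ at the heart of the abstract can be illustrated numerically. The sketch below (not the paper's implementation; the data, group structure, and tuning parameter are illustrative) fits a group lasso, i.e. the block lasso with $\ell_2$ block norm, by proximal gradient descent, and then recovers the subgradient $S$ from the KKT condition $X^{\top}(y - X\hat{\beta })/n = \lambda S$, which holds at any solution.

```python
import numpy as np

# Illustrative setup: n observations, p coefficients in three groups,
# with only the first group truly active.
rng = np.random.default_rng(0)
n, p = 100, 10
groups = [list(range(0, 3)), list(range(3, 6)), list(range(6, 10))]
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[0:3] = [1.5, -1.0, 0.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)
lam = 0.2  # illustrative regularization level

# Proximal gradient (ISTA) with groupwise soft-thresholding for the
# objective (1/2n)||y - X beta||^2 + lam * sum_g ||beta_g||_2.
L = np.linalg.eigvalsh(X.T @ X / n).max()  # Lipschitz constant of the gradient
beta = np.zeros(p)
for _ in range(2000):
    z = beta - (X.T @ (X @ beta - y) / n) / L
    for g in groups:
        norm = np.linalg.norm(z[g])
        beta[g] = max(0.0, 1.0 - lam / (L * norm)) * z[g] if norm > 0 else 0.0

# KKT condition: X'(y - X beta)/n = lam * S, so S is recovered directly.
# For an active group, S_g = beta_g / ||beta_g||_2; for an inactive group,
# ||S_g||_2 <= 1.  Estimator augmentation works with this joint pair.
S = X.T @ (y - X @ beta) / (n * lam)
group_norms_S = [np.linalg.norm(S[g]) for g in groups]
```

Checking that every block of $S$ has $\ell_2$ norm at most one, with equality on active blocks, confirms the subgradient structure that the closed-form joint density of $(\hat{\beta },S)$ exploits.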
Electron. J. Statist. 11(2): 3039–3080 (2017). DOI: 10.1214/17-EJS1309