Electronic Journal of Statistics

On the post selection inference constant under restricted isometry properties

François Bachoc, Gilles Blanchard, and Pierre Neuvial


Abstract

Uniformly valid confidence intervals post model selection in regression can be constructed based on Post-Selection Inference (PoSI) constants. PoSI constants are minimal for orthogonal design matrices and, for generic design matrices, can be upper bounded as a function of the sparsity of the set of models under consideration.
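For context, the construction behind these intervals (due to Berk et al., 2013) can be sketched as follows; the notation is a standard summary, not quoted from the article. In a Gaussian regression model $Y = \mu + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2 I_n)$, design matrix $X \in \mathbb{R}^{n \times p}$, and a family $\mathcal{M}$ of candidate models, the PoSI constant $K = K(X, \mathcal{M}, \alpha)$ is the smallest value such that

\[
\mathbb{P}\left( \max_{M \in \mathcal{M}} \, \max_{j \in M} \left| \frac{\hat\beta_{j \cdot M} - \beta_{j \cdot M}}{\hat\sigma \, \big[ (X_M^\top X_M)^{-1} \big]_{jj}^{1/2}} \right| \le K \right) \ge 1 - \alpha .
\]

The intervals $\hat\beta_{j \cdot M} \pm K \, \hat\sigma \, \big[ (X_M^\top X_M)^{-1} \big]_{jj}^{1/2}$ are then simultaneously valid over all models in $\mathcal{M}$, and hence remain valid for any data-driven model selection rule.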

In order to improve on these generic sparse upper bounds, we consider design matrices satisfying a Restricted Isometry Property (RIP) condition. We provide a new upper bound on the PoSI constant in this setting. This upper bound is an explicit function of the RIP constant of the design matrix, thereby interpolating between the orthogonal setting and the generic sparse setting. We show that this upper bound is asymptotically optimal in many settings by constructing a matching lower bound.
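For reference, the RIP condition of Candes and Tao (2005) in its standard form: a design matrix $X$ with unit-norm columns satisfies the RIP of order $s$ with constant $\delta_s \in [0, 1)$ if

\[
(1 - \delta_s)\, \|v\|_2^2 \;\le\; \|X v\|_2^2 \;\le\; (1 + \delta_s)\, \|v\|_2^2
\qquad \text{for all $s$-sparse } v \in \mathbb{R}^p .
\]

When $\delta_s = 0$, the columns of $X$ act as an orthonormal system on $s$-sparse vectors, recovering the orthogonal setting; as $\delta_s$ grows, the upper bound degrades toward the generic sparse one.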

Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 3736–3757.

Dates
Received: April 2018
First available in Project Euclid: 20 November 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1542682881

Digital Object Identifier
doi:10.1214/18-EJS1490

Subjects
Primary: 62J05: Linear regression; 62J15: Paired and multiple comparisons; 62F25: Tolerance and confidence regions

Keywords
Inference post model-selection; confidence intervals; PoSI constants; linear regression; high-dimensional inference; sparsity; restricted isometry property

Rights
Creative Commons Attribution 4.0 International License.

Citation

Bachoc, François; Blanchard, Gilles; Neuvial, Pierre. On the post selection inference constant under restricted isometry properties. Electron. J. Statist. 12 (2018), no. 2, 3736–3757. doi:10.1214/18-EJS1490. https://projecteuclid.org/euclid.ejs/1542682881


