We compare several confidence intervals after model selection in the setting recently studied by Berk et al. [Ann. Statist. 41 (2013) 802–837], where the goal is to cover not the true parameter but a certain nonstandard quantity of interest that depends on the selected model. In particular, we compare the PoSI-intervals that are proposed in that reference with the “naive” confidence interval, which is constructed as if the selected model were correct and fixed a priori (thus ignoring the presence of model selection). Overall, we find that the actual coverage probabilities of all these intervals deviate only moderately from the desired nominal coverage probability. This finding is in stark contrast to several papers in the existing literature, where the goal is to cover the true parameter.
Statist. Sci. 30(2): 216–227 (May 2015). DOI: 10.1214/14-STS507
Andrews, D. W. K. and Guggenberger, P. (2009). Hybrid and size-corrected subsampling methods. Econometrica 77 721–762. MR2531360 10.3982/ECTA7015
Berk, R., Brown, L., Buja, A., Zhang, K. and Zhao, L. (2013). Valid post-selection inference. Ann. Statist. 41 802–837. MR3099122 10.1214/12-AOS1077 euclid.aos/1369836961
Brown, L. (1967). The conditional level of Student’s $t$ test. Ann. Math. Stat. 38 1068–1071. MR214210 10.1214/aoms/1177698776 euclid.aoms/1177698776
Buehler, R. J. and Feddersen, A. P. (1963). Note on a conditional property of Student’s $t$. Ann. Math. Stat. 34 1098–1100. MR150864 10.1214/aoms/1177704034 euclid.aoms/1177704034
Craven, P. and Wahba, G. (1978/79). Smoothing noisy data with spline functions. Estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 31 377–403. MR516581 10.1007/BF01404567
Dijkstra, T. K. and Veldkamp, J. H. (1988). Data-driven selection of regressors and the bootstrap. In Lecture Notes in Econom. and Math. Systems 307 17–38. Springer, New York.
Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407–499. MR2060166 10.1214/009053604000000067 euclid.aos/1083178935
Ewald, K. (2012). On the influence of model selection on confidence regions for marginal associations in the linear model. Master’s thesis, Univ. Vienna.
Kabaila, P. (1998). Valid confidence intervals in regression after variable selection. Econometric Theory 14 463–482. MR1650037 10.1017/S0266466698144031
Kabaila, P. and Leeb, H. (2006). On the large-sample minimal coverage probability of confidence intervals after model selection. J. Amer. Statist. Assoc. 101 619–629. MR2256178 10.1198/016214505000001140
Leeb, H. (2006). The distribution of a linear predictor after model selection: Unconditional finite-sample distributions and asymptotic approximations. In Optimality. Institute of Mathematical Statistics Lecture Notes—Monograph Series 49 291–311. IMS, Beachwood, OH. MR2338549 10.1214/074921706000000518
Leeb, H. (2008). Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process. Bernoulli 14 661–690. MR2537807 10.3150/08-BEJ127 euclid.bj/1219669625
Leeb, H. and Pötscher, B. M. (2003). The finite-sample distribution of post-model-selection estimators and uniform versus nonuniform approximations. Econometric Theory 19 100–142. MR1965844 10.1017/S0266466603191050
Leeb, H. and Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory 21 21–59. MR2153856 10.1017/S0266466605050036
Leeb, H. and Pötscher, B. M. (2006a). Can one estimate the conditional distribution of post-model-selection estimators? Ann. Statist. 34 2554–2591. MR2291510 10.1214/009053606000000821 euclid.aos/1169571807
Leeb, H. and Pötscher, B. M. (2006b). Performance limits for estimators of the risk or distribution of shrinkage-type estimators, and some general lower risk-bound results. Econometric Theory 22 69–97. MR2212693 10.1017/S0266466606060038
Leeb, H. and Pötscher, B. M. (2008a). Can one estimate the unconditional distribution of post-model-selection estimators? Econometric Theory 24 338–376. MR2422862
Leeb, H. and Pötscher, B. M. (2008b). Model selection. In Handbook of Financial Time Series (T. G. Andersen, R. A. Davis, J.-P. Kreiß and Th. Mikosch, eds.) 785–821. Springer, New York.
Pötscher, B. M. (1991). Effects of model selection on inference. Econometric Theory 7 163–185. MR1128410 10.1017/S0266466600004382
Pötscher, B. M. (2006). The distribution of model averaging estimators and an impossibility result regarding its estimation. In Time Series and Related Topics. Institute of Mathematical Statistics Lecture Notes—Monograph Series 52 113–129. IMS, Beachwood, OH. MR2427842
Pötscher, B. M. and Leeb, H. (2009). On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding. J. Multivariate Anal. 100 2065–2082. MR2543087 10.1016/j.jmva.2009.06.010
Pötscher, B. M. and Schneider, U. (2009). On the distribution of the adaptive LASSO estimator. J. Statist. Plann. Inference 139 2775–2790. MR2523666 10.1016/j.jspi.2009.01.003
Pötscher, B. M. and Schneider, U. (2010). Confidence sets based on penalized maximum likelihood estimators in Gaussian regression. Electron. J. Stat. 4 334–360. MR2645488 10.1214/09-EJS523 euclid.ejs/1268655653
Pötscher, B. M. and Schneider, U. (2011). Distributional results for thresholding estimators in high-dimensional Gaussian regression models. Electron. J. Stat. 5 1876–1934. MR2970179 10.1214/11-EJS659 euclid.ejs/1325264852
Sen, P. K. (1979). Asymptotic properties of maximum likelihood estimators based on conditional specification. Ann. Statist. 7 1019–1033. MR536504 10.1214/aos/1176344785 euclid.aos/1176344785
Sen, P. K. and Saleh, A. K. M. E. (1987). On preliminary test and shrinkage $M$-estimation in linear models. Ann. Statist. 15 1580–1592. MR913575 10.1214/aos/1176350611 euclid.aos/1176350611
Tukey, J. W. (1967). Discussion of “Topics in the investigation of linear relations fitted by the method of least squares” by F. J. Anscombe. J. Roy. Statist. Soc. Ser. B 29 47–48. MR212941