The Annals of Statistics

Nonpenalized variable selection in high-dimensional linear model settings via generalized fiducial inference

Jonathan P. Williams and Jan Hannig



Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design matrix is collinear. To overcome this challenge, an entirely new perspective on variable selection is presented within a generalized fiducial inference framework. This new procedure is able to effectively account for linear dependencies among subsets of covariates in a high-dimensional setting where $p$ can grow almost exponentially in $n$, as well as in the classical setting where $p\le n$. It is shown that the procedure very naturally assigns small probabilities to subsets of covariates that include redundancies, by way of explicit $L_{0}$ minimization. Furthermore, under a typical sparsity assumption, it is shown that the proposed method is consistent in the sense that the probability assigned to the true sparse subset of covariates converges in probability to 1 as $n\to\infty$, or as $n\to\infty$ and $p\to\infty$. Only mild conditions are needed, and little restriction is placed on the class of candidate subsets of covariates, to achieve this consistency result.
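The selection principle described in the abstract (score every candidate subset of covariates, then compare the probability mass each subset receives) can be illustrated with a small simulation. The sketch below is not the paper's fiducial probability formula; it substitutes a BIC-style log-score as a stand-in, and the data, dimensions, and scoring function are all illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Simulate a sparse linear model: only covariates {0, 1} are active.
n, p = 200, 5
X = rng.standard_normal((n, p))
beta = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
y = X @ beta + rng.standard_normal(n)

def rss(subset):
    """Residual sum of squares from least squares on the given columns."""
    if not subset:
        return float(y @ y)
    Xm = X[:, list(subset)]
    coef, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    resid = y - Xm @ coef
    return float(resid @ resid)

# Score every subset up to size 3 with a BIC-style criterion --
# a hypothetical stand-in for the paper's fiducial probability.
subsets = [s for k in range(4) for s in itertools.combinations(range(p), k)]
log_scores = np.array(
    [-0.5 * n * np.log(rss(s) / n) - 0.5 * len(s) * np.log(n) for s in subsets]
)

# Normalize the scores into pseudo-probabilities over candidate models.
probs = np.exp(log_scores - log_scores.max())
probs /= probs.sum()

best = subsets[int(np.argmax(probs))]
print(best)
```

Under this toy scoring rule, supersets of the true support {0, 1} that add a redundant covariate pay the size penalty without a compensating drop in residual error, so they receive small probability, which mimics the behavior the abstract attributes to explicit $L_{0}$ minimization.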

Article information

Ann. Statist., Volume 47, Number 3 (2019), 1723-1753.

Received: February 2018
First available in Project Euclid: 13 February 2019

Primary subjects: 62J05 (linear regression); 62F12 (asymptotic properties of estimators); 62A01 (foundations and philosophical topics)

Keywords: best subset selection; high-dimensional regression; $L_{0}$ minimization; feature selection


Williams, Jonathan P.; Hannig, Jan. Nonpenalized variable selection in high-dimensional linear model settings via generalized fiducial inference. Ann. Statist. 47 (2019), no. 3, 1723--1753. doi:10.1214/18-AOS1733.
