The Annals of Statistics

Spike and slab variable selection: Frequentist and Bayesian strategies

Hemant Ishwaran and J. Sunil Rao

Abstract

Variable selection in the linear regression model takes many apparent faces from both frequentist and Bayesian standpoints. In this paper we introduce a variable selection method referred to as a rescaled spike and slab model. We study the importance of prior hierarchical specifications and draw connections to frequentist generalized ridge regression estimation. Specifically, we study the usefulness of continuous bimodal priors to model hypervariance parameters, and the effect scaling has on the posterior mean through its relationship to penalization. Several model selection strategies, some frequentist and some Bayesian in nature, are developed and studied theoretically. We demonstrate the importance of selective shrinkage for effective variable selection in terms of risk misclassification, and show this is achieved using the posterior from a rescaled spike and slab model. We also show how to verify a procedure’s ability to reduce model uncertainty in finite samples using a specialized forward selection strategy. Using this tool, we illustrate the effectiveness of rescaled spike and slab models in reducing model uncertainty.
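The spike and slab idea summarized in the abstract can be sketched with a toy Gibbs sampler. The sketch below uses the classical two-point mixture formulation of George and McCulloch (1993), not the rescaled continuous-bimodal-prior model developed in this paper; the data, the spike/slab variances `v0` and `v1`, and the prior inclusion weight `w` are all illustrative assumptions. The conditional draw of the coefficients is a generalized ridge posterior, which is the frequentist connection the abstract refers to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression: n = 100 observations, p = 5 predictors, with only the
# first two true coefficients nonzero (all values here are illustrative).
n, p = 100, 5
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, -2.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.standard_normal(n)

# Two-point spike-and-slab prior on each coefficient's variance:
# a tight "spike" v0 near zero and a diffuse "slab" v1 (assumed values).
v0, v1, w = 0.001, 10.0, 0.5   # w = prior inclusion probability
sigma2 = 1.0                   # noise variance held fixed for simplicity


def normal_pdf(x, v):
    """Density of N(0, v) evaluated at x."""
    return np.exp(-x * x / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)


gamma = np.ones(p, dtype=int)  # 1 = slab (variable in), 0 = spike (out)
draws = []

for it in range(2000):
    # beta | gamma: a generalized ridge posterior with per-coordinate
    # penalties 1/v1 or 1/v0 -- the ridge connection noted in the abstract.
    D_inv = np.diag(1.0 / np.where(gamma == 1, v1, v0))
    cov = np.linalg.inv(X.T @ X / sigma2 + D_inv)
    mean = cov @ X.T @ y / sigma2
    beta = rng.multivariate_normal(mean, cov)

    # gamma_k | beta_k: Bernoulli draw from the mixture responsibilities.
    for k in range(p):
        slab = w * normal_pdf(beta[k], v1)
        spike = (1.0 - w) * normal_pdf(beta[k], v0)
        gamma[k] = int(rng.random() < slab / (slab + spike))

    if it >= 500:                # discard burn-in
        draws.append(gamma.copy())

incl = np.mean(draws, axis=0)    # posterior inclusion probabilities
print(np.round(incl, 2))
```

On this toy problem the sampler exhibits the selective shrinkage the abstract describes: the two true signals keep inclusion probabilities near one, while the null coefficients are pulled into the spike and receive low inclusion probabilities.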

Article information

Source
Ann. Statist. Volume 33, Number 2 (2005), 730-773.

Dates
First available in Project Euclid: 26 May 2005

Permanent link to this document
http://projecteuclid.org/euclid.aos/1117114335

Digital Object Identifier
doi:10.1214/009053604000001147

Mathematical Reviews number (MathSciNet)
MR2163158

Zentralblatt MATH identifier
1068.62079

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62J05: Linear regression

Keywords
Generalized ridge regression; hypervariance; model averaging; model uncertainty; ordinary least squares; penalization; rescaling; shrinkage; stochastic variable selection; Zcut

Citation

Ishwaran, Hemant; Rao, J. Sunil. Spike and slab variable selection: Frequentist and Bayesian strategies. Ann. Statist. 33 (2005), no. 2, 730--773. doi:10.1214/009053604000001147. http://projecteuclid.org/euclid.aos/1117114335.


References

  • Barbieri, M. and Berger, J. (2004). Optimal predictive model selection. Ann. Statist. 32 870--897.
  • Bickel, P. and Zhang, P. (1992). Variable selection in non-parametric regression with categorical covariates. J. Amer. Statist. Assoc. 87 90--97.
  • Breiman, L. (1992). The little bootstrap and other methods for dimensionality selection in regression: $X$-fixed prediction error. J. Amer. Statist. Assoc. 87 738--754.
  • Chipman, H. (1996). Bayesian variable selection with related predictors. Canad. J. Statist. 24 17--36.
  • Chipman, H. A., George, E. I. and McCulloch, R. E. (2001). The practical implementation of Bayesian model selection (with discussion). In Model Selection (P. Lahiri, ed.) 65--134. IMS, Beachwood, OH.
  • Clyde, M., DeSimone, H. and Parmigiani, G. (1996). Prediction via orthogonalized model mixing. J. Amer. Statist. Assoc. 91 1197--1208.
  • Clyde, M., Parmigiani, G. and Vidakovic, B. (1998). Multiple shrinkage and subset selection in wavelets. Biometrika 85 391--401.
  • Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression (with discussion). Ann. Statist. 32 407--499.
  • George, E. I. (1986). Minimax multiple shrinkage estimation. Ann. Statist. 14 188--205.
  • George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling. J. Amer. Statist. Assoc. 88 881--889.
  • Geweke, J. (1996). Variable selection and model comparison in regression. In Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 609--620. Oxford Univ. Press, New York.
  • Hoerl, A. E. (1962). Application of ridge analysis to regression problems. Chemical Engineering Progress 58 54--59.
  • Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 55--67.
  • Ishwaran, H. (2004). Discussion of “Least angle regression,” by B. Efron, T. Hastie, I. Johnstone and R. Tibshirani. Ann. Statist. 32 452--457.
  • Ishwaran, H. and Rao, J. S. (2000). Bayesian nonparametric MCMC for large variable selection problems. Unpublished manuscript.
  • Ishwaran, H. and Rao, J. S. (2003). Detecting differentially expressed genes in microarrays using Bayesian model selection. J. Amer. Statist. Assoc. 98 438--455.
  • Ishwaran, H. and Rao, J. S. (2005). Spike and slab gene selection for multigroup microarray data. J. Amer. Statist. Assoc. To appear.
  • Knight, K. and Fu, W. (2000). Asymptotics for lasso-type estimators. Ann. Statist. 28 1356--1378.
  • Kuo, L. and Mallick, B. K. (1998). Variable selection for regression models. Sankhyā Ser. B 60 65--81.
  • Le Cam, L. and Yang, G. L. (1990). Asymptotics in Statistics: Some Basic Concepts. Springer, New York.
  • Leeb, H. and Pötscher, B. M. (2003). The finite-sample distribution of post-model-selection estimators, and uniform versus non-uniform approximations. Econometric Theory 19 100--142.
  • Lempers, F. B. (1971). Posterior Probabilities of Alternative Linear Models. Rotterdam Univ. Press.
  • Mitchell, T. J. and Beauchamp, J. J. (1988). Bayesian variable selection in linear regression (with discussion). J. Amer. Statist. Assoc. 83 1023--1036.
  • Pötscher, B. M. (1991). Effects of model selection on inference. Econometric Theory 7 163--185.
  • Rao, C. R. and Wu, Y. (1989). A strongly consistent procedure for model selection in a regression problem. Biometrika 76 369--374.
  • Rao, J. S. (1999). Bootstrap choice of cost complexity for better subset selection. Statist. Sinica 9 273--287.
  • Shao, J. (1993). Linear model selection by cross-validation. J. Amer. Statist. Assoc. 88 486--494.
  • Shao, J. (1996). Bootstrap model selection. J. Amer. Statist. Assoc. 91 655--665.
  • Shao, J. (1997). An asymptotic theory for linear model selection (with discussion). Statist. Sinica 7 221--264.
  • Shao, J. and Rao, J. S. (2000). The GIC for model selection: A hypothesis testing approach. Linear models. J. Statist. Plann. Inference 88 215--231.
  • Zhang, P. (1992). On the distributional properties of model selection criteria. J. Amer. Statist. Assoc. 87 732--737.
  • Zhang, P. (1993). Model selection via multifold cross validation. Ann. Statist. 21 299--313.
  • Zheng, X. and Loh, W.-Y. (1995). Consistent variable selection in linear models. J. Amer. Statist. Assoc. 90 151--156.
  • Zheng, X. and Loh, W.-Y. (1997). A consistent variable selection criterion for linear models with high-dimensional covariates. Statist. Sinica 7 311--325.