Electronic Journal of Statistics

Selecting massive variables using an iterated conditional modes/medians algorithm

Vitara Pungpapong, Min Zhang, and Dabao Zhang

Abstract

Empirical Bayes methods are well suited to selecting massive variables, which may be inter-connected following certain hierarchical structures, because of three attributes: they take prior information on model parameters, allow data-driven hyperparameter values, and are free of tuning parameters. We propose an iterated conditional modes/medians (ICM/M) algorithm to implement empirical Bayes selection of massive variables while incorporating sparsity or more complicated a priori information. The iterated conditional modes are employed to obtain data-driven estimates of the hyperparameters, and the iterated conditional medians are used to estimate the model coefficients and thereby enable the selection of massive variables. The ICM/M algorithm is computationally fast and readily extends empirical Bayes thresholding, which is adaptive to parameter sparsity, to complex data. Empirical studies suggest competitive performance of the proposed method, even in the simple case of selecting massive regression predictors.
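To make the iteration concrete, the following is a minimal sketch in Python of one way such a modes/medians cycle can be organized for linear regression y = X beta + noise. It assumes a normal spike-and-slab prior with a fixed slab variance tau2, predictors standardized to unit squared norm, and deliberately simple conditional-mode updates for the inclusion probability w and the noise variance sigma2. The function names (posterior_median, icm_m) and these modeling choices are illustrative assumptions, not the paper's exact specification, which accommodates structured priors and fully data-driven hyperparameters.

    import numpy as np
    from scipy.stats import norm

    def posterior_median(z, s2, w, tau2):
        # Posterior median of beta given z ~ N(beta, s2) and the prior
        # beta ~ (1 - w) * delta_0 + w * N(0, tau2).  (Illustrative prior choice.)
        c = tau2 / (s2 + tau2)                        # slab shrinkage factor
        slab = w * norm.pdf(z, 0.0, np.sqrt(s2 + tau2))
        spike = (1.0 - w) * norm.pdf(z, 0.0, np.sqrt(s2))
        p = slab / (slab + spike)                     # posterior inclusion probability
        if p <= 0.5:
            return 0.0, p                             # median falls on the spike at zero
        sd = np.sqrt(c * s2)                          # posterior sd of the slab component
        m = c * abs(z) + sd * norm.ppf((p - 0.5) / p)
        return np.sign(z) * max(m, 0.0), p

    def icm_m(X, y, tau2=1.0, n_iter=50):
        # Cycle coordinate-wise posterior medians for beta with
        # conditional-mode updates for the hyperparameters (w, sigma2).
        # Assumes the columns of X are standardized to unit squared norm.
        n, p_dim = X.shape
        beta = np.zeros(p_dim)
        w, sigma2 = 0.1, float(np.var(y))             # crude starting values
        for _ in range(n_iter):
            r = y - X @ beta                          # current residual
            for j in range(p_dim):
                r += X[:, j] * beta[j]                # partial residual excluding x_j
                z = X[:, j] @ r                       # z ~ N(beta_j, sigma2)
                beta[j], _ = posterior_median(z, sigma2, w, tau2)
                r -= X[:, j] * beta[j]
            k = np.count_nonzero(beta)                # conditional modes given beta:
            w = min(max(k / p_dim, 1e-3), 1.0 - 1e-3)
            sigma2 = float(r @ r) / n
        return beta

Under these assumptions each sweep costs O(np), consistent with the claim of computational speed, and the exact zeros produced by the coordinate-wise posterior medians are what perform the variable selection.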

Article information

Source
Electron. J. Statist., Volume 9, Number 1 (2015), 1243-1266.

Dates
Received: September 2014
First available in Project Euclid: 11 June 2015

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1433982945

Digital Object Identifier
doi:10.1214/15-EJS1034

Mathematical Reviews number (MathSciNet)
MR3355757

Zentralblatt MATH identifier
1327.62409

Subjects
Primary: 62J05: Linear regression
Secondary: 62C12: Empirical decision procedures; empirical Bayes procedures. 62F07: Ranking and selection

Keywords
Empirical Bayes; variable selection; high-dimensional data; prior; sparsity

Citation

Pungpapong, Vitara; Zhang, Min; Zhang, Dabao. Selecting massive variables using an iterated conditional modes/medians algorithm. Electron. J. Statist. 9 (2015), no. 1, 1243--1266. doi:10.1214/15-EJS1034. https://projecteuclid.org/euclid.ejs/1433982945

