Statistical Science
- Statist. Sci.
- Volume 26, Number 1 (2011), 130-149.
Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies
Terrance Savitsky, Marina Vannucci, and Naijun Sha
Abstract
This paper presents a unified treatment of Gaussian process models that extends to data from the exponential dispersion family and to survival data. Our specific interest is in the analysis of data sets whose predictors have a priori unknown, possibly nonlinear associations with the response. The modeling approach we describe incorporates Gaussian processes in a generalized linear model framework to obtain a class of nonparametric regression models in which the covariance matrix depends on the predictors. We consider, in particular, continuous, categorical and count responses, and we also consider models that account for survival outcomes. We explore alternative covariance formulations for the Gaussian process prior and demonstrate the flexibility of the construction. Next, we focus on the important problem of selecting variables from the set of possible predictors and describe a general framework that employs mixture priors. We compare alternative MCMC strategies for posterior inference and arrive at a computationally efficient and practical approach. We demonstrate performance on simulated and benchmark data sets.
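The construction described in the abstract — a GP covariance that depends on the predictors, with per-predictor inclusion indicators driven by mixture (spike-and-slab) priors — can be illustrated with a minimal sketch. This is not the authors' code: the single-parameter exponential covariance below follows the general form used in this literature (e.g., Linkletter et al., 2006), and the function names, the specific parameterization, and the fixed noise variance are all illustrative assumptions.

```python
import numpy as np

def selection_covariance(X, rho, gamma, jitter=1e-6):
    """Exponential covariance with variable selection:
    C(x, x') = prod_k rho_k^(gamma_k * (x_k - x'_k)^2),  rho_k in (0, 1).
    A predictor with gamma_k = 0 drops out of the covariance entirely,
    which is how mixture priors on the covariance parameters encode
    variable selection. (Illustrative parameterization.)"""
    d2 = (X[:, None, :] - X[None, :, :]) ** 2   # pairwise squared distances, per predictor
    log_rho = np.log(rho) * gamma               # gamma_k = 0 zeroes predictor k's exponent
    K = np.exp(np.tensordot(d2, log_rho, axes=([2], [0])))
    return K + jitter * np.eye(X.shape[0])

def gp_marginal_loglik(y, K, sigma2=1.0):
    """Gaussian marginal log-likelihood of y ~ N(0, K + sigma2 * I),
    the quantity an MCMC sampler over gamma would evaluate at each move."""
    n = len(y)
    L = np.linalg.cholesky(K + sigma2 * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

# Toy illustration: the response depends only on the first predictor,
# so including it raises the marginal likelihood.
rng = np.random.default_rng(0)
X = rng.random((30, 3))
y = np.sin(4 * X[:, 0])
rho = np.full(3, 0.5)
ll_in = gp_marginal_loglik(y, selection_covariance(X, rho, np.array([1, 0, 0])), sigma2=0.01)
ll_out = gp_marginal_loglik(y, selection_covariance(X, rho, np.array([0, 0, 0])), sigma2=0.01)
```

In a full sampler, Metropolis-Hastings add/delete/swap moves on the binary vector gamma would compare such marginal likelihoods (times the mixture prior) to decide which predictors enter the covariance.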
Article information
Source
Statist. Sci. Volume 26, Number 1 (2011), 130-149.
Dates
First available in Project Euclid: 9 June 2011
Permanent link to this document
http://projecteuclid.org/euclid.ss/1307626570
Digital Object Identifier
doi:10.1214/11-STS354
Mathematical Reviews number (MathSciNet)
MR2849913
Zentralblatt MATH identifier
1222.65017
Keywords
Bayesian variable selection; generalized linear models; Gaussian processes; latent variables; MCMC; nonparametric regression; survival data
Citation
Savitsky, Terrance; Vannucci, Marina; Sha, Naijun. Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies. Statist. Sci. 26 (2011), no. 1, 130--149. doi:10.1214/11-STS354. http://projecteuclid.org/euclid.ss/1307626570.

More like this
- Variable selection for BART: An application to gene regulation. Bleich, Justin; Kapelner, Adam; George, Edward I.; Jensen, Shane T. The Annals of Applied Statistics, 2014.
- Bayesian Nonparametric Tests via Sliced Inverse Modeling. Jiang, Bo; Ye, Chao; Liu, Jun S. Bayesian Analysis, 2017.
- Asymptotically optimal model selection method with right censored outcomes. Keles, Sündüz; Van Der Laan, Mark; Dudoit, Sandrine. Bernoulli, 2004.
- Bayesian density regression with logistic Gaussian process and subspace projection. Ghosh, Jayanta K.; Tokdar, Surya T.; Zhu, Yu M. Bayesian Analysis, 2010.
- Adaptive Bayesian density regression for high-dimensional data. Shen, Weining; Ghosal, Subhashis. Bernoulli, 2016.
- "Preconditioning" for feature selection and regression in high-dimensional problems. Paul, Debashis; Bair, Eric; Hastie, Trevor; Tibshirani, Robert. The Annals of Statistics, 2008.
- A Fully Nonparametric Modeling Approach to Binary Regression. DeYoreo, Maria; Kottas, Athanasios. Bayesian Analysis, 2015.
- BART: Bayesian additive regression trees. Chipman, Hugh A.; George, Edward I.; McCulloch, Robert E. The Annals of Applied Statistics, 2010.
- Recursive partitioning and multi-scale modeling on conditional densities. Ma, Li. Electronic Journal of Statistics, 2017.
- A nonparametric Bayesian technique for high-dimensional regression. Guha, Subharup; Baladandayuthapani, Veerabhadran. Electronic Journal of Statistics, 2016.
