Open Access
Bayesian shrinkage methods for partially observed data with many predictors
Philip S. Boonstra, Bhramar Mukherjee, Jeremy M. G. Taylor
Ann. Appl. Stat. 7(4): 2272-2292 (December 2013). DOI: 10.1214/13-AOAS668

Abstract

Motivated by the increasing use of and rapid changes in array technologies, we consider the prediction problem of fitting a linear regression relating a continuous outcome $Y$ to a large number of covariates $\mathbf{X}$, for example, measurements from current, state-of-the-art technology. For most of the samples, only the outcome $Y$ and surrogate covariates, $\mathbf{W}$, are available. These surrogates may be data from prior studies using older technologies. Owing to the dimension of the problem and the large fraction of missing information, a critical issue is appropriate shrinkage of model parameters for an optimal bias-variance trade-off. We discuss a variety of fully Bayesian and Empirical Bayes algorithms which account for uncertainty in the missing data and adaptively shrink parameter estimates for superior prediction. These methods are evaluated via a comprehensive simulation study. In addition, we apply our methods to a lung cancer data set, predicting survival time ($Y$) using qRT-PCR ($\mathbf{X}$) and microarray ($\mathbf{W}$) measurements.
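To make the bias-variance role of shrinkage concrete, the following is a minimal sketch using ordinary ridge regression on simulated, fully observed data. It is not one of the Bayesian or Empirical Bayes algorithms of the paper and it does not model the missing $\mathbf{X}$ or the surrogates $\mathbf{W}$. The function name `ridge_fit`, the sample sizes, and the penalty values are all assumptions chosen for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam * I)^{-1} X'y, no intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Simulated data (hypothetical sizes): few samples relative to predictors.
rng = np.random.default_rng(0)
n, p = 30, 20
beta_true = rng.normal(scale=0.5, size=p)
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(size=n)

# A larger penalty pulls the estimated coefficients more strongly toward zero.
lams = [0.0, 1.0, 10.0, 100.0]
norms = [float(np.linalg.norm(ridge_fit(X, y, lam))) for lam in lams]
for lam, nb in zip(lams, norms):
    print(f"lambda={lam:6.1f}  ||beta_hat||={nb:.3f}")
```

With `lam = 0` the estimate coincides with least squares. Increasing `lam` trades additional bias for lower variance, which is the trade-off the abstract refers to when many predictors are estimated from limited data.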

Citation

Philip S. Boonstra, Bhramar Mukherjee, Jeremy M. G. Taylor. "Bayesian shrinkage methods for partially observed data with many predictors." Ann. Appl. Stat. 7(4): 2272-2292, December 2013. https://doi.org/10.1214/13-AOAS668

Information

Published: December 2013
First available in Project Euclid: 23 December 2013

zbMATH: 1283.62049
MathSciNet: MR3161722
Digital Object Identifier: 10.1214/13-AOAS668

Keywords: High-dimensional data, Markov chain Monte Carlo, measurement error, missing data, shrinkage

Rights: Copyright © 2013 Institute of Mathematical Statistics
