Institute of Mathematical Statistics Collections

Bayesian prediction with adaptive ridge estimators

David G.T. Denison and Edward I. George


The Bayesian linear model framework has become an increasingly popular building block in regression problems. It has been shown to produce models with good predictive power and can be used with basis functions that are nonlinear in the data to provide flexible estimated regression functions. Further, model uncertainty can be accounted for by Bayesian model averaging. We propose a simpler way to account for model uncertainty that is based on generalized ridge regression estimators. This is shown to predict well and to be much more computationally efficient than standard model averaging methods. Further, we demonstrate how to efficiently mix over different sets of basis functions, letting the data determine which are most appropriate for the problem at hand.
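As a rough illustration of the generalized ridge idea the abstract refers to, the sketch below computes the textbook generalized ridge estimator, (X'X + K)^{-1} X'y with K = diag(k_1, ..., k_p), in the special case where the columns of X are mutually orthogonal. It is not the authors' adaptive estimator. The function name `generalized_ridge_orthogonal`, the toy data, and the choice of penalties are assumptions made only for this example.

```python
def generalized_ridge_orthogonal(X, y, k):
    """Generalized ridge coefficients, assuming the columns of X are orthogonal.

    With orthogonal columns, X'X is diagonal with entries
    lam[j] = sum_i X[i][j]**2, so (X'X + diag(k))^{-1} X'y reduces to a
    separate formula for each coefficient.
    """
    n, p = len(X), len(X[0])
    coefs = []
    for j in range(p):
        col = [X[i][j] for i in range(n)]
        lam = sum(v * v for v in col)             # j-th diagonal entry of X'X
        xty = sum(v * t for v, t in zip(col, y))  # j-th entry of X'y
        coefs.append(xty / (lam + k[j]))          # k[j] = 0 gives least squares
    return coefs


# Made-up data: four observations, three mutually orthogonal columns.
X = [[1, 1, 1],
     [1, -1, 1],
     [1, 1, -1],
     [1, -1, -1]]
y = [3.0, 1.0, 2.0, 0.0]

# No penalty on any coefficient: ordinary least squares.
ols = generalized_ridge_orthogonal(X, y, k=[0.0, 0.0, 0.0])

# A different penalty per coefficient: each one is shrunk by its own factor
# lam / (lam + k[j]); the first column is left unpenalized.
shrunk = generalized_ridge_orthogonal(X, y, k=[0.0, 4.0, 12.0])
```

Allowing a separate penalty for each coefficient is what distinguishes generalized ridge regression from ordinary ridge regression, which uses a single common penalty. How the penalties are chosen from the data is the subject of the chapter and is not attempted here.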

Chapter information

Dominique Fourdrinier, Éric Marchand and Andrew L. Rukhin, eds., Contemporary Developments in Bayesian Analysis and Statistical Decision Theory: A Festschrift for William E. Strawderman (Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2012), 215-234

First available in Project Euclid: 14 March 2012

Primary: 62F15: Bayesian inference; 62J05: Linear regression

Keywords: Bayesian model averaging; generalized ridge regression; prediction; regression splines; shrinkage

Copyright © 2012, Institute of Mathematical Statistics


Denison, David G.T.; George, Edward I. Bayesian prediction with adaptive ridge estimators. Contemporary Developments in Bayesian Analysis and Statistical Decision Theory: A Festschrift for William E. Strawderman, 215--234, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2012. doi:10.1214/11-IMSCOLL815.

