Electronic Journal of Statistics

Adaptive-modal Bayesian nonparametric regression

George Karabatsos and Stephen G. Walker



We introduce a novel, Bayesian nonparametric, infinite-mixture regression model. The model has unimodal kernel (component) densities, and has covariate-dependent mixture weights that are defined by an infinite ordered-category probit regression. Based on these mixture weights, the regression model predicts a probability density that becomes increasingly unimodal as the explanatory power of the covariate (vector) increases, and increasingly multimodal as this explanatory power decreases, while allowing the explanatory power to vary from one covariate (vector) value to another. The model is illustrated and compared against many other regression models in terms of predictive performance, through the analysis of many real and simulated data sets.
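
To make the role of the mixture weights concrete, here is a minimal sketch, not the authors' implementation. It assumes (the abstract does not give the formula) that the weight on component j is the probability that a normal latent variable with covariate-dependent mean eta and standard deviation sigma falls in the interval (j - 1, j], as in an ordered-category probit model over the integers. The names `normal_cdf`, `mixture_weights`, `eta`, `sigma`, and the truncation bounds `j_min` and `j_max` are hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mixture_weights(eta, sigma, j_min=-20, j_max=20):
    """Ordered-category probit weights on integer indices j_min..j_max.

    Assumed form: w_j = Phi((j - eta) / sigma) - Phi((j - 1 - eta) / sigma).
    The infinite index set is truncated to a finite range for illustration.
    """
    return {
        j: normal_cdf((j - eta) / sigma) - normal_cdf((j - 1 - eta) / sigma)
        for j in range(j_min, j_max + 1)
    }


# Small sigma (covariate highly informative): one component dominates,
# so the predictive density is close to a single unimodal kernel.
sharp = mixture_weights(eta=0.5, sigma=0.05)

# Large sigma (covariate weakly informative): weight spreads over many
# components, which allows a multimodal predictive density.
diffuse = mixture_weights(eta=0.5, sigma=3.0)

print(max(sharp.values()), max(diffuse.values()))
```

Under this assumed form, letting sigma depend on the covariate is what would allow the explanatory power to vary from one covariate value to another.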

Article information

Electron. J. Statist., Volume 6 (2012), 2038-2068.

First available in Project Euclid: 2 November 2012

Permanent link to this document: https://projecteuclid.org/euclid.ejs/1351865117

Digital Object Identifier: doi:10.1214/12-EJS738

Keywords: Bayesian inference; nonparametric regression; unimodal distribution; binary regression


Karabatsos, George; Walker, Stephen G. Adaptive-modal Bayesian nonparametric regression. Electron. J. Statist. 6 (2012), 2038–2068. doi:10.1214/12-EJS738. https://projecteuclid.org/euclid.ejs/1351865117


