Bayesian Analysis

Dependent Species Sampling Models for Spatial Density Estimation

Seongil Jo, Jaeyong Lee, Peter Müller, Fernando A. Quintana, and Lorenzo Trippa

Full-text: Open access

Abstract

We consider a novel Bayesian nonparametric model for density estimation with an underlying spatial structure. The model is built on a class of species sampling models, which are discrete random probability measures that can be represented as a mixture of random support points and random weights. Specifically, we construct a collection of spatially dependent species sampling models and propose a mixture model based on this collection. The key idea is the introduction of spatial dependence by modeling the weights through a conditional autoregressive model. We present an extensive simulation study to compare the performance of the proposed model with competitors. The proposed model compares favorably to these alternatives. We apply the method to the estimation of summer precipitation density functions using Climate Prediction Center Merged Analysis of Precipitation data over East Asia.
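The core idea in the abstract, a collection of discrete random measures that share support points while their weights vary smoothly across neighboring sites, can be illustrated with a small sketch. This is not the authors' construction: the chain-shaped neighborhood graph, the proper CAR precision matrix tau * (D - rho * W), the softmax normalization of the latent fields, the Gaussian kernel, and all function names and parameter values below are assumptions chosen only for illustration.

```python
import numpy as np

def chain_adjacency(n_sites):
    """Adjacency matrix for sites on a line (hypothetical neighborhood graph)."""
    W = np.zeros((n_sites, n_sites))
    idx = np.arange(n_sites - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W

def sample_car(W, rho, tau, n_draws, rng):
    """Draw n_draws fields from a proper CAR model with precision tau * (D - rho * W)."""
    D = np.diag(W.sum(axis=1))
    Q = tau * (D - rho * W)          # positive definite for |rho| < 1
    cov = np.linalg.inv(Q)
    cov = 0.5 * (cov + cov.T)        # remove numerical asymmetry
    return rng.multivariate_normal(np.zeros(W.shape[0]), cov, size=n_draws)

def dependent_weights(W, n_atoms, rho, tau, rng):
    """One latent CAR field per atom, normalized across atoms at every site."""
    z = sample_car(W, rho, tau, n_atoms, rng)     # shape (n_atoms, n_sites)
    z = z - z.max(axis=0, keepdims=True)          # stabilize the exponentials
    w = np.exp(z)
    return (w / w.sum(axis=0, keepdims=True)).T   # shape (n_sites, n_atoms)

def site_density(y, weights_s, atoms, sigma):
    """Mixture density at one site: sum_h w_h(s) * Normal(y | atom_h, sigma^2)."""
    kern = np.exp(-0.5 * ((y[:, None] - atoms[None, :]) / sigma) ** 2)
    kern /= sigma * np.sqrt(2.0 * np.pi)
    return kern @ weights_s

rng = np.random.default_rng(0)
W = chain_adjacency(6)
weights = dependent_weights(W, n_atoms=10, rho=0.95, tau=1.0, rng=rng)
atoms = rng.normal(0.0, 3.0, size=10)             # support points shared by all sites
grid = np.linspace(-20.0, 20.0, 801)
dens = site_density(grid, weights[0], atoms, sigma=1.0)
```

Because each latent field is spatially correlated, adjacent sites tend to place similar mass on the same atoms, so their mixture densities resemble each other more than those of distant sites.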

Article information

Source
Bayesian Anal., Volume 12, Number 2 (2017), 379-406.

Dates
First available in Project Euclid: 3 May 2016

Permanent link to this document
https://projecteuclid.org/euclid.ba/1462297334

Digital Object Identifier
doi:10.1214/16-BA1006

Mathematical Reviews number (MathSciNet)
MR3620738

Zentralblatt MATH identifier
1384.62124

Keywords
climate prediction; conditional autoregressive model; spatial density estimation; species sampling model

Rights
Creative Commons Attribution 4.0 International License.

Citation

Jo, Seongil; Lee, Jaeyong; Müller, Peter; Quintana, Fernando A.; Trippa, Lorenzo. Dependent Species Sampling Models for Spatial Density Estimation. Bayesian Anal. 12 (2017), no. 2, 379–406. doi:10.1214/16-BA1006. https://projecteuclid.org/euclid.ba/1462297334


References

  • Airoldi, E. M., Costa, T., Bassetti, F., Leisen, F., and Guindani, M. (2014). “Generalized species sampling priors with latent beta reinforcements.” Journal of the American Statistical Association, 109(508): 1466–1480.
  • Albert, J. H. and Chib, S. (1993). “Bayesian analysis of binary and polychotomous response data.” Journal of the American Statistical Association, 88(422): 669–679.
  • Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2004). Hierarchical Modeling and Analysis for Spatial Data. London: Chapman & Hall.
  • Barnston, A. G., Mason, S. J., Goddard, L., Dewitt, D. G., and Zebiak, S. E. (2003). “Multimodel Ensembling in Seasonal Climate Forecasting at IRI.” Bulletin of the American Meteorological Society, 84(12): 1783–1796.
  • Barrientos, A. F., Jara, A., and Quintana, F. A. (2012). “On the support of MacEachern’s dependent Dirichlet processes and extensions.” Bayesian Analysis, 7(2): 277–309.
  • Bassetti, F., Crimaldi, I., and Leisen, F. (2010). “Conditionally identically distributed species sampling sequences.” Advances in Applied Probability, 42(2): 433–459.
  • Besag, J. (1974). “Spatial interaction and the statistical analysis of lattice systems (with discussion).” Journal of the Royal Statistical Society, Series B, 36(2): 192–236.
  • Besag, J., York, J., and Mollié, A. (1991). “Bayesian image restoration, with two applications in spatial statistics.” Annals of the Institute of Statistical Mathematics, 43(1): 1–20.
  • Blackwell, D. and MacQueen, J. B. (1973). “Ferguson distributions via Pólya urn schemes.” The Annals of Statistics, 1: 353–355.
  • Celeux, G., Forbes, F., Robert, C. P., and Titterington, D. M. (2006). “Deviance Information Criteria for Missing Data Models.” Bayesian Analysis, 1(4): 651–674.
  • Chung, Y. and Dunson, D. B. (2011). “The local Dirichlet process.” Annals of the Institute of Statistical Mathematics, 63(1): 59–80.
  • Clayton, D. and Kaldor, J. (1987). “Empirical Bayes estimates of age-standardized relative risks for use in disease mapping.” Biometrics, 43(3): 671–681.
  • Cressie, N. A. C. (1993). Statistics for spatial data. New York: John Wiley & Sons Inc.
  • Cressie, N. A. C. and Wikle, C. K. (2011). Statistics for spatio-temporal data. Hoboken, New Jersey: John Wiley & Sons Inc.
  • De Iorio, M., Müller, P., Rosner, G. L., and MacEachern, S. N. (2004). “An ANOVA model for dependent random measures.” Journal of the American Statistical Association, 99(465): 205–215.
  • Ding, M., He, L., Dunson, D. B., and Lawrence, C. (2012). “Nonparametric Bayesian Segmentation of a Multivariate Inhomogeneous Space-Time Poisson Process.” Bayesian Analysis, 7(4): 813–840.
  • Duan, J. A., Guindani, M., and Gelfand, A. E. (2007). “Generalized Spatial Dirichlet Process Models.” Biometrika, 94(4): 809–825.
  • Dunson, D. B. and Park, J.-H. (2008). “Kernel stick-breaking processes.” Biometrika, 95(2): 307–323.
  • Ferguson, T. S. (1973). “A Bayesian analysis of some nonparametric problems.” Annals of Statistics, 1(2): 209–230.
  • Fuentes, M. and Reich, B. (2013). “Multivariate spatial nonparametric modelling via kernel processes mixing.” Statistica Sinica, 23: 75–97.
  • Geisser, S. and Eddy, W. F. (1979). “A predictive approach to model selection.” Journal of the American Statistical Association, 74(365): 153–160.
  • Gelfand, A. E., Kottas, A., and MacEachern, S. N. (2005). “Bayesian nonparametric spatial modeling with Dirichlet process mixing.” Journal of the American Statistical Association, 100(471): 1021–1035.
  • Geman, S. and Geman, D. (1984). “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 6: 721–741.
  • Ghosh, J. K. and Ramamoorthi, R. V. (2003). Bayesian nonparametrics. Springer Series in Statistics. New York: Springer-Verlag.
  • Griffin, J. E. and Steel, M. F. J. (2006). “Order-based dependent Dirichlet processes.” Journal of the American Statistical Association, 101(473): 179–194.
  • Ishwaran, H. and James, L. F. (2001). “Gibbs sampling methods for stick-breaking priors.” Journal of the American Statistical Association, 96(453): 161–173.
  • Ishwaran, H. and James, L. F. (2003). “Generalized weighted Chinese restaurant processes for species sampling mixture models.” Statistica Sinica, 13(4): 1211–1235.
  • James, L., Lijoi, A., and Prünster, I. (2009). “Posterior Analysis for Normalized Random Measures with Independent Increments.” Scandinavian Journal of Statistics, 36(1): 76–97.
  • James, L. F. (2008). “Large sample asymptotics for the two-parameter Poisson–Dirichlet process.” In Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, volume 3 of IMS Lecture Notes-Monograph Series, 187–199. Hayward, CA: Inst. Math. Statist.
  • Jang, G., Lee, J., and Lee, S. (2010). “Posterior consistency of species sampling priors.” Statistica Sinica, 20(2): 581–593.
  • Jara, A., Lesaffre, E., De Iorio, M., and Quintana, F. (2010). “Bayesian semiparametric inference for multivariate doubly-interval-censored data.” Annals of Applied Statistics, 4(4): 2126–2149.
  • Jo, S., Lee, J., Müller, P., Quintana, F. A., and Trippa, L. (2016). “Supplementary material for dependent species sampling models for spatial density estimation.” Bayesian Analysis.
  • Kaiser, M. S. and Cressie, N. (2000). “The construction of multivariate distribution from Markov random fields.” Journal of Multivariate Analysis, 73(2): 199–220.
  • Lee, D. (2011). “A comparison of conditional autoregressive models used in Bayesian disease mapping.” Spatial and Spatio-temporal Epidemiology, 2(2): 79–89.
  • Lee, J., Quintana, F., Müller, P., and Trippa, L. (2013). “Defining Predictive Probability Functions for Species Sampling Models.” Statistical Science, 28(2): 209–222.
  • Li, P., Banerjee, S., Hanson, T. A., and McBean, A. M. (2015). “Bayesian models for detecting difference boundaries in areal data.” Statistica Sinica, 25: 385–402.
  • Lijoi, A., Mena, R. H., and Prünster, I. (2005). “Hierarchical mixture modeling with normalized inverse-Gaussian priors.” Journal of the American Statistical Association, 100(472): 1278–1291.
  • Lo, A. Y. (1984). “On a class of Bayesian nonparametric estimates: I. Density estimates.” The Annals of Statistics, 12: 351–357.
  • MacEachern, S. N. (1999). “Dependent Nonparametric Processes.” In ASA Proceedings of the Section on Bayesian Statistical Science, 50–55. American Statistical Association.
  • MacEachern, S. N. (2000). “Dependent Dirichlet processes.” Technical report, Department of Statistics, Ohio State University.
  • Müller, P. and Mitra, R. (2013). “Bayesian nonparametric inference—why and how.” Bayesian Analysis, 8(2): 269–302.
  • Navarrete, C., Quintana, F. A., and Müller, P. (2008). “Some issues on nonparametric Bayesian modeling using species sampling models.” Statistical Modelling. An International Journal, 8(1): 3–21.
  • Nieto-Barajas, L. (2008). “A Markov gamma random field for modelling disease mapping data.” Statistical Modelling, 8(1): 97–114.
  • Pitman, J. (1996). “Some developments of the Blackwell–MacQueen urn scheme.” In Statistics, probability and game theory, volume 30 of IMS Lecture Notes-Monograph Series, 245–267. Hayward, CA: Inst. Math. Statist.
  • Pitman, J. and Yor, M. (1997). “The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator.” Annals of Probability, 25(2): 855–900.
  • Plummer, M. (2003). “JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling.” In Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), 1–10. Austrian Association for Statistical Computing (AASC) and the R Foundation for Statistical Computing.
  • Reich, B. J. and Fuentes, M. (2007). “A multivariate semiparametric Bayesian spatial modeling framework for hurricane surface wind fields.” Annals of Applied Statistics, 1(1): 249–264.
  • Ren, L., Du, L., Carin, L., and Dunson, D. B. (2011). “Logistic Stick-Breaking Process.” Journal of Machine Learning Research, 12: 203–239.
  • Rodríguez, A. and Dunson, D. B. (2011). “Nonparametric Bayesian models through probit stick-breaking processes.” Bayesian Analysis, 6(1): 145–177.
  • Rodríguez, A. and ter Horst, E. (2008). “Bayesian dynamic density estimation.” Bayesian Analysis, 3(2): 339–365.
  • Rue, H. and Held, L. (2005). Gaussian Markov random fields: Theory and applications. Monographs on Statistics and Applied Probability, 104. Boca Raton: Chapman & Hall.
  • Scott, S. L. (2011). “Data augmentation, frequentist estimation, and the Bayesian analysis of multinomial logit models.” Statistical Papers, 52(1): 87–109.
  • Sethuraman, J. (1994). “A constructive definition of Dirichlet priors.” Statistica Sinica, 4(2): 639–650.
  • Silverman, B. W. (1986). Density estimation for statistics and data analysis. Monographs on Statistics and Applied Probability. London: Chapman & Hall.
  • Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and van der Linde, A. (2002). “Bayesian measures of model complexity and fit (with discussion).” Journal of the Royal Statistical Society, Series B, 64(4): 583–639.
  • Sun, D., Tsutakawa, R. K., and Speckman, P. L. (1999). “Posterior distribution of hierarchical models using CAR(1) distributions.” Biometrika, 86(2): 341–350.
  • Tippett, M. K., Barnston, A. G., and Robertson, A. W. (2007). “Estimation of Seasonal Precipitation Tercile-Based Categorical Probabilities from Ensembles.” Journal of Climate, 20(10): 2210–2228.
  • Whittle, P. (1954). “On stationary processes in the plane.” Biometrika, 41: 434–449.
  • Wu, Y. and Ghosal, S. (2008). “Kullback Leibler property of kernel mixture priors in Bayesian density estimation.” Electronic Journal of Statistics, 2: 298–331.
