Modeling species abundance patterns using local environmental features is an important, current problem in ecology. The Cape Floristic Region (CFR) in South Africa is a global hot spot of diversity and endemism, and provides a rich class of species abundance data for such modeling. Here, we propose a multi-stage Bayesian hierarchical model for explaining species abundance over this region. Our model is specified at the areal level, where the CFR is divided into roughly 37,000 one-minute grid cells; species abundance is observed at some locations within some cells. The abundance values are ordinally categorized. Environmental and soil-type factors, likely to influence the abundance pattern, are included in the model. We formulate the empirical abundance pattern as a degraded version of the potential pattern, with the degradation effect accomplished in two stages. First, we adjust for land use transformation and then we adjust for measurement error, hence misclassification error, to yield the observed abundance classifications. An important point in this analysis is that only 28% of the grid cells have been sampled and that, for sampled grid cells, the number of sampled locations ranges from one to more than one hundred. Still, we are able to develop potential and transformed abundance surfaces over the entire region.
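The two-stage degradation described above can be illustrated with a minimal Python sketch. It is not the fitted model: the multiplicative land-use adjustment, the Gaussian measurement error, the cut points, and all names (`degrade`, `to_ordinal`, `untransformed_frac`, `noise_sd`) are assumptions introduced here for illustration only.

```python
import numpy as np

def degrade(potential, untransformed_frac, noise_sd, rng):
    """Apply a two-stage degradation to a continuous potential-abundance score.

    Stage 1 (assumed form): scale by the fraction of each cell that is
    still untransformed by land use.
    Stage 2 (assumed form): add Gaussian measurement error.
    """
    potential = np.asarray(potential, dtype=float)
    transformed = potential * np.asarray(untransformed_frac, dtype=float)
    observed = transformed + rng.normal(0.0, noise_sd, size=transformed.shape)
    return transformed, observed

def to_ordinal(score, cuts):
    """Map continuous scores to ordinal classes 0..len(cuts) via cut points."""
    return np.digitize(score, cuts)

rng = np.random.default_rng(0)
potential = np.array([0.2, 1.4, 2.6, 3.1])   # hypothetical cell-level scores
frac = np.array([1.0, 0.5, 0.9, 0.0])        # hypothetical untransformed fractions
cuts = np.array([0.5, 1.5, 2.5])             # hypothetical cut points

transformed, observed = degrade(potential, frac, noise_sd=0.3, rng=rng)
potential_class = to_ordinal(potential, cuts)
observed_class = to_ordinal(observed, cuts)
```

Comparing `potential_class` with `observed_class` shows how land use transformation and measurement error can each move a cell into a different abundance class, which is the sense in which measurement error induces misclassification.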
In the hierarchical framework, categorical abundance classifications are induced by continuous latent surfaces. The degradation model above is built on the latent scale. On this scale, an areal-level spatial regression model is used to model the dependence of species abundance on the environmental factors. To capture anticipated similarity in abundance pattern among neighboring regions, spatial random effects with a conditionally autoregressive (CAR) prior are specified. Model fitting is through familiar Markov chain Monte Carlo methods. Although models with CAR priors can usually be fitted efficiently, even with large data sets, our model specification and the large number of cells made run times very long, so a novel parallelized computing strategy was developed to expedite fitting. The model was run for six different species. With categorical data, display of the resultant abundance patterns is a challenge and we offer several different views. The patterns are of importance on their own, comparatively across the region and across species, with implications for species competition and, more generally, for planning and conservation.
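To make the CAR prior concrete, the following Python sketch builds a CAR precision matrix for a small rectangular grid and draws one set of spatial random effects from it. The rook (shared-edge) adjacency, the proper CAR form Q = tau (D - rho W) with |rho| < 1, and the names `grid_adjacency`, `car_precision`, and `sample_car` are illustrative assumptions, not the specification or code used in the paper.

```python
import numpy as np

def grid_adjacency(nrow, ncol):
    """Binary rook-adjacency matrix W for an nrow x ncol grid of cells."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:              # neighbor to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < nrow:              # neighbor below
                W[i, i + ncol] = W[i + ncol, i] = 1.0
    return W

def car_precision(W, tau=1.0, rho=0.95):
    """Proper CAR precision Q = tau * (D - rho * W), D = diag(neighbor counts)."""
    D = np.diag(W.sum(axis=1))
    return tau * (D - rho * W)

def sample_car(Q, rng):
    """Draw x ~ N(0, Q^{-1}) using the Cholesky factor of the precision."""
    L = np.linalg.cholesky(Q)             # Q = L L^T
    z = rng.standard_normal(Q.shape[0])
    return np.linalg.solve(L.T, z)        # Cov(x) = (L L^T)^{-1} = Q^{-1}

rng = np.random.default_rng(0)
W = grid_adjacency(3, 4)
Q = car_precision(W, tau=2.0, rho=0.9)
phi = sample_car(Q, rng)                  # one random effect per grid cell
```

Because Q is nonzero only on the diagonal and between neighboring cells, each cell's random effect depends on the rest of the grid only through its neighbors. That sparsity is what usually makes CAR models cheap to fit; a dense matrix is used here only to keep the sketch short.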
"Modeling large scale species abundance with latent spatial processes." Ann. Appl. Stat. 4 (3) 1403 - 1429, September 2010. https://doi.org/10.1214/10-AOAS335