The Annals of Applied Statistics

A new class of flexible link functions with application to species co-occurrence in cape floristic region

Xun Jiang, Dipak K. Dey, Rachel Prunier, Adam M. Wilson, and Kent E. Holsinger


Abstract

Understanding the mechanisms that allow biological species to co-occur is of great interest to ecologists. Here we investigate the factors that influence co-occurrence of members of the genus Protea in the Cape Floristic Region of southwestern Africa, a global hot spot of biodiversity. Due to the binomial nature of our response, a critical issue is to choose appropriate link functions for the regression model. In this paper we propose a new family of flexible link functions for modeling binomial response data. By introducing a power parameter into the cumulative distribution function (c.d.f.) corresponding to a symmetric link function and its mirror reflection, greater flexibility in skewness can be achieved in both positive and negative directions. Through simulated data sets and analysis of the Protea co-occurrence data, we show that the proposed link function is quite flexible and performs better against link misspecification than standard link functions.
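The construction described in the abstract can be illustrated with a minimal sketch. It assumes a logistic baseline c.d.f. and one particular parameterization: for a power parameter r in (0, 1] the baseline is raised to the power r, and for r > 1 the mirror reflection of the 1/r case is used. The function names and the rescaling of the argument by r are assumptions made for illustration; the abstract itself does not specify them.

```python
import math


def logistic_cdf(x):
    """Baseline symmetric c.d.f. (logistic), chosen here as an assumption."""
    return 1.0 / (1.0 + math.exp(-x))


def power_link_cdf(x, r, base_cdf=logistic_cdf):
    """Illustrative power-parameter c.d.f. built from a symmetric baseline.

    r == 1 recovers the baseline; 0 < r < 1 raises the baseline to the
    power r; r > 1 uses the mirror reflection of the 1/r case, so that
    skewness can be introduced in either direction.
    """
    if r <= 0:
        raise ValueError("power parameter r must be positive")
    if r <= 1:
        return base_cdf(x / r) ** r
    # Mirror reflection: F(x, r) = 1 - F(-x, 1/r)
    return 1.0 - base_cdf(-r * x) ** (1.0 / r)


if __name__ == "__main__":
    # Response probabilities at a few linear-predictor values
    for r in (0.5, 1.0, 2.0):
        probs = [round(power_link_cdf(eta, r), 4) for eta in (-2.0, 0.0, 2.0)]
        print(r, probs)
```

With r = 1 the function reduces to the ordinary logistic response curve. Values of r on either side of 1 move the probability at a zero linear predictor away from 0.5 in opposite directions, which is the sense in which a single extra parameter allows skewness in both directions.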

Article information

Source
Ann. Appl. Stat. Volume 7, Number 4 (2013), 2180-2204.

Dates
First available in Project Euclid: 23 December 2013

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1387823315

Digital Object Identifier
doi:10.1214/13-AOAS663

Mathematical Reviews number (MathSciNet)
MR3161718

Zentralblatt MATH identifier
1283.62228

Keywords
Bayesian method; community ecology; generalized linear model; MCMC; model selection; symmetric power link function

Citation

Jiang, Xun; Dey, Dipak K.; Prunier, Rachel; Wilson, Adam M.; Holsinger, Kent E. A new class of flexible link functions with application to species co-occurrence in cape floristic region. Ann. Appl. Stat. 7 (2013), no. 4, 2180–2204. doi:10.1214/13-AOAS663. https://projecteuclid.org/euclid.aoas/1387823315



