## The Annals of Applied Statistics

### A new class of flexible link functions with application to species co-occurrence in cape floristic region

#### Abstract

Understanding the mechanisms that allow biological species to co-occur is of great interest to ecologists. Here we investigate the factors that influence co-occurrence of members of the genus Protea in the Cape Floristic Region of southwestern Africa, a global hot spot of biodiversity. Due to the binomial nature of our response, a critical issue is to choose appropriate link functions for the regression model. In this paper we propose a new family of flexible link functions for modeling binomial response data. By introducing a power parameter into the cumulative distribution function (c.d.f.) corresponding to a symmetric link function and its mirror reflection, greater flexibility in skewness can be achieved in both positive and negative directions. Through simulated data sets and analysis of the Protea co-occurrence data, we show that the proposed link function is quite flexible and performs better against link misspecification than standard link functions.
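The construction described in the abstract can be illustrated with a minimal sketch. It assumes the logistic c.d.f. as the symmetric baseline and applies the power parameter in the simplest possible way, as F0(x)^r and its mirror reflection 1 - F0(-x)^r. The function names and this parametrization are illustrative assumptions; the paper's exact definition of the link family may differ (for example, in how the linear predictor is rescaled).

```python
import math


def logistic_cdf(x):
    """Baseline symmetric c.d.f. F0 (logistic), computed in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def power_cdf(x, r):
    """Hypothetical power form F0(x)**r; r = 1 recovers the baseline link."""
    if r <= 0:
        raise ValueError("power parameter r must be positive")
    return logistic_cdf(x) ** r


def reflected_power_cdf(x, r):
    """Mirror reflection 1 - F0(-x)**r, which skews in the opposite direction."""
    if r <= 0:
        raise ValueError("power parameter r must be positive")
    return 1.0 - logistic_cdf(-x) ** r


def success_probability(eta, r, reflect=False):
    """Map a linear predictor eta to a binomial success probability."""
    return reflected_power_cdf(eta, r) if reflect else power_cdf(eta, r)


if __name__ == "__main__":
    # Compare the baseline (r = 1) with both skewed variants at a few predictor values.
    for eta in (-2.0, 0.0, 2.0):
        print(
            eta,
            round(success_probability(eta, 1.0), 4),
            round(success_probability(eta, 2.0), 4),
            round(success_probability(eta, 2.0, reflect=True), 4),
        )
```

Under this sketch, at a linear predictor of zero the baseline gives probability 0.5, the power form with r = 2 gives 0.25, and its reflection gives 0.75, which shows how a single parameter together with the reflection moves the curve away from symmetry in either direction.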

#### Article information

Source
Ann. Appl. Stat. Volume 7, Number 4 (2013), 2180-2204.

Dates
First available in Project Euclid: 23 December 2013

http://projecteuclid.org/euclid.aoas/1387823315

Digital Object Identifier
doi:10.1214/13-AOAS663

Mathematical Reviews number (MathSciNet)
MR3161718

Zentralblatt MATH identifier
1283.62228

#### Citation

Jiang, Xun; Dey, Dipak K.; Prunier, Rachel; Wilson, Adam M.; Holsinger, Kent E. A new class of flexible link functions with application to species co-occurrence in cape floristic region. Ann. Appl. Stat. 7 (2013), no. 4, 2180–2204. doi:10.1214/13-AOAS663. http://projecteuclid.org/euclid.aoas/1387823315.
