Annals of Applied Statistics

Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness

Alessandro Chiarucci, Rosa Maria Di Biase, Lorenzo Fattorini, Marzia Marcheselli, and Caterina Pisani


Species lists compiled by field ecologists through purposive sampling can be used to improve the sample-based estimation of species richness. A new estimator is proposed as a modification of the difference estimator in which the species inclusion probabilities are estimated from the species frequencies in incidence data. If the species list used to support the estimation is complete, the estimator returns the true richness without error. If the list is incomplete, the estimator provides values invariably greater than the number of species detected by the sample-based and purposive surveys combined. An asymptotically conservative estimator of the mean squared error is also provided. A simulation study based on two artificial communities is carried out to assess the expected gain in accuracy and precision over the widely applied estimators that rely on sample information alone. Finally, the proposed estimator is applied to estimate species richness in the Maremma Regional Park, Italy.
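
The idea of a difference-type richness estimator supported by a species list can be illustrated with a minimal Python sketch. It is not the authors' estimator, whose formula is not given on this page. The function name `difference_richness`, the input layout, and above all the rule used to estimate the inclusion probabilities (plots treated as independent draws, so that a species found in f of n plots has estimated inclusion probability 1 - (1 - f/n)^n) are assumptions made here for illustration only.

```python
def difference_richness(frequencies, n_plots, supporting_list):
    """Illustrative difference-type estimate of species richness.

    frequencies     : dict mapping each sampled species to the number of
                      plots in which it was detected (1..n_plots)
    n_plots         : number of sampled plots
    supporting_list : iterable of species names from the purposive list

    ASSUMPTION (not from the paper): the inclusion probability of a
    species is estimated as 1 - (1 - f/n)^n, i.e. plots are treated as
    independent draws with per-plot detection probability f/n.
    """
    listed = set(supporting_list)
    # Every listed species counts exactly once.
    estimate = float(len(listed))
    for species, freq in frequencies.items():
        if species in listed:
            continue  # already counted through the list
        if not 1 <= freq <= n_plots:
            raise ValueError(f"invalid frequency for {species!r}: {freq}")
        pi_hat = 1.0 - (1.0 - freq / n_plots) ** n_plots
        # A sampled, unlisted species contributes 1/pi_hat >= 1.
        estimate += 1.0 / pi_hat
    return estimate


# Hypothetical incidence data: 3 species detected in 4 plots.
freqs = {"a": 3, "b": 1, "c": 4}
partial = difference_richness(freqs, 4, {"a", "d"})
complete = difference_richness(freqs, 4, {"a", "b", "c", "d"})
```

In this sketch a complete list leaves no unlisted species in the sample, so the estimate equals the list length, and with an incomplete list each unlisted sampled species adds at least 1, so the estimate is never below the number of species detected by the two surveys together. Both properties mirror the behaviour described in the abstract, but only under the assumed probability rule.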

Article information

Ann. Appl. Stat., Volume 12, Number 3 (2018), 1679-1699.

Received: May 2017
Revised: November 2017
First available in Project Euclid: 11 September 2018

Keywords: Difference estimator; probabilistic sampling; purposive survey; supporting list; simulation


Chiarucci, Alessandro; Di Biase, Rosa Maria; Fattorini, Lorenzo; Marcheselli, Marzia; Pisani, Caterina. Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness. Ann. Appl. Stat. 12 (2018), no. 3, 1679–1699. doi:10.1214/17-AOAS1126.


Supplemental materials

  • Supplement to “Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness”. The Supplementary Material contains a table explaining the ecological meaning of some symbols adopted in the paper (Section SM1), details about the D estimator (Section SM2), and the R code for computing the ED estimator and estimating its RRMSE (Section SM3). The txt file of the incidence data and the floristic list adopted to estimate the species richness of vascular plants in the Maremma Regional Park, Italy, is also available.