The Annals of Applied Statistics

Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness

Alessandro Chiarucci, Rosa Maria Di Biase, Lorenzo Fattorini, Marzia Marcheselli, and Caterina Pisani

Species lists compiled by field ecologists through purposive sampling can be used to improve the sample-based estimation of species richness. A new estimator is proposed here as a modification of the difference estimator in which the species inclusion probabilities are estimated by means of the species frequencies from incidence data. If the species list used to support the estimation is complete, the estimator returns the true richness without error. In the case of incomplete lists, the estimator provides values invariably greater than the number of species detected by the combination of sample-based and purposive surveys. An asymptotically conservative estimator of the mean squared error is also provided. A simulation study based on two artificial communities is carried out to assess the expected increase in accuracy and precision relative to the widely applied estimators based on sample information alone. Finally, the proposed estimator is applied to estimate species richness in the Maremma Regional Park, Italy.
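As a rough illustration of the idea in the abstract, the following Python sketch applies a difference-type estimator to incidence data. It is not the authors' implementation (their R code is in the Supplementary Material). The function name `estimate_richness_sketch` and the input layout are hypothetical, and the form of the estimated inclusion probability, 1 - (1 - k/n)^n for a species found in k of n plots, is an assumption made here for illustration only; the abstract does not state the exact formula.

```python
def estimate_richness_sketch(plots, supporting_list):
    """Difference-type richness estimate (illustrative sketch only).

    plots           : list of sets, one per sampled plot, holding the
                      species detected in that plot (incidence data)
    supporting_list : iterable of species names from the purposive survey
    """
    n = len(plots)
    if n == 0:
        raise ValueError("at least one sampled plot is required")

    listed = set(supporting_list)

    # Number of plots in which each species was detected.
    counts = {}
    for plot in plots:
        for species in set(plot):
            counts[species] = counts.get(species, 0) + 1

    # Each sampled species missing from the list contributes 1 / pi_hat.
    correction = 0.0
    for species, k in counts.items():
        if species in listed:
            continue
        # ASSUMPTION: inclusion probability estimated from the species
        # frequency k/n as 1 - (1 - k/n)^n; the paper may use another form.
        pi_hat = 1.0 - (1.0 - k / n) ** n
        correction += 1.0 / pi_hat

    return len(listed) + correction


# Toy data: three plots and an incomplete supporting list.
plots = [{"a", "b"}, {"a"}, {"a", "c"}]
incomplete = estimate_richness_sketch(plots, {"a", "b", "d"})
complete = estimate_richness_sketch(plots, {"a", "b", "c", "d"})
```

Under this assumed form, every estimated inclusion probability is at most 1, so each sampled species missing from the list adds at least 1 to the list length, and the estimate cannot fall below the number of species detected by the two surveys combined. When every sampled species is already on the list, the correction term vanishes and the sketch returns the list length.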

Article information

Ann. Appl. Stat., Volume 12, Number 3 (2018), 1679-1699.

Received: May 2017
Revised: November 2017
First available in Project Euclid: 11 September 2018

Keywords: Difference estimator; probabilistic sampling; purposive survey; supporting list; simulation


Chiarucci, Alessandro; Di Biase, Rosa Maria; Fattorini, Lorenzo; Marcheselli, Marzia; Pisani, Caterina. Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness. Ann. Appl. Stat. 12 (2018), no. 3, 1679-1699. doi:10.1214/17-AOAS1126.

Supplemental materials

  • Supplement to “Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness”. The Supplementary Material contains a table explaining the ecological meaning of some symbols adopted in the paper (Section SM1), details about the D estimator (Section SM2), and the R code for computing the ED estimator and estimating its RRMSE (Section SM3). The txt file containing the incidence data and the floristic list used to estimate the species richness of vascular plants in the Maremma Regional Park, Italy, is also available.