Bayesian Analysis

Sample size calculation for finding unseen species

Hongmei Zhang and Hal Stern

Full-text: Open access

Abstract

Estimation of the number of species extant in a geographic region has been discussed in the statistical literature for more than sixty years. The focus of this work is on the use of pilot data to design future studies in this context. A Dirichlet-multinomial probability model for species frequency data is used to obtain a posterior distribution on the number of species and to learn about the distribution of species frequencies. A geometric distribution is proposed as the prior distribution for the number of species. Simulations demonstrate that this prior distribution can handle a wide range of species frequency distributions including the problematic case with many rare species and a few exceptionally abundant species. Monte Carlo methods are used along with the Dirichlet-multinomial model to perform sample size calculations from pilot data, e.g., to determine the number of additional samples required to collect a certain proportion of all the species with a pre-specified coverage probability. Simulations and real data applications are discussed.
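The sample size calculation described in the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the authors' implementation: it assumes the total number of species (`n_species`) and a symmetric Dirichlet parameter (`alpha`) are fixed and known, whereas the paper places a geometric prior on the number of species and works with posterior draws. All function and parameter names are hypothetical, and the pilot counts in the usage note are made up for illustration.

```python
import itertools
import math
import random


def draw_dirichlet(params, rng):
    """Draw one probability vector from Dirichlet(params) via normalized gammas."""
    gammas = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(gammas)
    return [g / total for g in gammas]


def extra_samples_needed(pilot_counts, n_species, alpha, target_fraction,
                         rng, max_draws=100_000):
    """Simulate one future study and return the number of additional
    individuals sampled before target_fraction of all species were seen."""
    # Assumption: n_species is known; species unseen in the pilot get count 0.
    counts = list(pilot_counts) + [0] * (n_species - len(pilot_counts))
    # Draw species frequencies from the Dirichlet given the pilot data.
    probs = draw_dirichlet([alpha + c for c in counts], rng)
    cum_weights = list(itertools.accumulate(probs))
    seen = {i for i, c in enumerate(counts) if c > 0}
    needed = math.ceil(target_fraction * n_species)
    draws = 0
    # Sample one individual at a time until enough species are observed
    # (max_draws caps runs in which an unseen species is extremely rare).
    while len(seen) < needed and draws < max_draws:
        seen.add(rng.choices(range(n_species), cum_weights=cum_weights)[0])
        draws += 1
    return draws


def sample_size_for_coverage(pilot_counts, n_species, alpha, target_fraction,
                             coverage, n_sims=500, seed=0):
    """Smallest simulated sample size that reaches the target in at least
    a `coverage` fraction of the Monte Carlo runs."""
    rng = random.Random(seed)
    results = sorted(
        extra_samples_needed(pilot_counts, n_species, alpha, target_fraction, rng)
        for _ in range(n_sims)
    )
    index = max(0, min(n_sims - 1, math.ceil(coverage * n_sims) - 1))
    return results[index]
```

For example, `sample_size_for_coverage([5, 3, 1, 1], n_species=6, alpha=1.0, target_fraction=0.8, coverage=0.9)` estimates how many additional individuals would be needed to have seen at least 80% of six assumed species in 90% of simulated studies. Treating `n_species` as uncertain, as the paper does, would mean repeating this calculation over posterior draws of the number of species rather than a single fixed value.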

Article information

Source
Bayesian Anal., Volume 4, Number 4 (2009), 763-792.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340369824

Digital Object Identifier
doi:10.1214/09-BA429

Mathematical Reviews number (MathSciNet)
MR2570088

Zentralblatt MATH identifier
1330.62419

Keywords
Generalized multinomial model; Bayesian hierarchical model; Markov Chain Monte Carlo (MCMC); Dirichlet distribution; geometric distribution

Citation

Zhang, Hongmei; Stern, Hal. Sample size calculation for finding unseen species. Bayesian Anal. 4 (2009), no. 4, 763--792. doi:10.1214/09-BA429. https://projecteuclid.org/euclid.ba/1340369824

