Open Access
May 2018 On the number of unobserved and observed categories when sampling from a multivariate hypergeometric population
Sungsu Kim, Chong Jin Park
Braz. J. Probab. Stat. 32(2): 309-319 (May 2018). DOI: 10.1214/16-BJPS344

Abstract

Consider taking a random sample of size $n$ from a finite population that consists of $N$ categories with $M_{i}$ copies in the $i$th category for $i=1,\dots,N$. Each observed unit in the sample is presumed to have a probability $1-p$ ($0<p<1$) of being lost from the sample. Let $S$ denote the number of categories not observed in the sample and $S_{j}$ the number of categories in which exactly $j$ units are observed, for $j=1,\dots,n$. In this paper, the probability distributions and factorial moments of $S$ and $S_{j}$ are studied. A matrix inversion algorithm is used to facilitate the numerical computation of these probabilities and factorial moments. Examples of the problem considered in this paper include a filing or storage process in which objects are randomly assigned to files or storage bins and, from time to time, go missing; a capture-recapture problem with species as categories; and a DNA sequence study.
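
The sampling model in the abstract can be illustrated with a short simulation. The sketch below is not the paper's matrix inversion algorithm; it only draws one sample and counts $S$ and $S_{j}$, and it computes $E[S]$ by linearity of expectation. It assumes that units are lost independently of one another, each with probability $1-p$, and the function names `simulate_counts` and `expected_unobserved` are hypothetical names chosen for this illustration.

```python
import math
import random
from collections import Counter


def simulate_counts(M, n, p, rng):
    """Draw one sample and return (S, S_j).

    M   : list of category sizes M_i (hypothetical input format)
    n   : sample size, drawn without replacement
    p   : probability that a sampled unit is retained (lost with 1 - p)
    rng : a random.Random instance
    Assumption: losses are independent across units.
    """
    # Expand the population into one label per unit.
    population = [i for i, m in enumerate(M) for _ in range(m)]
    sample = rng.sample(population, n)  # sampling without replacement
    # Thin the sample: keep each unit with probability p.
    retained = [c for c in sample if rng.random() < p]
    per_category = Counter(retained)  # category -> retained units
    occupancy = Counter(per_category.values())  # j -> number of categories
    S = len(M) - len(per_category)  # categories with no retained unit
    S_j = {j: occupancy.get(j, 0) for j in range(1, n + 1)}
    return S, S_j


def expected_unobserved(M, n, p):
    """E[S] by linearity of expectation, under independent losses.

    Category i is unobserved if all k of its sampled units are lost,
    where k follows a hypergeometric distribution.
    """
    total = sum(M)
    denom = math.comb(total, n)
    expectation = 0.0
    for m in M:
        for k in range(0, min(m, n) + 1):
            prob_k = math.comb(m, k) * math.comb(total - m, n - k) / denom
            expectation += prob_k * (1.0 - p) ** k
    return expectation
```

Averaging the first return value of `simulate_counts` over many draws should approach `expected_unobserved` for the same `M`, `n`, and `p`, which gives a quick sanity check on either function.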

Citation

Sungsu Kim, Chong Jin Park. "On the number of unobserved and observed categories when sampling from a multivariate hypergeometric population." Braz. J. Probab. Stat. 32 (2) 309-319, May 2018. https://doi.org/10.1214/16-BJPS344

Information

Received: 1 February 2015; Accepted: 1 November 2016; Published: May 2018
First available in Project Euclid: 17 April 2018

zbMATH: 06914677
MathSciNet: MR3787756
Digital Object Identifier: 10.1214/16-BJPS344

Keywords: factorial moments, matrix inversion method, multinomial distribution, multivariate hypergeometric distribution, occupancy problem, Stirling's number of the second kind

Rights: Copyright © 2018 Brazilian Statistical Association
