The Annals of Statistics

Asymptotic normality of a nonparametric estimator of sample coverage

Cun-Hui Zhang and Zhiyi Zhang

Full-text: Open access


This paper establishes a necessary and sufficient condition for the asymptotic normality of the nonparametric estimator of sample coverage proposed by Good [Biometrika 40 (1953) 237–264]. The new condition extends the validity of the asymptotic normality beyond the previously proven cases.
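
For readers unfamiliar with the estimator in question, the following is a minimal sketch of Good's coverage estimator (Turing's formula) in its usual form: one minus the proportion of the sample made up of species observed exactly once. The function name `turing_coverage_estimate` and the list-of-labels input are illustrative assumptions, not notation from the paper, and the sketch does not touch the asymptotic normality result itself.

```python
from collections import Counter


def turing_coverage_estimate(sample):
    """Estimate sample coverage as 1 - N1 / n (Turing's formula).

    `sample` is a sequence of hashable species labels (an assumed input
    format). N1 is the number of species seen exactly once and n is the
    sample size.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    counts = Counter(sample)
    # N1: number of distinct species that appear exactly once in the sample.
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n


if __name__ == "__main__":
    # Seven observations; species "b" and "d" are singletons, so N1 = 2.
    observations = ["a", "a", "b", "c", "c", "c", "d"]
    print(turing_coverage_estimate(observations))
```

In the example, two of the seven observations are singletons, so the estimated coverage is 1 − 2/7 = 5/7.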

Article information

Ann. Statist., Volume 37, Number 5A (2009), 2582-2595.

First available in Project Euclid: 15 July 2009

Primary: 62F10: Point estimation 62F12: Asymptotic properties of estimators 62G05: Estimation 62G20: Asymptotic properties
Secondary: 62F15: Bayesian inference

Sample coverage; Turing’s formula; asymptotic normality


Zhang, Cun-Hui; Zhang, Zhiyi. Asymptotic normality of a nonparametric estimator of sample coverage. Ann. Statist. 37 (2009), no. 5A, 2582–2595. doi:10.1214/08-AOS658.


  • [1] Chao, A. (1981). On estimating the probability of discovering a new species. Ann. Statist. 9 1339–1342.
  • [2] Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scand. J. Statist. 11 265–270.
  • [3] Chao, A. and Lee, S. (1992). Estimating the number of classes via sample coverage. J. Amer. Statist. Assoc. 87 210–217.
  • [4] Efron, B. and Thisted, R. (1976). Estimating the number of unseen species: How many words did Shakespeare know? Biometrika 63 435–447.
  • [5] Esty, W. W. (1982). Confidence intervals for the coverage of low coverage samples. Ann. Statist. 10 190–196.
  • [6] Esty, W. W. (1983). A normal limit law for a nonparametric estimator of the coverage of a random sample. Ann. Statist. 11 905–912.
  • [7] Esty, W. W. (1985). Estimation of the number of classes in a population and the coverage of a sample. Math. Sci. 10 41–50.
  • [8] Esty, W. W. (1986a). The size of a coinage. Numismatic Chronicle 146 185–215.
  • [9] Esty, W. W. (1986b). The efficiency of Good’s nonparametric coverage estimator. Ann. Statist. 14 1257–1260.
  • [10] Good, I. J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika 40 237–264.
  • [11] Good, I. J. and Toulmin, G. H. (1956). The number of new species, and the increase in population coverage, when a sample is increased. Biometrika 43 45–63.
  • [12] Harris, B. (1959). Determining bounds on integrals with applications to cataloging problems. Ann. Math. Statist. 30 521–548.
  • [13] Harris, B. (1968). Statistical inference in the classical occupancy problem, unbiased estimation of the number of classes. J. Amer. Statist. Assoc. 63 837–847.
  • [14] Holst, L. (1981). Some asymptotic results for incomplete multinomial or Poisson samples. Scand. J. Statist. 8 243–246.
  • [15] Mao, C. X. and Lindsay, B. G. (2002). A Poisson model for the coverage problem with a genomic application. Biometrika 89 669–681.
  • [16] Quackenbush, J., Cho, J., Lee, D., Liang, F., Holt, I., Karamycheva, S., Parvizi, B., Pertea, G., Sultana, R. and White, J. (2001). The TIGR gene indices: Analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res. 29 159–164.
  • [17] Robbins, H. E. (1968). Estimating the total probability of the unobserved outcomes of an experiment. Ann. Math. Statist. 39 256–257.
  • [18] Starr, N. (1979). Linear estimation of probability of discovering a new species. Ann. Statist. 7 644–652.
  • [19] Thisted, R. and Efron, B. (1987). Did Shakespeare write a newly-discovered poem? Biometrika 74 445–455.
  • [20] Zhang, C.-H. (2005). Estimation of sums of random variables: Examples and information bounds. Ann. Statist. 33 2022–2041.