Journal of Applied Probability

Convergence properties in certain occupancy problems including the Karlin-Rouault law

Estáte V. Khmaladze


Abstract

Let $x$ denote a vector of length $q$ consisting of 0s and 1s. It can be interpreted as an 'opinion' comprised of a particular set of responses to a questionnaire consisting of $q$ questions, each having $\{0, 1\}$-valued answers. Suppose that the questionnaire is answered by $n$ individuals, thus providing $n$ 'opinions'. Probabilities of the answer 1 to each question can be, basically, arbitrary and different for different questions. Out of the $2^q$ different opinions, what number, $\mu_n$, would one expect to see in the sample? How many of these opinions, $\mu_n(k)$, will occur exactly $k$ times? In this paper we give an asymptotic expression for $\mu_n / 2^q$ and the limit for the ratios $\mu_n(k)/\mu_n$, when the number of questions $q$ increases along with the sample size $n$ so that $n = \lambda 2^q$, where $\lambda$ is a constant. Let $p(x)$ denote the probability of opinion $x$. The key step in proving the asymptotic results as indicated is the asymptotic analysis of the joint behaviour of the intensities $np(x)$. For example, one of our results states that, under certain natural conditions, for any $z > 0$, $\sum \mathbf{1}\{np(x) > z\} = d_n z^{-u}$, $d_n = o(2^q)$.
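The quantities in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's method, only a simulation under stated assumptions: answers to different questions are drawn independently, and the function names, the answer probabilities in `probs`, and the values of `q` and `lam` are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def simulate_opinions(q, lam, probs, seed=0):
    """Draw n = round(lam * 2**q) opinions of length q.

    Assumption (for illustration only): question j is answered 1 with
    probability probs[j], independently of the other questions.
    """
    rng = random.Random(seed)
    n = round(lam * 2 ** q)
    counts = Counter(
        tuple(int(rng.random() < p) for p in probs) for _ in range(n)
    )
    return n, counts


def occupancy_summary(counts):
    """Return mu_n (number of distinct opinions observed) and a dict
    mapping k to mu_n(k), the number of opinions observed exactly k times."""
    mu_n = len(counts)
    mu_n_k = dict(Counter(counts.values()))
    return mu_n, mu_n_k


# Hypothetical setup: q = 10 questions, lam = 1, so n = 2**10 = 1024.
q, lam = 10, 1.0
probs = [0.3 + 0.4 * j / (q - 1) for j in range(q)]  # illustrative values
n, counts = simulate_opinions(q, lam, probs)
mu_n, mu_n_k = occupancy_summary(counts)

print("n =", n)
print("mu_n / 2^q =", mu_n / 2 ** q)
for k in sorted(mu_n_k)[:5]:
    print(f"mu_n({k}) / mu_n =", mu_n_k[k] / mu_n)
```

Repeating the run for increasing `q` with `lam` held fixed gives a rough empirical view of the ratios $\mu_n / 2^q$ and $\mu_n(k)/\mu_n$ whose limits the paper studies.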

Article information

Source
J. Appl. Probab., Volume 48, Number 4 (2011), 1095-1113.

Dates
First available in Project Euclid: 16 December 2011

Permanent link to this document
https://projecteuclid.org/euclid.jap/1324046021

Digital Object Identifier
doi:10.1239/jap/1324046021

Mathematical Reviews number (MathSciNet)
MR2896670

Zentralblatt MATH identifier
1231.62013

Subjects
Primary: 62D05: Sampling theory, sample surveys; 62E20: Asymptotic distribution theory; 60E05: Distributions: general theory; 60F10: Large deviations

Keywords
Number of unique outcomes; sparse tables; Karlin-Rouault law; Zipf's law; Good-Turing index; large deviations; contiguity

Citation

Khmaladze, Estáte V. Convergence properties in certain occupancy problems including the Karlin-Rouault law. J. Appl. Probab. 48 (2011), no. 4, 1095–1113. doi:10.1239/jap/1324046021. https://projecteuclid.org/euclid.jap/1324046021


