Source: Ann. Probab. Volume 32, Number 3B (2004), 2765-2818.
In the standard formulation of the occupancy problem one considers the distribution of r balls in n cells, with each ball assigned independently to a given cell with probability 1/n. Although closed-form expressions can be given for the distribution of various interesting quantities (such as the fraction of cells that contain a given number of balls), these expressions are often of limited practical use. Approximations provide an attractive alternative, and in the present paper we consider a large deviation approximation as r and n tend to infinity. In order to analyze the problem we first consider a dynamical model, where the balls are placed in the cells sequentially and “time” corresponds to the number of balls that have already been thrown. A complete large deviation analysis of this “process level” problem is carried out, and the rate function for the original problem is then obtained via the contraction principle. The variational problem that characterizes this rate function is analyzed, and a fairly complete and explicit solution is obtained. The minimizing trajectories and minimal cost are identified up to two constants, and the constants are characterized as the unique solution to an elementary fixed-point problem. These results are then used to solve a number of interesting problems, including an overflow problem and the partial coupon collector’s problem.
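The dynamical model described above can be illustrated with a short simulation. The following is a minimal sketch, not the paper's method: it throws balls one at a time into uniformly chosen cells and records, after each throw, the fraction of cells holding exactly k balls. The function names `simulate_occupancy` and `expected_empty_fraction` are hypothetical, introduced here only for illustration; the only formula used is the elementary fact that the expected fraction of empty cells after r throws is (1 - 1/n)^r.

```python
import random
from collections import Counter


def simulate_occupancy(n, r, seed=None):
    """Throw r balls sequentially into n cells chosen uniformly at random.

    Returns a list `path` of length r + 1, where path[t] maps k to the
    fraction of cells that hold exactly k balls after t balls are thrown.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    counts = [0] * n               # counts[i] = balls currently in cell i
    hist = Counter({0: n})         # hist[k] = number of cells with k balls
    path = [{0: 1.0}]              # at time 0 every cell is empty
    for _ in range(r):
        cell = rng.randrange(n)    # each cell is chosen with probability 1/n
        k = counts[cell]
        hist[k] -= 1               # the cell leaves occupancy level k ...
        if hist[k] == 0:
            del hist[k]
        hist[k + 1] += 1           # ... and joins level k + 1
        counts[cell] = k + 1
        path.append({j: c / n for j, c in hist.items()})
    return path


def expected_empty_fraction(n, r):
    """Expected fraction of empty cells after r throws: (1 - 1/n)**r."""
    return (1.0 - 1.0 / n) ** r


if __name__ == "__main__":
    n, r = 1000, 2000
    path = simulate_occupancy(n, r, seed=0)
    print("simulated empty fraction:", path[r].get(0, 0.0))
    print("expected empty fraction: ", expected_empty_fraction(n, r))
```

For large n the simulated empty fraction is typically close to its expectation; the large deviation analysis in the paper concerns the exponentially small probability of paths that stray far from this typical behavior.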