In the standard formulation of the occupancy problem one considers the distribution of r balls in n cells, with each ball assigned independently to a given cell with probability 1/n. Although closed form expressions can be given for the distribution of various interesting quantities (such as the fraction of cells that contain a given number of balls), these expressions are often of limited practical use. Approximations provide an attractive alternative, and in the present paper we consider a large deviation approximation as r and n tend to infinity. In order to analyze the problem we first consider a dynamical model, where the balls are placed in the cells sequentially and “time” corresponds to the number of balls that have already been thrown. A complete large deviation analysis of this “process level” problem is carried out, and the rate function for the original problem is then obtained via the contraction principle. The variational problem that characterizes this rate function is analyzed, and a fairly complete and explicit solution is obtained. The minimizing trajectories and minimal cost are identified up to two constants, and the constants are characterized as the unique solution to an elementary fixed point problem. These results are then used to solve a number of interesting problems, including an overflow problem and the partial coupon collector’s problem.
"Large deviation asymptotics for occupancy problems." Ann. Probab. 32 (3B) 2765 - 2818, July 2004. https://doi.org/10.1214/009117904000000135