## The Annals of Applied Probability

### Bins and balls; Large deviations of the empirical occupancy process

#### Abstract

In the random allocation model, balls are sequentially inserted at random into $n$ exchangeable bins. The occupancy score of a bin denotes the number of balls inserted in this bin. The (random) distribution of occupancy scores defines the object of this paper: the empirical occupancy measure, which is a probability measure over the integers. This measure-valued random variable packages many useful statistics. This paper characterizes the large deviations of the flow of empirical occupancy measures as $n$ goes to infinity while the number of inserted balls remains proportional to $n$. The main result is a Sanov-like theorem for the empirical occupancy measure when the set of probability measures over the integers is endowed with metrics that are slightly stronger than the total variation distance. Thanks to a coupling argument, this result applies to the degree distribution of sparse random graphs.
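
To make the central object concrete, the following is a minimal simulation sketch and is not taken from the paper. It assumes each ball is placed uniformly and independently among the bins (one simple instance of exchangeable bins), and the function name `empirical_occupancy_measure` and its parameters are hypothetical. It returns the empirical occupancy measure as a mapping from an occupancy score $k$ to the fraction of bins holding exactly $k$ balls.

```python
import random
from collections import Counter


def empirical_occupancy_measure(n_bins, n_balls, seed=None):
    """Simulate one allocation and return {score k: fraction of bins with k balls}.

    Assumption (not from the paper): balls are placed uniformly and
    independently among the bins.
    """
    rng = random.Random(seed)
    scores = [0] * n_bins  # occupancy score of each bin
    for _ in range(n_balls):
        scores[rng.randrange(n_bins)] += 1  # insert one ball into a random bin
    counts = Counter(scores)  # number of bins sharing each occupancy score
    return {k: counts[k] / n_bins for k in sorted(counts)}


if __name__ == "__main__":
    # The number of balls stays proportional to the number of bins (ratio 2 here).
    mu = empirical_occupancy_measure(n_bins=1000, n_balls=2000, seed=0)
    for k, mass in mu.items():
        print(k, mass)
```

By construction the returned masses sum to 1, and the mean of the measure equals the ratio of balls to bins, which is the quantity held fixed in the regime described above.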

#### Article information

Source
Ann. Appl. Probab., Volume 12, Number 2 (2002), 607-636.

Dates
First available in Project Euclid: 17 July 2002

https://projecteuclid.org/euclid.aoap/1026915618

Digital Object Identifier
doi:10.1214/aoap/1026915618

Mathematical Reviews number (MathSciNet)
MR1910642

Zentralblatt MATH identifier
1013.60017

#### Citation

Boucheron, Stéphane; Gamboa, Fabrice; Léonard, Christian. Bins and balls; Large deviations of the empirical occupancy process. Ann. Appl. Probab. 12 (2002), no. 2, 607--636. doi:10.1214/aoap/1026915618. https://projecteuclid.org/euclid.aoap/1026915618
