Convergence of point processes associated with coupon collector's and Dixie cup problems

We prove that, in the coupon collector's problem, the point processes given by the times of r-th arrivals for coupons of each type, centered and normalized in a proper way, converge toward a non-homogeneous Poisson point process. This result is then used to derive some generalizations and infinite-dimensional extensions of classical limit theorems on the topic.


Introduction
The coupon collector's problem (CCP) and its generalization known as the Dixie cup problem (DCP) belong to the classics of combinatorial probability. They are stated as follows: a person collects coupons, each of which belongs to one of n different types. The coupons arrive one by one at discrete times, the type of each coupon being equiprobable and independent of the types of the preceding ones. Let $T_c^{(n)}$ stand for the (random) number of coupons a person needs to collect in order to assemble c ∈ N complete collections. The most typical questions concern the asymptotics of $\mathbb{E}T_c^{(n)}$ and distributional limit theorems for $T_c^{(n)}$ themselves as n → ∞. Sometimes, the case c = 1 is referred to as CCP and the case c ≥ 2 as DCP. It should be noted that the terminology is not well established in the literature: sometimes both problems are referred to as CCP or, on the contrary, as DCP. The above terminology follows [10] and [7].
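The setting just described can be illustrated by a direct simulation. The following Python sketch is only an illustration, not part of the original argument: the function names `collect_time`, `mean_collect_time` and `harmonic_mean_time` are hypothetical, and coupon types are labelled 0, ..., n − 1 for convenience. For c = 1 it relies on the standard fact that the expected collection time equals n(1 + 1/2 + ... + 1/n).

```python
import random


def collect_time(n, c, rng):
    """Number of coupons drawn until each of the n types has appeared at least c times."""
    counts = [0] * n
    complete = 0  # number of types already seen at least c times
    draws = 0
    while complete < n:
        i = rng.randrange(n)  # equiprobable type, independent of the past
        counts[i] += 1
        draws += 1
        if counts[i] == c:
            complete += 1
    return draws


def mean_collect_time(n, c, trials, seed=0):
    """Monte Carlo estimate of the expected collection time."""
    rng = random.Random(seed)
    return sum(collect_time(n, c, rng) for _ in range(trials)) / trials


def harmonic_mean_time(n):
    """Exact expectation n * H_n for the classical case c = 1."""
    return n * sum(1.0 / k for k in range(1, n + 1))
```

For instance, `mean_collect_time(50, 1, 2000)` should be close to `harmonic_mean_time(50)`, while `mean_collect_time(50, 2, 2000)` gives an empirical value for the Dixie cup case c = 2.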
CCP, DCP and their further generalizations have a long history going back to de Moivre, Euler and Laplace. Since the 1960s, an extensive literature on the topic has appeared. In particular, we recall here a classical result by Erdős and Rényi [5]:
$$\mathbb{E}T_c^{(n)} = n\ln n + (c-1)\,n\ln\ln n + n\bigl(\gamma - \ln (c-1)!\bigr) + o(n), \tag{1.1}$$
$$\lim_{n\to\infty}\mathbb{P}\Bigl\{\frac{T_c^{(n)}}{n} - \ln n - (c-1)\ln\ln n \le x\Bigr\} = \exp\Bigl\{-\frac{e^{-x}}{(c-1)!}\Bigr\}, \quad x\in\mathbb{R}, \tag{1.2}$$
with γ = −Γ′(1) standing for the Euler-Mascheroni constant. Subsequently, the theory was developed and generalized in different directions: non-equal probabilities of coupon types (there is plenty of literature: [17], [3], [1], to cite just a few), various random sceneries ([11], [4], [7]), collecting pairs ([16], [8]) and so on. A nice and quite elementary introduction to the topic is given in [6].
In his seminal paper, Holst [10] proposed a fruitful poissonization idea which made it possible to prove limit results like (1.1), (1.2) while avoiding intricate combinatorial calculations. In a very recent paper by Glavaš and Mladenović [8], the connections between CCP and Poisson processes were shown to be even tighter. As a matter of fact, it was proved that the point processes given by the times of first arrivals for coupons of each type, centered and normalized in a proper way, converge toward a non-homogeneous Poisson point process as n → ∞. The above convergence is, as usual, understood as the distributional one in the space of all locally finite point measures endowed with the vague topology. The proof is based on rather delicate combinatorial arguments. As for the DCP, the authors do not consider the corresponding results, confining themselves to pointing out that, within the framework of their methods, the relevant formulations and proofs would require many more technical details.
Inspired by this paper, the present note pursues a threefold objective. Firstly, we generalize the above result to the case of DCP. Secondly, to this end, we develop a specific approach involving a poissonization technique in the spirit of [10] and a coupling-based depoissonization procedure. This allows us to avoid the sophisticated combinatorial machinery used in [8]. Thirdly, we demonstrate the power of this result in applications: it can be used to easily derive some generalizations and infinite-dimensional extensions of classical limit theorems on the topic.

Preliminaries and notation
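Let $Y_{i,r}^{(n)}$, $i \in \mathbb{N}_n = \{1, \dots, n\}$, $r, n \in \mathbb{N}$, stand for the time the r-th coupon of type i arrives. So, $Y_{i,r}^{(n)} \sim \mathrm{NegBin}\bigl(r, \tfrac{1}{n}\bigr)$, where by NegBin we mean that version of the negative binomial distribution which counts trials until (and including) the r-th success:
$$\mathbb{P}\bigl\{Y_{i,r}^{(n)} = k\bigr\} = \binom{k-1}{r-1}\Bigl(\frac{1}{n}\Bigr)^{r}\Bigl(1-\frac{1}{n}\Bigr)^{k-r}, \quad k \ge r.$$
For fixed r and n, the random variables $Y_{i,r}^{(n)}$, $i \in \mathbb{N}_n$, are, clearly, identically distributed but not independent, at least because $Y_{i,r}^{(n)} \ne Y_{j,r}^{(n)}$ for $i \ne j$.

In order to cope with this dependency, following [10], we consider a poissonized scheme. That is, we assume that coupons arrive at random times with independent Exp(1)-distributed intervals $E_j$, j ∈ N. More formally, we introduce the unit-rate independently marked Poisson point process $\Xi^{(n)} = \sum_{k=1}^{\infty} \delta_{(X_k, M_k)}$ with the uniform mark distribution on $\mathbb{N}_n$: $\mathbb{P}\{M_k = i\} = \tfrac{1}{n}$, $i \in \mathbb{N}_n$. Here $X_k = E_1 + \dots + E_k$ stand for the arrival times of coupons, $M_k$ for their types, and $\delta_u$ for the Dirac measure $\mathbb{1}\{u \in \cdot\}$. Hence, by Theorem 5.8 in [14],
$$\Xi_i^{(n)} = \sum_{k=1}^{\infty} \mathbb{1}\{M_k = i\}\,\delta_{X_k}, \quad i \in \mathbb{N}_n,$$
are independent $\tfrac{1}{n}$-rate Poisson point processes. The process $\Xi_i^{(n)}$ describes arrivals of coupons of the i-th type. In this setting, the random variables $Y_{i,r}^{(n)}$ introduced at the beginning of the section admit the following representation:
$$Y_{i,r}^{(n)} = \min\Bigl\{k \in \mathbb{N}\colon \sum_{j=1}^{k} \mathbb{1}\{M_j = i\} = r\Bigr\}. \tag{2.1}$$

Let $Z_{i,r}^{(n)}$, $i \in \mathbb{N}_n$, $r, n \in \mathbb{N}$, stand for the time the r-th coupon of type i arrives in the above poissonized scheme. So,
$$Z_{i,r}^{(n)} = X_{Y_{i,r}^{(n)}} = \sum_{j=1}^{Y_{i,r}^{(n)}} E_j. \tag{2.2}$$
Thus, $Y_{i,r}^{(n)}$ and $Z_{i,r}^{(n)}$ are now given on a common probability space and coupled by (2.2). Moreover, the sequence $\bigl(Y_{i,r}^{(n)}, i \in \mathbb{N}_n, r \in \mathbb{N}\bigr)$ is independent of $(E_j, j \in \mathbb{N})$, since, by (2.1), the former sequence is determined solely by the marks $M_k$.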
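The coupling between the discrete and the poissonized schemes can be sketched in code. The snippet below is a minimal illustration under the assumptions of this section (uniform marks, independent Exp(1) inter-arrival intervals); the function name `poissonized_arrivals` is hypothetical, and types are labelled 0, ..., n − 1 rather than 1, ..., n.

```python
import random


def poissonized_arrivals(n, r, seed=0):
    """Simulate marks M_k and Exp(1) gaps E_j until every type has arrived r times.

    Returns (Y, Z): Y[i] is the discrete index of the r-th coupon of type i,
    Z[i] = E_1 + ... + E_{Y[i]} is the time of the same arrival in continuous time.
    """
    rng = random.Random(seed)
    counts = [0] * n
    Y = [None] * n
    Z = [None] * n
    k = 0        # discrete time: number of coupons received so far
    x = 0.0      # continuous time X_k of the k-th coupon
    remaining = n
    while remaining > 0:
        k += 1
        x += rng.expovariate(1.0)   # next Exp(1) inter-arrival interval
        i = rng.randrange(n)        # mark of the k-th coupon, uniform over types
        counts[i] += 1
        if counts[i] == r:
            Y[i] = k
            Z[i] = x
            remaining -= 1
    return Y, Z
```

By construction, both sequences are driven by the same marks, so the types are ordered in the same way by `Y` and by `Z`; only the time scale differs.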
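Let us denote
$$\psi_r^{(n)}(t) = \frac{t}{n} - \ln n - (r-1)\ln\ln n, \quad t \in \mathbb{R}. \tag{2.3}$$
The main object of our study is the (centered and normalized by means of $\psi_r^{(n)}$) point process of r-th arrivals of different types:
$$\xi_r^{(n)} = \sum_{i=1}^{n} \delta_{\psi_r^{(n)}(Y_{i,r}^{(n)})}. \tag{2.4}$$
In what follows, we will also need the counterpart of $\xi_r^{(n)}$ in the poissonized setting:
$$\eta_r^{(n)} = \sum_{i=1}^{n} \delta_{\psi_r^{(n)}(Z_{i,r}^{(n)})}.$$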

Main result
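Before proceeding to the main result, we recall some basic definitions related to convergence of point processes (see [19], [20], or [12] for details). Let $M_p(\mathbb{R})$ denote the space of all locally finite point measures on R. For $\mu, \mu_1, \mu_2, \ldots \in M_p(\mathbb{R})$, the sequence $\mu_n$ is said to converge vaguely to µ (denoted as $\mu_n \xrightarrow{v} \mu$) if $\int_{\mathbb{R}} f\,d\mu_n \to \int_{\mathbb{R}} f\,d\mu$ for each continuous compactly supported f : R → [0, +∞). The set $M_p(\mathbb{R})$ endowed with the corresponding topology can be metrized as a complete separable metric space. This setting allows us to consider the distributional convergence of point processes $\xi, \xi_1, \xi_2, \ldots$, denoted as $\xi_n \xrightarrow{vd} \xi$. The main result of this note, Theorem 3.1 below, asserts that the point processes $\xi_r^{(n)}$ converge in this sense toward a non-homogeneous Poisson process.

Theorem 3.1. For any r ∈ N, $\xi_r^{(n)} \xrightarrow{vd} \xi_r$ as n → ∞, where $\xi_r$ is a Poisson point process on R whose intensity measure has the density $\frac{e^{-x}}{(r-1)!}$, x ∈ R, with respect to the Lebesgue measure.

Remark 3.1. The limiting process $\xi_r$ allows for a simple interpretation. Let ξ be a stationary unit-rate Poisson point process restricted to (0, +∞), and
$$h(x) = -\ln\bigl((r-1)!\,x\bigr), \quad x > 0. \tag{3.1}$$
Then $\xi_r \overset{d}{=} \xi \circ h^{-1}$. Indeed, by the mapping theorem (see, e.g., Theorem 5.1 in [14]), $\xi \circ h^{-1}$ is a Poisson process with intensity measure of the form $\mathrm{Leb} \circ h^{-1}$, where Leb stands for the Lebesgue measure. Since, for any [a, b] ⊂ R,
$$\mathrm{Leb} \circ h^{-1}[a, b] = \frac{e^{-a} - e^{-b}}{(r-1)!} = \int_a^b \frac{e^{-x}}{(r-1)!}\,dx,$$
the result follows.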
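The mapping argument of Remark 3.1 can be checked numerically. The sketch below assumes that h(x) = −ln((r − 1)! x), i.e. that h is the inverse of the function e^{−x}/(r − 1)! which appears later in the text as the inverse of h; the helper names `h`, `h_inv`, `mapped_count` and `expected_count` are hypothetical. It simulates a unit-rate Poisson process on (0, +∞), maps its points by h, and counts those falling into [a, b].

```python
import math
import random


def h(x, r):
    """Assumed form of the map h: (0, +inf) -> R."""
    return -math.log(math.factorial(r - 1) * x)


def h_inv(y, r):
    """Inverse of h: y -> exp(-y) / (r - 1)!."""
    return math.exp(-y) / math.factorial(r - 1)


def mapped_count(a, b, r, rng):
    """Number of points of the mapped process in [a, b] for one realization."""
    t_max = h_inv(a, r)          # h is decreasing: points beyond t_max land below a
    count = 0
    t = rng.expovariate(1.0)     # first point of the unit-rate Poisson process
    while t <= t_max:
        if a <= h(t, r) <= b:
            count += 1
        t += rng.expovariate(1.0)
    return count


def expected_count(a, b, r):
    """Mass that the intensity e^{-x} / (r - 1)! assigns to [a, b]."""
    return (math.exp(-a) - math.exp(-b)) / math.factorial(r - 1)
```

Averaging `mapped_count(a, b, r, rng)` over many realizations should give a value close to `expected_count(a, b, r)`, in agreement with the intensity measure computed in the remark.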
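We divide the proof of Theorem 3.1 into several steps. First, we prove a similar result in the poissonized setting.

Lemma 3.1. For any r ∈ N, $\eta_r^{(n)} \xrightarrow{vd} \xi_r$ as n → ∞.

Proof. As the r-th point of the $\tfrac{1}{n}$-rate Poisson process $\Xi_i^{(n)}$, the random variable $Z_{i,r}^{(n)}$ has the gamma distribution with shape r and rate $\tfrac{1}{n}$. Denote its density by $f_{Z_{i,r}^{(n)}}$. Then, for all n large enough, the density $g_n$ of $\psi_r^{(n)}(Z_{i,r}^{(n)})$ is given by
$$g_n(x) = n f_{Z_{i,r}^{(n)}}\bigl(nx + n\ln n + (r-1)n\ln\ln n\bigr) = \frac{e^{-x}}{n\,(r-1)!}\Bigl(\frac{x + \ln n + (r-1)\ln\ln n}{\ln n}\Bigr)^{r-1}.$$
Hence, for any x ∈ R,
$$n\,g_n(x) \to \frac{e^{-x}}{(r-1)!}, \quad n \to \infty,$$
and, moreover, this convergence is uniform over bounded sets. Then, for each such Borel set B,
$$n\,\mathbb{P}\bigl\{\psi_r^{(n)}(Z_{1,r}^{(n)}) \in B\bigr\} \to \int_B \frac{e^{-x}}{(r-1)!}\,dx = \mathbb{E}\,\xi_r(B)$$
as n → ∞, and so, by Proposition 3.12 in [19],
$$n\,\mathbb{P}\bigl\{\psi_r^{(n)}(Z_{1,r}^{(n)}) \in \cdot\bigr\} \xrightarrow{v} \mathbb{E}\,\xi_r(\cdot).$$
Taking into account the independence of $Z_{i,r}^{(n)}$ for different i, the well-known fact on the convergence of binomial point processes toward a Poisson one (see, e.g., the warm-up in the proof of Proposition 3.21 in [19]) delivers the claim.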
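The limit underlying this step can be explored numerically. The sketch below rests on two assumptions that are not spelled out in the surviving text: that the r-th arrival time of a fixed type in the poissonized scheme is gamma distributed with shape r and rate 1/n, and that the relevant limit is e^{−x}/(r − 1)!. The function names are hypothetical. The closed form of the gamma (Erlang) tail makes the computation deterministic.

```python
import math


def erlang_tail(t, r, n):
    """P{Z > t} for Z gamma distributed with integer shape r and rate 1/n."""
    u = t / n
    return math.exp(-u) * sum(u ** k / math.factorial(k) for k in range(r))


def scaled_tail(x, r, n):
    """n * P{Z > n x + n ln n + (r - 1) n ln ln n}; requires n > 1."""
    t = n * x + n * math.log(n) + (r - 1) * n * math.log(math.log(n))
    return n * erlang_tail(t, r, n)


def limit_tail(x, r):
    """Conjectured limit of scaled_tail as n grows: e^{-x} / (r - 1)!."""
    return math.exp(-x) / math.factorial(r - 1)
```

For r = 1 the two quantities coincide for every n, whereas for r ≥ 2 the ratio `scaled_tail(x, r, n) / limit_tail(x, r)` approaches 1 only at a logarithmic rate in n, so very large values of n (passed as floats, e.g. `1e100`) are needed to see the limit clearly.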
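In the next stage, we will need some depoissonization procedure in order to turn $\eta_r^{(n)}$ into $\xi_r^{(n)}$. Such depoissonization techniques usually involve bounds on distances between random elements in poissonized and depoissonized settings (see, e.g., Lemma 1.4 in [18] for random variables and, as its application, Theorem 3.2 in [15] for point processes). Since $\xi_r^{(n)}$ and $\eta_r^{(n)}$ are coupled by means of (2.2), a bound of this kind is available in our setting as well.

Lemma 3.2. For any a < b, ε > 0 and n ∈ N,
$$\mathbb{P}\bigl\{\xi_r^{(n)}[a,b] \ne \eta_r^{(n)}[a,b]\bigr\} \le \frac{c_r}{\varepsilon^4 n} + \mathbb{P}\bigl\{\eta_r^{(n)}[a-\varepsilon, a+\varepsilon] \ge 1\bigr\} + \mathbb{P}\bigl\{\eta_r^{(n)}[b-\varepsilon, b+\varepsilon] \ge 1\bigr\} \tag{3.2}$$
with some $c_r > 0$.

Proof. The idea of (3.2) is pretty simple. Roughly speaking, there may be two reasons for $\xi_r^{(n)}[a,b] \ne \eta_r^{(n)}[a,b]$: either some point of $\xi_r^{(n)}$ is farther than ε from the corresponding point of $\eta_r^{(n)}$, or some point of $\eta_r^{(n)}$ lies within ε of one of the endpoints a, b. The first term on the right-hand side of (3.2) is responsible for the first reason while the rest for the second one. We proceed to the implementation.

Fix an ε > 0. Then, by (2.3) and (2.2), for each $i \in \mathbb{N}_n$,
$$\mathbb{P}\bigl\{\bigl|\psi_r^{(n)}(Y_{i,r}^{(n)}) - \psi_r^{(n)}(Z_{i,r}^{(n)})\bigr| > \varepsilon\bigr\} = \mathbb{P}\Bigl\{\Bigl|\sum_{j=1}^{Y_{i,r}^{(n)}} (E_j - 1)\Bigr| > n\varepsilon\Bigr\}.$$
So, by Markov inequality, the latter does not exceed
$$\Delta_{\varepsilon,n} = (n\varepsilon)^{-4}\,\mathbb{E}\Bigl(\sum_{j=1}^{Y_{1,r}^{(n)}} (E_j - 1)\Bigr)^{4}.$$
Since $E_j - 1$ are centered i.i.d. and independent of $Y_{1,r}^{(n)}$, $\Delta_{\varepsilon,n}$ can be easily calculated by a standard conditioning argument:
$$\Delta_{\varepsilon,n} = (n\varepsilon)^{-4}\Bigl(m_4\,\mathbb{E}Y_{1,r}^{(n)} + 3m_2^2\,\mathbb{E}\bigl[Y_{1,r}^{(n)}\bigl(Y_{1,r}^{(n)} - 1\bigr)\bigr]\Bigr)$$
with $m_k$ standing for $\mathbb{E}(E_1 - 1)^k$. As $Y_{1,r}^{(n)} \sim \mathrm{NegBin}\bigl(r, \tfrac{1}{n}\bigr)$, straightforward calculations lead to the bound $\Delta_{\varepsilon,n} \le c_r \varepsilon^{-4} n^{-2}$, n ∈ N, with some $c_r > 0$. Now we can finally bound the probability that some point of $\xi_r^{(n)}$ is far away from the corresponding point of $\eta_r^{(n)}$: by subadditivity,
$$\mathbb{P}\bigl\{\exists\, i \in \mathbb{N}_n\colon \bigl|\psi_r^{(n)}(Y_{i,r}^{(n)}) - \psi_r^{(n)}(Z_{i,r}^{(n)})\bigr| > \varepsilon\bigr\} \le n\,\Delta_{\varepsilon,n} \le \frac{c_r}{\varepsilon^4 n}.$$
So, (3.2) follows from the reasoning at the beginning of the proof.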
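The moment computation behind the bound on $\Delta_{\varepsilon,n}$ can be made explicit. The sketch below is an illustration under stated assumptions, not the authors' calculation: it uses the standard identity E(S_N)^4 = m_4 EN + 3 m_2^2 E[N(N − 1)] for a sum S_N of N centered i.i.d. terms with N independent of the summands, the central moments m_2 = 1 and m_4 = 9 of Exp(1), and the first two moments of the negative binomial distribution counting trials. The function names and the specific constant 3r(r + 2) are hypothetical; the text only asserts that some constant c_r exists.

```python
def delta_bound(eps, n, r):
    """(n*eps)^{-4} * E(sum_{j<=Y} (E_j - 1))^4 for Y ~ NegBin(r, 1/n), E_j ~ Exp(1)."""
    m2, m4 = 1.0, 9.0                              # central moments of Exp(1)
    mean_y = r * n                                 # E Y
    fact_mom_y = r * (r + 1) * n ** 2 - 2 * r * n  # E Y(Y - 1)
    fourth_moment = m4 * mean_y + 3 * m2 ** 2 * fact_mom_y
    return fourth_moment / (n * eps) ** 4


def c_r(r):
    """One admissible constant with delta_bound <= c_r / (eps^4 n^2) (an assumption of this sketch)."""
    return 3 * r * (r + 2)


def union_bound(n, r):
    """n * delta_bound with the choice eps = n^{-1/5} used later in the proof."""
    return n * delta_bound(n ** -0.2, n, r)
```

With ε = n^{−1/5}, the value `union_bound(n, r)` is of order n^{−1/5} and hence vanishes as n grows, which is what the final part of the proof of the main result relies on.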
In order to deal with the last two terms on the right-hand side of (3.2), we will need an easy technical lemma on the distributional convergence of point processes.

Lemma 3.3. Let $X^{(n)}$, n ∈ N, and X be point processes on R such that $X^{(n)} \xrightarrow{vd} X$, and X has a diffuse intensity measure. If $(I_n, n \in \mathbb{N})$ is a decreasing sequence of intervals such that $I_n \downarrow I$ for some interval I, then
$$\lim_{n\to\infty} \mathbb{P}\bigl\{X^{(n)}(I_n) = k\bigr\} = \mathbb{P}\bigl\{X(I) = k\bigr\}, \quad k \in \mathbb{N} \cup \{0\}.$$

Proof. The proof is based on an application of the Skorokhod coupling (see, e.g., [20], p. 41). With the latter in mind, we may assume that $X^{(n)}$, n ∈ N, and X are given on a common probability space, and $X^{(n)} \xrightarrow{v} X$ a.s. So,
$$\bigl|X^{(n)}(I_n) - X(I)\bigr| \le \bigl|X^{(n)}(I) - X(I)\bigr| + X^{(n)}(I_n \setminus I). \tag{3.3}$$
Since the intensity measure of X is assumed diffuse, X(∂I) = 0 a.s., and the first term on the right-hand side of (3.3) a.s. vanishes as n → ∞ due to the vague convergence. To deal with the second term, let us fix N ∈ N. For all n ≥ N,
$$X^{(n)}(I_n \setminus I) \le X^{(n)}(I_N \setminus I),$$
and the right-hand side converges a.s. toward $X(I_N \setminus I)$. In other words,
$$\limsup_{n\to\infty} X^{(n)}(I_n \setminus I) \le X(I_N \setminus I) \quad \text{a.s.}$$
Letting N → ∞ proves that the second term on the right-hand side of (3.3) a.s. vanishes as n → ∞ too. Hence, $X^{(n)}(I_n) \to X(I)$ a.s., and so $X^{(n)}(I_n) \xrightarrow{d} X(I)$, which clearly implies the claim.
We may now proceed to the final part of the proof.
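Proof of Theorem 3.1. In (3.2), let us take $\varepsilon = n^{-1/5}$. Then, Lemmas 3.1 and 3.3 imply that, as n → ∞,
$$\mathbb{P}\bigl\{\eta_r^{(n)}\bigl[a - n^{-1/5}, a + n^{-1/5}\bigr] \ge 1\bigr\} \to \mathbb{P}\bigl\{\xi_r\{a\} \ge 1\bigr\} = 0,$$
and the same holds for b. Together with the foregoing inequality, we have
$$\mathbb{P}\bigl\{\xi_r^{(n)}[a,b] \ne \eta_r^{(n)}[a,b]\bigr\} \to 0, \quad n \to \infty.$$
Let $\mathcal{U}$ stand for the ring of finite unions of bounded closed segments in R. For each $U \in \mathcal{U}$ and $k \in \mathbb{N} \cup \{0\}$,
$$\bigl|\mathbb{P}\bigl\{\xi_r^{(n)}(U) = k\bigr\} - \mathbb{P}\bigl\{\eta_r^{(n)}(U) = k\bigr\}\bigr| \le \mathbb{P}\bigl\{\xi_r^{(n)}(U) \ne \eta_r^{(n)}(U)\bigr\} \to 0, \quad n \to \infty.$$
Moreover, by Lemma 3.1,
$$\lim_{n\to\infty} \mathbb{P}\bigl\{\eta_r^{(n)}(U) = k\bigr\} = \mathbb{P}\bigl\{\xi_r(U) = k\bigr\}.$$
So, the last two formulas imply
$$\lim_{n\to\infty} \mathbb{P}\bigl\{\xi_r^{(n)}(U) = k\bigr\} = \mathbb{P}\bigl\{\xi_r(U) = k\bigr\}.$$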
Notice that the process $\xi_r$ is simple, since it is a Poisson process with diffuse intensity measure (see, e.g., Proposition 6.9 in [14]). Hence, $\xi_r^{(n)} \xrightarrow{vd} \xi_r$ on R by Theorem 4.15 in [12].

Applications
From Theorem 3.1, we may easily deduce a number of known limit results which were often originally proved by direct complicated calculations. Moreover, this approach allows one to obtain some far-reaching generalizations and infinite-dimensional extensions of those results. Finally, an application of Theorem 3.1 often makes it possible to clarify some related surprising phenomena. As an example, consider $T_{r,m}^{(n)}$, m ≤ n − 1, the first time when some n − m (unspecified) of the n coupon types have already arrived at least r times each, and put $T_{r,m}^{(n)} = 0$ for m ≥ n. Various limit theorems for the case r = 1 were studied in [2], [21], [9], see also §1.2 in [13]. In particular, Theorem 4 in [2] asserts that
$$\frac{T_{1,m}^{(n)}}{n} - \ln n \xrightarrow{d} -\ln\frac{Q}{2}, \quad n \to \infty,$$
where Q follows $\chi^2_{2m+2}$, the $\chi^2$-distribution with 2m + 2 degrees of freedom. Theorem 4.1 below gives both the generalization for any r ∈ N and the infinite-dimensional extension in the sense of distributional convergence in $\mathbb{R}^\infty$, and also clarifies why the $\chi^2$-distribution appears.
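Consider the random elements $V_r^{(n)}$ and $V_r$ in $\mathbb{R}^\infty$ given by
$$V_r^{(n)} = \Bigl(\psi_r^{(n)}\bigl(T_{r,m}^{(n)}\bigr),\ m \in \mathbb{N} \cup \{0\}\Bigr), \qquad V_r = \Bigl(-\ln\Bigl((r-1)!\sum_{j=1}^{m+1} E_j\Bigr),\ m \in \mathbb{N} \cup \{0\}\Bigr),$$
where $\psi_r^{(n)}$ is defined by (2.3), and $E_j$, j ∈ N, are i.i.d. Exp(1).

Theorem 4.1. For any r ∈ N, $V_r^{(n)} \xrightarrow{d} V_r$ in $\mathbb{R}^\infty$ as n → ∞.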
We now turn to the proof.
Proof of Theorem 4.1. We will need some additional notation. Fix $m \in \mathbb{N} \cup \{0\}$ and let [...] map $\sum_i \delta_{x_i}$ into the vector of its "last-but-j"th points (possibly infinite), 0 ≤ j ≤ m. In other words, [...] Further, we denote by $\mu|_K$ the point measure µ restricted to the compact K.

By Theorem 3.1 and the Skorokhod coupling, we may consider $\xi_r^{(n)}$, n ∈ N, and $\xi_r$ on a common probability space and assume that $\xi_r^{(n)} \xrightarrow{v} \xi_r$ a.s. [...] Note that, by (2.4), [...] and is thus just the projection of $V_r^{(n)}$ onto the first m + 1 coordinates (from 0-th to m-th). On the other hand, by Remark 3.1 and the i.i.d. property of inter-arrival times for ξ, we come to [...] which is, similarly, the projection of $V_r$ onto the first m + 1 coordinates. Hence, (4.3) proves the finite-dimensional convergence of $V_r^{(n)}$ to $V_r$. To complete the proof, it only remains to recall that in $\mathbb{R}^\infty$ the notions of finite-dimensional convergence and convergence in distribution are equivalent (see, e.g., [20], pp. 53-54).
We now consider another application of Theorem 3.1, namely, to "rare" coupon types. Recall that an arriving coupon is assumed to belong to any of the n types with the same probability $\tfrac{1}{n}$. But, due to random factors, some coupon types will need a long time until they arrive for the r-th time. Taking (1.2) into account, we will call a type $i \in \mathbb{N}_n$ x-rare, x ∈ R, if $Y_{i,r}^{(n)} \ge nx + n\ln n + (r-1)n\ln\ln n$. Denote by $C_r^{(n)}(x)$ the number of x-rare types:
$$C_r^{(n)}(x) = \sum_{i=1}^{n} \mathbb{1}\bigl\{Y_{i,r}^{(n)} \ge nx + n\ln n + (r-1)n\ln\ln n\bigr\}.$$
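The notion of x-rare types lends itself to a quick simulation. The following sketch counts, for one realization of the collection process, the types whose r-th arrival occurs at or after the threshold nx + n ln n + (r − 1) n ln ln n. The function names are hypothetical, types are labelled 0, ..., n − 1, and n ≥ 2 is assumed so that ln ln n is defined.

```python
import math
import random


def rth_arrival_times(n, r, rng):
    """Discrete times at which each of the n types arrives for the r-th time."""
    counts = [0] * n
    times = [0] * n
    done = 0
    k = 0
    while done < n:
        k += 1
        i = rng.randrange(n)
        counts[i] += 1
        if counts[i] == r:
            times[i] = k
            done += 1
    return times


def rare_count(x, n, r, rng):
    """Number of x-rare types in one realization (requires n >= 2)."""
    threshold = n * x + n * math.log(n) + (r - 1) * n * math.log(math.log(n))
    return sum(1 for y in rth_arrival_times(n, r, rng) if y >= threshold)


def mean_rare_count(x, n, r, trials, seed=0):
    """Monte Carlo estimate of the expected number of x-rare types."""
    rng = random.Random(seed)
    return sum(rare_count(x, n, r, rng) for _ in range(trials)) / trials
```

For r = 1 and moderate n, `mean_rare_count(x, n, 1, trials)` should already be close to e^{−x}; for r ≥ 2 the agreement with e^{−x}/(r − 1)! can be expected to improve only slowly as n grows.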
Below we state and prove a functional limit theorem for $C_r^{(n)} = \bigl(C_r^{(n)}(x),\ x \in \mathbb{R}\bigr)$ in the Skorokhod $J_1$-topology.
Let $N = (N(t),\ t \ge 0)$ be a homogeneous unit-rate Poisson process considered not as a random point measure but classically, as a Lévy process with Poisson increments. Actually, N(t) = ξ(0, t], where ξ is introduced in Remark 3.1. Also, let $N_r(x) = N\bigl(\frac{e^{-x}}{(r-1)!}\bigr)$, x ∈ R.

Theorem 4.2. For any r ∈ N, $C_r^{(n)} \xrightarrow{d} N_r$ as n → ∞ in D(R) endowed with the $J_1$-topology.

Remark 4.1. Strictly speaking, $C_r^{(n)}$ and $N_r$ are càglàd, not càdlàg, and we should have considered $D_{\mathrm{left}}(\mathbb{R})$, the space of left-continuous functions with finite right limits, instead of D(R). But as the $J_1$-topology may be introduced on $D_{\mathrm{left}}(\mathbb{R})$ in the same way as on D(R), we will disregard these differences (cf. Remark 3.2 on p. 58 in [20]).
Proof. Let $h^{\leftarrow}(x) = \frac{e^{-x}}{(r-1)!}$, x ∈ R, be the inverse function to h given by (3.1). Theorem 3.1 and Remark 3.1 imply that, by the continuous mapping, $\xi_r^{(n)} \circ (h^{\leftarrow})^{-1} \xrightarrow{vd} \xi$ on (0, +∞). ([...] [19].) According to Theorem 4.20 in [12], the distributional convergence of random point measures on (0, +∞) in the vague topology is equivalent to that of the associated cumulative processes in the $J_1$-topology. So, $\xi_r^{(n)} \circ (h^{\leftarrow})^{-1}(0, \cdot] \to N(\cdot)$ in the latter sense, and thus, by transfer, ξ