Asymptotic distributions and chaos for the supermarket model

In the supermarket model there are n queues, each with a unit rate server. Customers arrive in a Poisson process at rate \lambda n, where 0<\lambda<1. Each customer chooses d \geq 2 queues uniformly at random, and joins a shortest one. It is known that the equilibrium distribution of a typical queue length converges to a certain explicit limiting distribution as n \to \infty. We quantify the rate of convergence by showing that the total variation distance between the equilibrium distribution and the limiting distribution is essentially of order n^{-1}; and we give a corresponding result for systems starting from quite general initial conditions (not in equilibrium). Further, we quantify the result that the systems exhibit chaotic behaviour: we show that the total variation distance between the joint law of a fixed set of queue lengths and the corresponding product law is essentially of order at most n^{-1}.


Introduction
We consider the following well-known scenario, often referred to as the 'supermarket model' [11,12,15,16,17]. Let d be a fixed integer at least 2. Let n be a positive integer and suppose that there are n servers, each with a separate queue. Customers arrive in a Poisson process at rate λn, where λ ∈ (0, 1) is a constant. Upon arrival each customer chooses d servers uniformly at random with replacement, and joins a shortest queue amongst those chosen. If there is more than one chosen server with a shortest queue, then the customer goes to the first such queue in her list of d. Service times are independent unit mean exponentials, and customers are served according to the first-come first-served discipline.
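The dynamics just described can be sketched as a small event-driven simulation. This is only an illustrative sketch: the function name `simulate_supermarket`, the empty initial state and the seeding are assumptions made here, not part of the paper. Potential departures are generated at total rate n and assigned to a uniformly chosen queue, which gives each busy server a unit service rate.

```python
import random

def simulate_supermarket(n, lam, d, t_end, seed=0):
    """Simulate the n-queue supermarket model up to time t_end.

    Events occur at total rate (lam + 1) * n. An event is an arrival with
    probability lam / (lam + 1); otherwise it is a potential departure from
    a uniformly chosen queue, ignored if that queue is empty.
    """
    rng = random.Random(seed)
    x = [0] * n                      # start empty (an assumption of this sketch)
    t = 0.0
    total_rate = (lam + 1.0) * n
    while True:
        t += rng.expovariate(total_rate)
        if t > t_end:
            return x
        if rng.random() < lam / (lam + 1.0):
            # d choices, uniformly at random with replacement
            choices = [rng.randrange(n) for _ in range(d)]
            # min() returns the first minimiser, matching the tie-break rule
            j = min(choices, key=lambda i: x[i])
            x[j] += 1
        else:
            j = rng.randrange(n)
            if x[j] > 0:
                x[j] -= 1

if __name__ == "__main__":
    x = simulate_supermarket(n=1000, lam=0.9, d=2, t_end=50.0)
    print("proportion of queues with length >= 2:",
          sum(1 for q in x if q >= 2) / len(x))
```

For large n and t_end the printed proportion can be compared informally with the limiting value λ^{(d^k−1)/(d−1)} at k = 2.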
Recent work on the supermarket model includes [2,3,4,9,11,12,20]. The survey [17] gives several applications and related results: an important application is to load balancing when computational tasks are dynamically assigned to servers in a large array. It is shown in [2,3] that the system exhibits propagation of chaos given a suitable initial state, and in particular if it is in equilibrium. This means that the paths of members of any fixed finite subset of queues are asymptotically independent of one another, uniformly on bounded time intervals. This result implies a law of large numbers for the time evolution of the proportion of queues of different lengths, more precisely for the empirical measure on path space [2,3]. In particular for each fixed positive integer $k_0$, as n tends to infinity the proportion of queues with length at least $k_0$ at time t converges weakly to a function $v_t(k_0)$, where $v_t(0) = 1$ for all $t \ge 0$ and $(v_t(k) : k \in \mathbb{N})$ is the unique solution to the system of differential equations
$$\frac{dv_t(k)}{dt} = \lambda\,\big(v_t(k-1)^d - v_t(k)^d\big) - \big(v_t(k) - v_t(k+1)\big) \qquad (1)$$
for $k \in \mathbb{N}$ (see [20]). Here one assumes appropriate initial conditions $v_0 = (v_0(k) : k \in \mathbb{N})$ such that $1 \ge v_0(1) \ge v_0(2) \ge \cdots \ge 0$, and $v_0 \in \ell_1$. Further, again for a fixed positive integer $k_0$, as n tends to infinity, in the equilibrium distribution the proportion of queues with length at least $k_0$ converges in probability to $\lambda^{(d^{k_0}-1)/(d-1)}$, and thus the probability that a given queue has length at least $k_0$ also converges to $\lambda^{(d^{k_0}-1)/(d-1)}$. Recent results in [11] include rapid mixing and two-point concentration for the maximum queue length in equilibrium.
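As a numerical illustration of the limiting behaviour, the sketch below integrates the differential equations with a forward Euler scheme and compares the result with the limiting tail probabilities λ^{(d^k−1)/(d−1)}. It assumes that the system has the standard form dv_t(k)/dt = λ(v_t(k−1)^d − v_t(k)^d) − (v_t(k) − v_t(k+1)) used in the supermarket-model literature; the truncation level `kmax`, the step size `h` and all function names are choices made for this sketch only.

```python
def fixed_point(lam, d, kmax):
    """Limiting tail probabilities lam^((d^k - 1)/(d - 1)) for k = 0..kmax."""
    return [lam ** ((d ** k - 1) / (d - 1)) for k in range(kmax + 1)]

def euler_step(v, lam, d, h):
    """One forward Euler step; v[0] is held at 1, and v beyond kmax is taken as 0."""
    new = v[:]
    kmax = len(v) - 1
    for k in range(1, kmax + 1):
        nxt = v[k + 1] if k < kmax else 0.0
        dv = lam * (v[k - 1] ** d - v[k] ** d) - (v[k] - nxt)
        new[k] = v[k] + h * dv
    return new

def integrate(lam, d, kmax=20, t_end=60.0, h=0.01):
    """Integrate from the empty system v_0 = (1, 0, 0, ...)."""
    v = [1.0] + [0.0] * kmax
    for _ in range(int(t_end / h)):
        v = euler_step(v, lam, d, h)
    return v

if __name__ == "__main__":
    v = integrate(0.5, 2)
    p = fixed_point(0.5, 2, 20)
    print("largest gap to the fixed point:",
          max(abs(a - b) for a, b in zip(v, p)))
```

One can check by hand that the limiting vector is a fixed point of this system, since λ·λ^{d(d^{k−1}−1)/(d−1)} = λ^{(d^k−1)/(d−1)}.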
The main contribution of the present paper is to give quantitative versions of the convergence results for the supermarket model mentioned above, and to extend them to hold uniformly over all times. We rely in part on combinatorial techniques developed in [10,11].
For each time $t \ge 0$ and each $j = 1, \ldots, n$ let $X^{(n)}_t(j)$ denote the number of customers in queue j, always including the customer currently being served if there is one. We shall keep the superscript 'n' in the notation in this section, but then usually drop it in later sections. We make the standard assumption of right-continuity of the sample paths. Let $X^{(n)}_t$ be the queue-lengths vector $(X^{(n)}_t(1), \ldots, X^{(n)}_t(n)) \in \mathbb{Z}^n_+$, where $\mathbb{Z}^n_+$ denotes the set of all n-vectors with components taking non-negative integer values. Note that the $\ell_1$-norm $\|X_t\|_1$ of $X_t$ is the total number of customers present at time t, and the $\ell_\infty$-norm $\|X_t\|_\infty$ is the maximum queue length.
For a given positive integer n, the n-queue process $(X^{(n)}_t)$ is an ergodic continuous-time Markov chain. Thus there is a unique stationary distribution $\Pi^{(n)}$ for the vector $X^{(n)}_t$; and, whatever the distribution of the starting state, the distribution of the vector $X^{(n)}_t$ at time t converges to $\Pi^{(n)}$ as $t \to \infty$. As already noted in [2,3] (and easily verified), the distribution $\Pi^{(n)}$ is exchangeable, that is invariant under permuting the co-ordinates. We shall usually write $Y^{(n)}_t$ to denote the queue-lengths vector in equilibrium: we drop the subscript t when no explicit reference to a particular time is needed.
The probability law of a random element X will be denoted by $L(X)$. The total variation distance between two probability distributions $\mu_1$ and $\mu_2$ is defined by $d_{TV}(\mu_1, \mu_2) = \sup_A |\mu_1(A) - \mu_2(A)|$, where the supremum is over all measurable sets A in the underlying measurable space (see also the start of Section 2). Also, given a vector $v = (v(k) : k = 0, 1, \ldots)$ such that
$$1 = v(0) \ge v(1) \ge v(2) \ge \cdots \ge 0 \quad \text{and} \quad v(k) \to 0 \text{ as } k \to \infty, \qquad (2)$$
let $L_v$ denote the law of a random variable V taking non-negative integer values, where $\Pr(V \ge k) = v(k)$ for each $k = 0, 1, 2, \ldots$. In fact, throughout this paper we shall work only with vectors $v \in \ell_1$. Finally, throughout we use the asymptotic notations $O()$, $\Omega()$, $o()$, $\omega()$ in a standard way, to describe the behaviour of functions depending on the number of servers n as n tends to infinity; for instance $f(n) = \Omega(g(n))$ means that, for some constants $c > 0$ and $n_0$, we have $f(n) \ge c\,g(n)$ for all $n \ge n_0$.
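For laws on the non-negative integers these definitions are easy to compute with. The sketch below builds the mass function of a law from its tail vector and evaluates the total variation distance as half the l1 distance between mass functions, which equals the supremum over sets on a countable space. The function names and the truncation convention (any mass beyond the last tail entry is placed at the last index) are assumptions of this sketch.

```python
def pmf_from_tail(v):
    """Mass function of the law with Pr(V >= k) = v[k], for a finite tail vector.

    Pr(V = k) = v[k] - v[k+1]; the mass beyond the truncation point is placed
    at the last index so that the masses sum to v[0].
    """
    return [v[k] - (v[k + 1] if k + 1 < len(v) else 0.0) for k in range(len(v))]

def tv_distance(p, q):
    """Total variation distance between two mass functions on {0, 1, 2, ...}."""
    m = max(len(p), len(q))
    p = list(p) + [0.0] * (m - len(p))
    q = list(q) + [0.0] * (m - len(q))
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

if __name__ == "__main__":
    lam = 0.5
    geometric = pmf_from_tail([lam ** k for k in range(40)])            # tail lam^k
    two_choice = pmf_from_tail([lam ** (2 ** k - 1) for k in range(8)])  # d = 2
    print(tv_distance(geometric, two_choice))
```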
We may now state four main results, two concerning approximating the distribution of a single typical queue length and two concerning collections of queues and chaos. These will be proved in the following sections, where we also present some further results.

Single queues
We first consider the n-queue system in equilibrium, and investigate how close the distribution of a typical queue length Y (n) (1) is to the limiting distribution.
Let $L_{\lambda,d}$ denote the law $L_v$ where $v(k) = \lambda^{(d^k-1)/(d-1)}$ for each $k = 0, 1, 2, \ldots$. It is known (and was mentioned earlier) that $L(Y^{(n)}(1))$ tends to $L_{\lambda,d}$ as $n \to \infty$: we now quantify this convergence.
Theorem 1.1 For each positive integer n let $Y^{(n)}$ be a queue-lengths n-vector in equilibrium, and consider the length $Y^{(n)}(1)$ of queue 1. Then $d_{TV}(L(Y^{(n)}(1)), L_{\lambda,d})$ is of order $n^{-1}$ up to logarithmic factors.
In fact, we shall see that the above total variation distance is $o(n^{-1} \ln^3 n)$ and is $\Omega(n^{-1})$. Also, we shall deduce directly from Theorem 1.1, together with a bound on the maximum queue length from [11] (given as (3) below), the following:
Corollary 1.2 For each positive integer k, the difference between the kth moment $E[Y^{(n)}(1)^k]$ and the kth moment of $L_{\lambda,d}$ is of order $n^{-1}$ up to logarithmic factors.
Now we drop the assumption that the system is in equilibrium, and consider its behaviour from the beginning (that is, from time 0). We state one theorem here, Theorem 1.3. More general though less digestible results (Theorems 3.3 and 3.4) are stated and proved in Section 3, and Theorem 1.3 will follow easily from Theorem 3.4. We assume in Theorem 1.3 that the initial queue lengths are iid and not too large; and see that the law of a typical queue length is close to $L_{v_t}$, where $v_t$ is the solution to the system (1) subject to the natural initial conditions. Given a queue-lengths vector x, that is a vector with non-negative integer co-ordinates, we let $u(k, x)$ be the proportion of queues with length at least k.
Theorem 1.3 There is a constant $\varepsilon > 0$ such that the following holds. Let the random variable X take non-negative values, and suppose that $E[e^{X/\varepsilon}]$ is finite. For each positive integer n, let $(X^{(n)}_t)$ be an n-queue process where the random initial queue lengths X

Collections of queues
The above results concern the distribution of a single queue length. We now consider collections of queues and propagation of chaos. The term "propagation of chaos" comes from statistical physics [7], and the original motivation was the evolution of particles in physical systems. The subject has since then received considerable attention culminating in the work of Sznitman [18].
Our results below establish chaoticity for the supermarket model. As before, we first discuss the equilibrium distribution. We see that for fixed r the total variation distance between the joint law of r queue lengths and the product law is at most $O(n^{-1})$, up to logarithmic factors. More precisely and more generally we have:
Theorem 1.4 For each positive integer n, let $Y^{(n)}$ be a queue-lengths n-vector in equilibrium. Then, uniformly over all positive integers $r \le n$, the total variation distance between the joint law of $Y^{(n)}(1), \ldots, Y^{(n)}(r)$ and the product law $L(Y^{(n)}(1))^{\otimes r}$ is at most $O(n^{-1} \ln^2 n\, (2 \ln\ln n)^r)$; and the total variation distance between the joint law of $Y^{(n)}(1), \ldots, Y^{(n)}(r)$ and the limiting product law $L_{\lambda,d}^{\otimes r}$ is at most $O(n^{-1} \ln^2 n\, (2 \ln\ln n)^{r+1})$.
Note that since the distribution of $Y^{(n)}$ is exchangeable, for any distinct indices $j_1, \ldots, j_r$ the joint distribution of $Y^{(n)}(j_1), \ldots, Y^{(n)}(j_r)$ is the same as that of $Y^{(n)}(1), \ldots, Y^{(n)}(r)$. Also, if d were 1 we would have 'exact' independence of all queues in equilibrium. Further, note that the above result can yield a bound less than 1 only if $r < \ln n / \ln\ln\ln n$; but as long as $r = o(\ln n / \ln\ln\ln n)$ the bound is $O(n^{-1+o(1)})$. We shall mention in Section 4 how this result relates to Sznitman's treatment of chaos in [18]: see in particular the inequalities (19) and (20) below. Now we drop the assumption that the system is in equilibrium, and consider its behaviour from time 0. We state one theorem here, Theorem 1.5. More general though less digestible results (Theorems 4.1 and 4.2) are stated and proved in Section 4, and Theorem 1.5 will follow easily from Theorem 4.2. As with Theorem 1.3 earlier, we assume in Theorem 1.5 that the initial queue lengths are iid and not too large: now we see that the joint law of r queue lengths is close to the product law, that is we have chaotic behaviour, uniformly for all times.
Given an n-queue process (X
We remark also that an $O(n^{-1})$ upper bound on the total variation distance between the law of a finite r-tuple of queues and the product law is known on fixed time intervals for iid initial queue lengths (see [3], Theorem 3.5). However, the bound in [3] grows exponentially in time, and does not extend to the equilibrium case. We are not aware of any earlier equilibrium bounds or time-uniform bounds like those given in Theorems 1.4, 1.5, 4.1 and 4.2.
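To get a feel for how the bound n^{-1} ln^2 n (2 ln ln n)^r behaves as r grows, the following sketch evaluates its logarithm and finds the largest r for which it is below 1. The implicit constant in the O() notation is not known from the text, so it is simply taken to be 1 here; that choice, and the function names, are assumptions of this sketch.

```python
import math

def log_bound(n, r):
    """Natural log of n^{-1} (ln n)^2 (2 ln ln n)^r, with the O() constant taken as 1."""
    ln_n = math.log(n)
    return -ln_n + 2 * math.log(ln_n) + r * math.log(2 * math.log(ln_n))

def largest_useful_r(n):
    """Largest r >= 0 with bound < 1 (returns 0 if even r = 1 gives a bound >= 1)."""
    if n < 16:
        raise ValueError("n must be at least 16 so that 2 ln ln n > 1")
    r = 0
    while log_bound(n, r + 1) < 0:
        r += 1
    return r

if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        threshold = math.log(n) / math.log(math.log(math.log(n)))
        print(n, largest_useful_r(n), round(threshold, 2))
```

Working in logarithms avoids overflow for large r, and the printed threshold ln n / ln ln ln n can be compared with the largest useful r.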
Finally, let us mention at this point that connections between rapid mixing, concentration of measure and chaoticity have been observed earlier in various other contexts. The reader is referred to [6,8,18,19] and references therein for more information.

Preliminaries
This section contains some definitions and some results from [11] needed in our proofs. We start with two lemmas which show that the supermarket model is rapidly mixing. Lemma 2.1 upper bounds the total variation distance and Lemma 2.2 upper bounds the Wasserstein distance between $L(X_t)$ and the equilibrium distribution $\Pi$.
We say that a real-valued function f defined on a subset A of $\mathbb{R}^n$ is 1-Lipschitz if $|f(x) - f(y)| \le \|x - y\|_1$ for all $x, y \in A$. Let us recall two equivalent definitions of the total variation distance $d_{TV}(\mu_1, \mu_2)$ between two probability distributions $\mu_1$ and $\mu_2$ on $\mathbb{Z}^n_+$, and two corresponding definitions of the Wasserstein distance $d_W(\mu_1, \mu_2)$. We have $d_{TV}(\mu_1, \mu_2) = \inf \Pr(X \ne Y)$ and $d_W(\mu_1, \mu_2) = \inf E[\|X - Y\|_1]$, where in each case the infimum is over all couplings of X and Y where $L(X) = \mu_1$ and $L(Y) = \mu_2$. Also, $d_{TV}(\mu_1, \mu_2) = \frac{1}{2} \sup_\phi \left| \int \phi\, d\mu_1 - \int \phi\, d\mu_2 \right|$, where the supremum is over measurable functions $\phi$ with $\|\phi\|_\infty \le 1$; and $d_W(\mu_1, \mu_2) = \sup_{f \in F_1} \left| \int f\, d\mu_1 - \int f\, d\mu_2 \right|$, where the supremum is over the set $F_1$ of 1-Lipschitz functions f on $\mathbb{Z}^n_+$. The total variation distance between the corresponding laws on $\mathbb{Z}^n_+$ is always at most the Wasserstein distance: to see this, note for example that $I_{X \ne Y} \le \|X - Y\|_1$.
Lemma 2.2 For each constant $c > \lambda/(1-\lambda)$ there exists a constant $\eta > 0$ such that the following holds for each positive integer n. Let $M^{(n)}$ denote the stationary maximum queue length. Consider any distribution of the initial queue-lengths vector X
We now introduce the natural coupling of n-queue processes $(X_t)$ with different initial states. Arrival times form a Poisson process at rate $\lambda n$, and there is a corresponding sequence of uniform choices of lists of d queues. Potential departure times form a Poisson process at rate n, and there is a corresponding sequence of uniform selections of a queue: potential departures from empty queues are ignored. These four processes are independent. Denote the arrival time process by $T$, the choices process by $D$, the potential departure time process by $\tilde T$ and the selection process by $\tilde D$.
Suppose that we are given a sequence of arrival times $t$ with corresponding queue choices $d$, and a sequence of potential departure times $\tilde t$ with corresponding selections $\tilde d$ of a queue (where all these times are distinct). For each possible initial queue-lengths vector $x \in \mathbb{Z}^n_+$ this yields a deterministic queue-lengths process $(x_t)$ with $x_0 = x$: let us write $x_t = s_t(x; t, d, \tilde t, \tilde d)$. Then for each $x \in \mathbb{Z}^n_+$, the process $(s_t(x; T, D, \tilde T, \tilde D))$ has the distribution of a queue-lengths process with initial state x. The following lemma from [11] shows that the coupling has certain desirable properties. Part (c) is not stated explicitly in [11], but it follows easily from part (b). Inequalities between vectors in the statement are understood to hold component by component. 'Adding a customer' means adding an arrival time and corresponding choice of queues.
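The coupling can be sketched in code by driving two copies of the process with one shared sequence of events. The sketch below works with the embedded jump chain only (event times are common to both copies, so they are omitted), and records the sup-norm distance between the copies after each event; the function names and the jump-chain simplification are assumptions of this sketch rather than the construction in [11].

```python
import random

def apply_arrival(x, choices):
    """Join the first shortest queue among the listed choices."""
    j = min(choices, key=lambda i: x[i])   # min() keeps the first minimiser
    x[j] += 1

def apply_departure(x, j):
    """Potential departure from queue j; ignored if the queue is empty."""
    if x[j] > 0:
        x[j] -= 1

def coupled_run(x0, y0, lam, d, n_events, seed=0):
    """Run two copies from x0 and y0 on the same arrivals, choices and selections."""
    rng = random.Random(seed)
    n = len(x0)
    x, y = list(x0), list(y0)
    sup_dist = []
    for _ in range(n_events):
        if rng.random() < lam / (lam + 1.0):
            choices = [rng.randrange(n) for _ in range(d)]
            apply_arrival(x, choices)
            apply_arrival(y, choices)
        else:
            j = rng.randrange(n)
            apply_departure(x, j)
            apply_departure(y, j)
        sup_dist.append(max(abs(a - b) for a, b in zip(x, y)))
    return x, y, sup_dist

if __name__ == "__main__":
    x, y, dist = coupled_run([0] * 10, [3, 0, 1, 0, 2, 0, 0, 4, 0, 1], 0.7, 2, 5000)
    print("order preserved:", all(a <= b for a, b in zip(x, y)))
    print("final sup-norm distance:", dist[-1])
```

Starting from x0 ≤ y0 componentwise, the two copies stay ordered under this coupling and their sup-norm distance never increases, which is the kind of property used later when comparing a process with one started from the empty state.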

Lemma 2.3 Fix any 4-tuple $t, d, \tilde t, \tilde d$ as above, and for each
(c) if $t'$ and $d'$ are obtained from $t$ and $d$ by adding some extra customers then,
Next, let us consider the equilibrium distribution, and note some upper bounds on the total number of customers in the system and on the maximum queue length established in [11].

Lemma 2.4 (a) For any constant
for each positive integer j and each time t ≥ 0.
We shall require an upper bound on the maximum length $M^{(n)}$ of a queue in equilibrium, from Section 7 of [11]. Let $i^* = i^*(n)$ be the smallest integer i such that λ
Now we state some concentration of measure results for the queue-lengths process $(X^{(n)}_t)$. Let us begin with the equilibrium case, where we use the notation $Y^{(n)}$. Recall that $F_1$ denotes the set of 1-Lipschitz functions on $\mathbb{Z}^n_+$. (We suppress the dependence on n.)

Lemma 2.5
There is a constant $c > 0$ such that the following holds. Let $n \ge 2$ be an integer and consider the n-queue system in equilibrium. Then for each $f \in F_1$ and each $u \ge 0$
Let $\ell(k, x)$ denote $|\{j : x(j) \ge k\}|$, the number of queues of length at least k. Thus $\ell(k, x) = n\,u(k, x)$. Tight concentration of measure estimates for the random variables $\ell(k, Y)$ may be obtained directly from the last lemma.
Also, there exists a constant $c > 0$ such that
The first and third parts of this lemma form Lemma 4.2 in [11]: the second part follows directly from the preceding lemma. We now present a time-dependent result, which will be essential in the proof of Theorem 3.3.
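The quantities ℓ(k, x) and u(k, x) are simple to compute from a queue-lengths vector; a minimal sketch follows, with function names chosen here for illustration. It also checks two elementary facts: summing ℓ(k, x) over k ≥ 1 recovers the total number of customers, and adding one customer changes ℓ(k, x) by at most 1, which is why ℓ(k, ·) is 1-Lipschitz.

```python
def ell(k, x):
    """Number of queues in x with length at least k."""
    return sum(1 for q in x if q >= k)

def u(k, x):
    """Proportion of queues in x with length at least k."""
    return ell(k, x) / len(x)

if __name__ == "__main__":
    x = [0, 2, 1, 3, 0, 1]
    print([ell(k, x) for k in range(5)])                       # [6, 4, 2, 1, 0]
    print(sum(ell(k, x) for k in range(1, max(x) + 1)) == sum(x))  # True
    y = list(x)
    y[0] += 1                                                  # add one customer
    print(max(abs(ell(k, y) - ell(k, x)) for k in range(6)))   # 1
```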

Lemma 2.7
There is a constant $c > 0$ such that the following holds. Let $n \ge 2$ be an integer, let $f \in F_1$, let $x_0 \in \mathbb{Z}^n_+$, and let $X^{(n)}_0 = x_0$ almost surely. Then for all times $t \ge 0$ and all $u > 0$,
We shall also use the following extension of Bernstein's inequality, which will follow easily for example from Theorem 3.8 of [14].
where $Z_1, \ldots, Z_n$ are independent real-valued random variables, such that $Z_j$ has variance at most $\sigma_j^2$ and range at most b, for each $j = 1, \ldots, n$. Let $v = \sum_{j=1}^n \sigma_j^2$. Then for each $u > 0$
Proof. Given fixed numbers $z_1, \ldots, z_{j-1}$ where $1 \le j \le n$, for each z let
Then the function g is 1-Lipschitz, so the random variable $g(Z_j)$ has variance at most $\sigma_j^2$ and range at most b. Thus, in the terms of Theorem 3.8 of [14], the 'sum of variances' is at most v, and the 'maximum deviation' is at most b; and so we may use that theorem to deduce the result here.
Let $u_t(k)$ denote $E[u(k, X_t)]$, the expected proportion of queues of length at least k at time t. Also let $u(k)$ denote $E[u(k, Y)]$, the expected proportion of queues with at least k customers when the process is in equilibrium. Then $u(0) = 1$ and $u(1) = \lambda$. Whatever the initial distribution of $X_0$, for each positive integer k,
and for Y in equilibrium
In [11], this last fact is used to show that $u(k)$ is close to $\lambda^{(d^k-1)/(d-1)}$; more precisely, for some constant $c > 0$, for each positive integer n

Distribution of a single queue length

Equilibrium case
In this subsection we prove Theorem 1.1. Let us note that the equilibrium distribution $\Pi$ is exchangeable, and thus all queue lengths are identically distributed. We begin by showing that (a) in equilibrium the total variation distance between the marginal distribution of a given queue length and the limiting distribution $L_{\lambda,d}$ is small; and (b) without assuming that the system is in equilibrium, a similar result holds after a logarithmic time, given a suitable initial distribution. [In fact, the result (a) will follow easily from (b), since by Lemma 2.4 and (3), if $X_0$ is in equilibrium and we set $c_0 > \lambda/(1-\lambda)$ then the quantity $\delta_n$ in Proposition 3.1 below is $o(n^{-1})$.] Part (a) includes the upper bound part of Theorem 1.1 above.
Let us note here that better bounds may be obtained in the simple case d = 1.
Here the n queues behave independently, and $L(Y(1)) = L_{\lambda,d}$ for each n. The arrival rate at each queue is always $\lambda$, regardless of the state of all the other queues. Then it follows from the proof of Lemma 2.1 in [11] (which is stated as Lemma 2.1 here) that we can drop the term $\Pr(\|X_0\|_1 > c_0 n)$ in the mixing bound of that lemma, and so part (b) of Proposition 3.1 holds with the bound $\delta_n + O(n^{-1} \ln^2 n \ln\ln n)$ replaced by $\Pr(\|X_0\|_\infty > c_0 \ln n) + O(n^{-K})$ for any constant K.
Proof of Proposition 3.1. Since the distribution of Y is exchangeable, $\Pr(Y(1) \ge k) = u(k)$ for each non-negative integer k. Note that $u(0) - u(1) = 1 - \lambda$. Part (a) now follows easily from (7), since for any positive integer $k_0$
. But if $k_0 \ge \ln\ln n / \ln d + c$, where $c = -\ln\ln(1/\lambda) / \ln d$, then $\lambda^{1+d+\cdots+d^{k_0}} \le n^{-1}$, and so also $u(k_0 + 1) = O(n^{-1} \ln^2 n)$.
We now show that the $O(n^{-1} \ln^2 n \ln\ln n)$ upper bound on the total variation distance between the equilibrium distribution of a given queue length and $L_{\lambda,d}$ in Proposition 3.1 (a) is fairly close to optimal. The next lemma will complete the proof of Theorem 1.1.
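The final step of the proof of part (a), namely that the stated choice of k_0 forces the tail term below n^{-1}, can be spelled out as follows. This is a routine verification written out here for convenience; it uses only 0 < λ < 1 and d ≥ 2.

```latex
\begin{align*}
d^{c} &= \exp\!\big(-\ln\ln(1/\lambda)\big) = \frac{1}{\ln(1/\lambda)},
\qquad\text{so}\qquad
d^{k_0} \;\ge\; d^{\,\ln\ln n/\ln d}\, d^{c} \;=\; \frac{\ln n}{\ln(1/\lambda)} ,\\[4pt]
\lambda^{1+d+\cdots+d^{k_0}}
 &\le \lambda^{d^{k_0}}
 = \exp\!\big(-d^{k_0}\ln(1/\lambda)\big)
 \le \exp(-\ln n) = n^{-1}.
\end{align*}
```

The first inequality in the second line holds because 1 + d + ... + d^{k_0} ≥ d^{k_0} and λ < 1.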

Lemma 3.2
For an n-queue system in equilibrium, the expected proportion $u(2)$ of queues of length at least 2 satisfies $u(2) \ge \lambda^{d+1} + \Omega(n^{-1})$. Hence $d_{TV}(L(Y(1)), L_{\lambda,d}) = \Omega(n^{-1})$.
Proof. Let $F_t = \ell(1, Y_t)$ be the number of non-empty queues at time t, and write F for $F_1$. We shall show that the variance of F is $\Omega(n)$, and from that we shall complete the proof quickly.
Recall that we model departures by a Poisson process at rate n (giving potential departure times) together with an independent selection process that picks a uniformly random queue at each event time of this process. If the queue selected is nonempty, then the customer currently in service departs; otherwise nothing happens.
Let Z be the number of arrivals in [0, 1]. By the last part of Lemma 2.3 we have the monotonicity result that for all non-negative integers x and z,
Let the integer $x = x(n)$ be a conditional median of F given that $Z = \lfloor \lambda n \rfloor$. (It is not hard to see that $x = \lambda n + O((n \ln n)^{1/2})$ but we do not use this here.) Since $\Pr(F \le x \mid Z = \lfloor \lambda n \rfloor) \ge \frac{1}{2}$ we have by (8) that
We shall find a constant $\delta > 0$ such that $\Pr(F \ge x + \delta n^{1/2}) \ge \frac{1}{14} + o(1)$, which will show that the variance of F is $\Omega(n)$ as required.
Let A be the event that Z ≥ λn+(λn)
if we ignore any extra customers), and further there are no potential departures from it in this period. We shall see that with high probability there are linearly many such queues. Let S be the number of shunned queues. Let $0 < \eta < (1 - \lambda)e^{-(1+d\lambda)}$. We now prove that
$$\Pr(S < \eta n) = e^{-\Omega(n)}. \qquad (9)$$
Choose $0 < \alpha < 1 - \lambda$ and $0 < \beta < \alpha e^{-\lambda d}$ such that $0 < \eta < \beta/e$. Let B be the event that the number $n - F_0$ of queues empty at time 0 is at least $\alpha n$. Since $E[F_0] = u(1)n = \lambda n$ (as we noted earlier), by Lemma 2.5 we have
. Let Q consist of the first $\lceil \alpha n \rceil \wedge (n - F_0)$ empty queues at time 0, and let R be the set of queues $j \in Q$ such that no basic customer has j in their list. Then
By the independent bounded differences inequality (see for example [13]), $\Pr(|R| \le \beta n \mid B) = e^{-\Omega(n)}$. Let $R'$ be the set of the first $\lceil \beta n \rceil \wedge |R|$ queues in R. For each queue in $R'$, independently there is no attempted service completion during [0, 1] with probability $e^{-1}$. Then, since $\eta < \beta/e$, conditional on $|R| \ge \beta n$ with probability $1 - e^{-\Omega(n)}$ at least $\eta n$ of the queues in $R'$ have no attempted service completion during [0, 1], and so are shunned. Putting the above together yields (9), since $\Pr(S < \eta n) \le \Pr(S < \eta n \mid |R| \ge \beta n) + \Pr(|R| < \beta n \mid B) + \Pr(\bar B)$, where $\bar B$ denotes the complement of B.
Let H (for 'hit') be the number of shunned queues which are the first choice of some extra customer. Our next aim is to show (10), which says that when the event A occurs usually H is large.
Let C be the set of extra customers whose first choice is a shunned queue. Let $z = \lfloor \frac{2}{3} \eta (\lambda n)^{1/2} \rfloor$. Let $C'$ consist of the first $|C| \wedge z$ customers in C. Let $\tilde Z$ denote a binomial random variable with parameters ⌊(λn)
Now if no shunned queue is first choice for more than two customers in $C'$, then $H \ge |C'| - H'$, where $H'$ is the number of shunned queues which are first choice for two customers in $C'$. But, given A and $S = s$, $E[H'] \le z^2/s$, and the probability that some shunned queue is first choice for more than two customers in $C'$ is at most $z^3/s^2$. So, setting $\delta = \frac{1}{3} \eta \lambda^{1/2}$, it follows from the above, using Markov's inequality, that
and so Pr(H ≥ δn
Next, we claim that $F \ge \tilde F + H$. To see this, start with the basic customers, and for each of the H 'hit' shunned queues throw in the first (extra) customer to hit it. With these customers we have exactly $\tilde F + H$ non-empty queues at time 1. If we now throw in any remaining extra customers, then by Lemma 2.3 (c) we have $F \ge \tilde F + H$ as claimed. Now
Hence,
which shows that the variance of F is $\Omega(n)$. Thus we have completed the first part of the proof.
For m sufficiently large we have $j^k \lambda^{j/2} \le 1$ for all $j \ge m$, and then the last bound is at most $n \sum_{j \ge m} \lambda^{j/2} = n\lambda^{m/2}/(1 - \lambda^{1/2})$. Now let $m_2 = \lceil 4 \ln n / \ln(1/\lambda) \rceil$. Then for n sufficiently large $E[X^k I_{X \ge m_2}] \le e^{-\ln n + O(1)}$. Putting the above together we have

Non-equilibrium case
Here we aim to prove Theorem 1.3, where we still consider a single queue length but we no longer assume that the system is in equilibrium. We shall first prove a rather general result, namely Theorem 3.3; and then deduce Theorem 3.4, from which Theorem 1.3 will follow easily. We consider the behaviour of the system from time 0, starting from general exchangeable initial conditions. Theorem 3.3 below shows that uniformly over all $t \ge 0$ the law of a typical queue length at time t is close to $L_{v_t}$, where $v_t = (v_t(k) : k = 0, 1, 2, \ldots)$ is the unique solution to the system (1) of differential equations subject to the natural initial conditions. The upper bound on the total variation distance involves three quantities $\delta_n$, $\gamma_n$ and $s_n$ defined in terms of the initial distribution for $X^{(n)}_0$. Here $\delta_n$ concerns the total number of customers and the maximum queue length, $\gamma_n$ concerns concentration of measure, and $s_n$ concerns how close the proportions of queues of length at least k are to the equilibrium proportions.
where the supremum is over all non-negative functions f ∈ F 1 bounded above by n.
It is known [20] that the vector $v^{(n)}_t$ satisfies (2) for each $t \ge 0$, and so $L_{v^{(n)}_t}$ is well-defined. Note that the above result is uniform over all positive integers n, all exchangeable initial distributions of $X^{(n)}_0$, and all times $t \ge 0$. We shall prove Theorem 3.3 shortly, but first let us give a corresponding result for a particular form of initial conditions, which we describe in three steps. For each n, we start with an initial vector x which is not 'too extreme'; then we allow small independent perturbations $Z_j$, where we quantify 'small' by bounding the moment generating function; and finally we perform an independent uniform random permutation of the n co-ordinates, in order to ensure exchangeability. If say $x = 0$ and the $Z_j$ are identically distributed (and non-negative) then we may avoid the last 'permuting' step in forming $X^{(n)}_0$, as that last step is simply to ensure that the distribution of X
Note that if we set x to be the zero vector above then we obtain the simpler result we stated earlier as Theorem 1.3. It remains here to prove Theorems 3.3 and 3.4.
Proof of Theorem 3.3. We aim to prove that there is a constant $c_2 > 0$ such that for any constant $c_1$ large enough and any constant $c_0$, there is a constant $\varepsilon > 0$ such that we have uniformly over the possible distributions for X
To see that the theorem will follow from (11), consider a state x with $\|x\|_\infty \le m$. Couple $(X_t)$ where $X_0 = x$ with $(\tilde X_t)$ where $\tilde X_0 = 0$ in the way described in the previous section. By Lemma 2.3, we always have $\|X_t - \tilde X_t\|_\infty \le m$. Hence always $u(k + m, X_t) \le u(k, \tilde X_t)$, and so $E[u(k + m, X_t)] \le E[u(k, \tilde X_t)] \le u(k)$ (where the last inequality again uses Lemma 2.3). Thus, dropping the condition that $X_0 = x$, for any non-negative integers m and k, we have $E[u(k + m, X_t) \mid \|X_0\|_\infty \le m] \le u(k)$; and so
We shall choose a (small) constant $\varepsilon > 0$ later, and let $m = \lceil 2\varepsilon \ln n \rceil$. Note that $u(k) = o(n^{-1})$ if $k = \Omega(\ln n)$ by (3), and that $\Pr(\|X_0\|_\infty > m) \le \delta_n$.
But now we may complete the argument as in the proof of Proposition 3.1 (a).
It remains to prove (11). First we deal with small t. We begin by showing that there is a constant $\tilde c > 0$ such that for each positive integer k, each integer $n \ge 2$, each $t > 0$ and each $w > 0$,
To prove this result, note that the left side above can be written as
. Then by Lemma 2.7, there is a constant $\tilde c$ such that for each $x \in S$
Also, the contribution from each $x \in \mathbb{Z}^n_+ \setminus S$ is at most $\Pr(X_0 = x)$, and summing shows that the total contribution is at most
. This completes the proof of (12). Now the function $E[\ell(k, X_t) \mid X_0 = x]$ is a 1-Lipschitz function of x by Lemma 2.3, and is non-negative and bounded above by n, so
where the supremum is over all functions $f \in F_1$ such that $0 \le f(x) \le n$ for each $x \in \mathbb{Z}^n_+$. Hence, replacing w by nw,
since $e^{-a/(b+c)} \le e^{-a/(2b)} + e^{-a/(2c)}$ for $a, b, c > 0$. Now let $c_1 \ge 2/c$ be a sufficiently large constant and set $w = 2\,(c_1 n^{-1} (1 + t) \ln n)^{1/2}$. Then by the last inequality, we have, uniformly over $t \ge 0$
Hence, arguing as in the proof of Lemma 4.2 in [11] (Lemma 2.6 above), uniformly
We shall use this result only for $0 \le t \le \ln n$. By (5) we have, uniformly over $0 \le t \le \ln n$ and over k,
Now recall that $v_t$ is the unique solution to (1) subject to the initial conditions $v_0(k) = u_0(k)$ for each k. Also, $2\lambda d + 2$ is a Lipschitz constant of equation (1) under the infinity norm. More precisely, $2\lambda d + 2$ is a Lipschitz constant of the operator A defined on the space of all vectors $v = (v(k) : k \in \mathbb{N})$ with components in [0, 1] by

Equation (1) may be expressed succinctly in terms of
Thus by Gronwall's Lemma (see for instance [1]) there exists a constant $c_3 > 0$ such that uniformly over $0 \le t \le \ln n$
$$\|u_t - v_t\|_\infty \le c_3\,(n^{-1} \ln^2 n + \gamma_n)\, \ln n \; e^{(2\lambda d+2)t}.$$
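The Lipschitz constant 2λd + 2 used in this argument can be checked directly. The sketch below assumes that the operator A is the right-hand side of the differential equations in their standard form, A(v)(k) = λ(v(k−1)^d − v(k)^d) − (v(k) − v(k+1)) with v(0) = 1; that explicit form is an assumption made here. For vectors v, w with components in [0, 1], and using |a^d − b^d| ≤ d|a − b| for a, b in [0, 1]:

```latex
\begin{align*}
|A(v)(k) - A(w)(k)|
 &\le \lambda\,\big|v(k-1)^d - w(k-1)^d\big| + \lambda\,\big|v(k)^d - w(k)^d\big| \\
 &\qquad + |v(k) - w(k)| + |v(k+1) - w(k+1)| \\
 &\le (2\lambda d + 2)\,\|v - w\|_\infty .
\end{align*}
```

Taking the supremum over k gives the Lipschitz bound in the infinity norm, and the factor e^{(2λd+2)t} in the Gronwall estimate is the usual exponential of the Lipschitz constant times t.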
Let $\varepsilon_1 = \frac{1}{2d+2}$ so $0 < \varepsilon_1 < \frac{1}{2\lambda d+2}$. Then there exists $\varepsilon_2 > 0$ such that, uniformly over t with $0 \le t \le \varepsilon_1 \ln n$, we have
For larger t we introduce the equilibrium distribution into the argument. Note that $\sup_k |u_t(k) - u(k)| \le n^{-1} d_W(L(X_t), \Pi)$. Consider Lemma 2.2 with $c = \max\{c_0, \frac{2\lambda}{1-\lambda}\}$, let $\eta > 0$ be as there, and suppose $2\varepsilon \le \eta \varepsilon_1$. By (3), there is a constant $c_4 > 0$ such that $\Pr(M \ge \ln\ln n / \ln d + c_4) = o(n^{-2})$. Hence by Lemma 2.2, there is an $\varepsilon_3 > 0$ such that, uniformly over $t \ge \varepsilon_1 \ln n$,
Also, by inequality (7) there is a constant $c_5$ such that for each n
Let $s_n(t) \ge 0$ be given by s 2 ) 2 θ k ; thus $s_n(0) = s_n$. We now use Theorem 2.12 in [4], which says that for some constants $\gamma_\theta > 0$ and $C_\theta < \infty$, $s_n(t)^2 \le e^{-\gamma_\theta t} C_\theta s_n^2$. Hence we deduce that there exists a constant $c_6 > 0$ such that for all $t \ge 0$
Combining the last three inequalities, we see that for some $\varepsilon_4 > 0$, uniformly over $t \ge \varepsilon_1 \ln n$, we have
This result together with (14) completes the proof of (11) (with $c_2 = \varepsilon_1/c_6$), and thus of the whole result.
Proof of Theorem 3.4. Clearly $X_0$ is exchangeable, and thus by symmetry so is $X_t$ for each $t \ge 0$. Let us ignore the last step in constructing the initial state, which involves the random permutation. Note that this does not affect $u_t = (u_t(k) : k = 0, 1, 2, \ldots)$, and similarly $v_0(k) = E[u(k, X_0)] = E[u(k, \tilde Z)]$.
We shall choose a small constant $\varepsilon > 0$ later, and assume that $E[e^{Z_j/\varepsilon}] \le \beta$. By Markov's inequality, for each n and each $b \ge 0$,
Let us take $b = b(n) = \ln^2 n$, and let $A = A(n)$ be the event that $\max_j Z_j \le b$. Then by (15)
Also, the variance of $Z_j$ given $Z_j \le b$ is at most $\mathrm{var}(Z_j)/\Pr(Z_j \le b)$, which is at most $2\beta$ for n sufficiently large that $\Pr(Z_j \le b) \ge \frac{1}{2}$. Thus given $Z_j \le b$, the variance of $\tilde Z_j$ is also at most $2\beta$, and further $\tilde Z_j$ has range at most $\frac{3}{2} b$, assuming that $2\varepsilon \le \ln n$. Hence, by Lemma 2.8, for each sufficiently large integer n and each $w > 0$,
4nβ+bw , and so the quantity $n\gamma_n$ in Theorem 3.3 is $O(n^{-1})$ provided the constant $c_1$ is large enough.
It remains to upper bound the term $s_n$. Note that
The second of these terms certainly takes a finite value independent of n, so it suffices to ensure that the first term
Using Markov's inequality again, $\Pr(Z_j \ge k/2) \le \beta e^{-k/(2\varepsilon)}$, and so the second term is bounded by a constant uniformly in n, provided $\theta < e^{1/\varepsilon}$. This is certainly true if $\varepsilon > 0$ is small enough. As for the first term, if we let $k_0 = 2\varepsilon \ln n + 2$, we have
if $\varepsilon$ is sufficiently small. Combining the above estimates, we obtain the required bound on $s_n$.

Asymptotic chaos

Equilibrium case
In this subsection we prove Theorem 1.4 concerning chaoticity of the queue-lengths process, where the system is stationary. This result quantifies the chaoticity in terms of the total variation distance between the joint law of the queue lengths and the corresponding product laws. In the proof we first bound another natural measure of these distances, following the treatment in [18]: see (19) and (20)
Let $F'_1$ be the set of functions $f \in F_1$ with $\|f\|_\infty \le n$. By the last result, uniformly over positive integers a,
Let $G_1$ denote the set of measurable real-valued functions $\phi$ on $\mathbb{Z}_+$ with $\|\phi\|_\infty \le 1$. For any measurable real-valued function $\phi$ on $\mathbb{Z}_+$ let $\bar\phi$ be the function on $\mathbb{Z}^n_+$ defined by setting $\bar\phi(y) = \frac{1}{n} \sum_{i=1}^n \phi(y_i)$ for $y = (y_1, \ldots, y_n) \in \mathbb{Z}^n_+$. Observe that
Observe also that since the distribution of Y is exchangeable we have $E[\bar\phi(Y)] = E[\phi(Y(1))]$: let us call this mean value $\hat\phi$.
Let $f(y) = (n/2)\bar\phi(y) = \frac{1}{2} \sum_{i=1}^n \phi(y_i)$. It is easy to see that if $\phi \in G_1$ then $f \in F'_1$. Hence by (16), uniformly over positive integers a,
$$\sup_{\phi_1, \ldots, \phi_a \in G_1} E\Big[ \prod_{s=1}^a |\bar\phi_s(Y) - \hat\phi_s| \Big] \le (2cn^{-1/2} \ln n)^a + o(a 2^a n^{-2}) \le (3cn^{-1/2} \ln n)^a \qquad (17)$$
for n sufficiently large. But, by writing $\bar\phi_s(Y)$ as $(\bar\phi_s(Y) - \hat\phi_s) + \hat\phi_s$, we see that
Hence by (17), uniformly over all positive integers $r \le n$
Now uniformly over r and all r-tuples $\phi_1, \ldots, \phi_r \in G_1$
since when we expand the middle expression there are at most $r^2 n^{r-1}$ terms for which the values of j are not all distinct. Hence from (18), uniformly over all positive integers $r \le n$,
(For r > n
Following Sznitman [18], the vector $Y = (Y(1), \ldots, Y(n))$ is chaotic (in total variation) since the left hand side of (19) tends to 0 as $n \to \infty$ for each fixed positive integer r. Thus (19) quantifies the chaoticity of the equilibrium queue lengths Y in terms of this definition. Similarly, the inequality (20) quantifies Y being $L_{\lambda,d}$-chaotic. Let us note that, up to factors logarithmic in n, the bound in (20) is of the same order as the time-dependent bound of Theorem 4.1 in [5] obtained for a related class of models.
The results (19) and (20) yield bounds on the total variation distance between the joint law of $Y(1), \ldots, Y(r)$ and the product law $L(Y(1))^{\otimes r}$, and between the joint law of $Y(1), \ldots, Y(r)$ and the product law $L_{\lambda,d}^{\otimes r}$ respectively, as follows.
[In general, even for random variables $Y(s)$ taking values in $\mathbb{Z}_+$ and $r = 2$, it does not follow that if the left hand side of (19) tends to 0 as $n \to \infty$ then the total variation distance between the joint law of $Y(1), \ldots, Y(r)$ and the product law $L(Y(1))^{\otimes r}$ tends to 0.] Putting $\phi_s$ as the indicator of the set $\{k_s\}$, we obtain, uniformly over positive integers $r \le n$
Hence in total variation distance, for any fixed $\varepsilon > 0$, the joint law of $Y(1), \ldots, Y(r)$ differs from the product law $L(Y(1))^{\otimes r}$ by at most $O(n^{-1} \ln^2 n\, ((1+\varepsilon) \ln\ln n / \ln d)^r)$, and by at most $O(n^{-1} \ln^2 n\, ((1+\varepsilon) \ln\ln n / \ln d)^{r+1})$ from the law $L_{\lambda,d}^{\otimes r}$.

Non-equilibrium case
We now no longer assume that the system is in equilibrium, and show that under quite general exchangeable initial conditions, chaotic behaviour occurs in the system, uniformly for all times. We need to prove Theorem 1.5. We first state two general results, Theorem 4.1 and 4.2, and note that Theorem 1.5 will follow easily from the latter. We then prove Theorem 4.1 and deduce Theorem 4.2.
Given an n-queue process $(X^{(n)}_t)$ and a positive integer $r \le n$, let $L^{(n,r)}_t$ denote the joint law of $X^{(n)}_t(1), \ldots, X^{(n)}_t(r)$, and let $\tilde L^{(n,r)}_t$ denote the product law of r independent copies of $X^{(n)}_t(1)$. The following result shows that, as long as initially there are not too many customers in the system, the maximum queue is not too long and the system has sufficient concentration of measure, there will be chaotic behaviour uniformly for all times.
is exchangeable (which implies that the law of X
$\le c\,[(n^{-1} \ln n)(m + \ln n) + \gamma_n]\,(m + 2 \ln\ln n)^r + (r+1)\,\delta_{n,m}$.
We shall see later that it is straightforward to deduce Theorem 1.4 from Theorem 4.1. Theorem 4.2 below is a straightforward consequence of Theorem 4.1 and shows that, in particular, there is chaotic behaviour uniformly for all times t ≥ 0, when the initial state is obtained by perturbing a "nice" queue-lengths vector by a set of 'small' independent random variables. Note that Theorem 1.5 will follow from this last result on taking x as the zero vector. It remains to prove Theorems 4.1 and 4.2.
Proof of Theorem 4.1. First we deal with small t. We start by arguing in a similar way to the proofs of Theorems 3.3 and 1.4. As in the inequality (13), there exists a constant $c_1 > 0$ such that uniformly over $t \ge 0$, over n, and over all non-negative functions $f \in F_1$ bounded above by n, we have
$$\Pr\big( |n^{-1} f(X_t) - E[n^{-1} f(X_t)]| \ge (c_1 (1 + t) n^{-1} \ln n)^{1/2} \big) \le o(n^{-2}) + \gamma_n .$$
With notation as in the proof of Theorem 1.4, since the distribution of X
Let $M^{(n)}$ denote the maximum queue length in equilibrium. Then, arguing as in the proof of Theorem 3.3, given that X