Discrete small world networks

Small world models are networks consisting of many local links and fewer long range 'shortcuts', used to model networks with a high degree of local clustering but relatively small diameter. Here, we concern ourselves with the distribution of typical inter-point network distances. We establish approximations to the distribution of the graph distance in a discrete ring network with extra random links, and compare the results to those for simpler models, in which the extra links have zero length and the ring is continuous.


Introduction
There are many variants of the mathematical model introduced by Watts and Strogatz [15] to describe the "small-world" networks popular in the social sciences; one of them, the great circle model of Ball et al. [4], actually precedes [15].
See [1] for a recent overview, as well as the books [5] and [8]. A typical description is as follows. Starting from a ring lattice with L vertices, each vertex is connected to all of its neighbours within distance k by an undirected edge. Then a number of shortcuts are added between randomly chosen pairs of sites. Interest centres on the statistics of the shortest distance between two (randomly chosen) vertices, when shortcuts are taken to have length zero.
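As a concrete illustration of this description (not part of the original analysis), the following sketch builds the ring lattice with k-neighbour connections, adds uniformly random shortcut edges, and measures graph distances by breadth-first search; the parameter values are arbitrary.

```python
import random
from collections import deque

def ring_with_shortcuts(L, k, n_shortcuts, rng):
    """Ring lattice on L vertices, each vertex joined to its k nearest
    neighbours on either side, plus n_shortcuts random shortcut edges."""
    adj = {v: set() for v in range(L)}
    for v in range(L):
        for j in range(1, k + 1):
            adj[v].add((v + j) % L)
            adj[(v + j) % L].add(v)
    for _ in range(n_shortcuts):
        u, w = rng.randrange(L), rng.randrange(L)
        if u != w:
            adj[u].add(w)
            adj[w].add(u)
    return adj

def graph_distance(adj, src, dst):
    """Shortest path length; every edge, shortcut or not, has length 1."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return dist[v]
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None  # unreachable (cannot happen on a connected ring)

rng = random.Random(1)
L, k = 1000, 2
adj = ring_with_shortcuts(L, k, 50, rng)
pairs = [(rng.randrange(L), rng.randrange(L)) for _ in range(200)]
mean_d = sum(graph_distance(adj, u, w) for u, w in pairs) / len(pairs)
print(mean_d)  # well below L/(4k) = 125, the shortcut-free average
```

Even a modest number of shortcuts collapses typical distances far below the pure-lattice value, which is the phenomenon studied in what follows.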
Newman, Moore and Watts [12], [13] proposed an idealized version, in which the lattice is replaced by a circle and distance along the circle is the usual arc length, shortcuts now being added between random pairs of uniformly distributed points. Within their [NMW] model, they made a heuristic computation of the mean distance between a randomly chosen pair of points. Then Barbour and Reinert [7] proved an asymptotic approximation for the distribution of this distance as the mean number Lρ of shortcuts tends to infinity; the parameter ρ describes the average intensity of end points of shortcuts around the circle. In this paper, we move from the continuous model back to a genuinely discrete model, in which the ring lattice consists of exactly L vertices, each with connections to the k nearest neighbours on either side, but in which the random shortcuts, being edges of the graph, are taken to have length 1; thus distance becomes the usual graph distance between vertices. However, this model is rather complicated to analyze, so we first present a simpler version, in which time runs in discrete steps, but the process still lives on the continuous circle, and which serves to illustrate the main qualitative differences between discrete and continuous models. This intermediate model would be reasonable for describing the spread of a simple epidemic, when the incubation time of the disease is a fixed value, and the infectious period is very short in comparison. In each of these more complicated models, we also show that the approximation derived for the [NMW] model gives a reasonable approximation to the distribution of inter-point distances, provided that ρ (or its equivalent) is small; here, the error in Kolmogorov distance is of order O(ρ^{1/3} log(1/ρ)), although the distribution functions are only O(ρ) apart in the bulk of the distribution.

The continuous circle model for discrete time
In this section, we consider the continuous model of [7], which consists of a circle C of circumference L, to which are added a Poisson Po(Lρ/2) number of uniform and independent random chords, but now with a new measure of distance between points P and Q. This distance is the minimum of d(γ) over paths γ along the graph between P and Q, where, if γ consists of s arcs of lengths l_1, …, l_s connected by shortcuts, then d(γ) := Σ_{r=1}^{s} ⌈l_r⌉, where, as usual, ⌈l⌉ denotes the smallest integer m ≥ l; shortcuts make no contribution to the distance. We are interested in asymptotics as Lρ → ∞, and so assume throughout that Lρ > 1.
We begin with a dynamic realization of the network, which describes, for each n ≥ 0, the set of points R(n) ⊂ C that can be reached from a given point P within time n, where time corresponds to the d(•) distance along paths.
Pick a Poisson Po(Lρ) number of uniformly and independently distributed 'potential' chords of the circle C; such a chord is an unordered pair of independent and uniformly distributed random points of C. Label one point of each pair with 1 and the other with 2, making the choices equiprobably, independently of everything else. We call the set of label 1 points Q, and, for each q ∈ Q, we let q′ = q′(q) denote the label 2 end point. Our construction realizes a random subset of these potential chords as shortcuts. We start by taking R(0) = {P} and B(0) = 1, and let time increase in integer steps. R(n) then consists of a union of B(n) intervals of C, each of which is increased by unit length at each end point at time n + 1, but with the rule that overlapping intervals are merged into a single interval; this defines a new union of B′(n + 1) intervals R′(n + 1); note that B′(n + 1) may be less than B(n). Furthermore, with M(n) denoting the number of intervals in the associated branching process, (1 + 2ρ)^{−n} M(n) forms a square integrable martingale, so that (1 + 2ρ)^{−n} M(n) → W_ρ a.s. for some W_ρ such that W_ρ > 0 a.s.
and EW_ρ = 1. Our strategy is to pick a starting point P, and to run both constructions up to an integer time τ_r, chosen in such a way that R(n) and S(n) are (almost) the same for n ≤ τ_r. Pick n_0 := ⌊log(Lρ)/(2 log(1 + 2ρ))⌋, where ⌊x⌋ denotes the largest integer no greater than x, and define φ_0 := (Lρ)^{−1/2}(1 + 2ρ)^{n_0}, so that (1 + 2ρ)^{−1} < φ_0 ≤ 1. Now let τ_r = n_0 + r, and assume that r satisfies (2.1), implying in particular that τ_r ≤ 2 log(Lρ)/(3 log(1 + 2ρ)). Write R_r = R(τ_r), S_r = S(τ_r), M_r = M(τ_r), and s_r = s(τ_r). Next, independently and uniformly, we pick a second point P′ ∈ C, and a second set of potential chords, Q′, and run both constructions for time τ_{r′}, where r′ also satisfies (2.1). Then, at least for small ρ, there are about φ_0² Lρ(1 + 2ρ)^{r+r′} pairs of intervals, with one in S_r and the other in S′_{r′}, and each is of typical length ρ^{−1}, so that the expected number of intersecting pairs of intervals is about 2φ_0²(1 + 2ρ)^{r+r′}, which, in the chosen range of r, r′, grows from almost nothing to the typically large value 2φ_0²(Lρ)^{1/3}. For later use, label the intervals in S_r as I_1, …, I_{M_r}, and the intervals in S′_{r′} as J_1, …, J_{N_{r′}}; then the number V_{r,r′} of intersecting pairs of intervals can be written as a sum of indicator variables X_{ij} of the events that I_i and J_j intersect. Now the probability that V_{r,r′} = 0 is the same as when the construction for S′ uses the original set Q of potential chords, because of the independence of Poisson processes on disjoint subsets; the event V_{r,r′} = 0 indicates that the two processes have no intersecting pairs of intervals when stopped at the times τ_r, τ_{r′}, and thus use disjoint sets of chords. Furthermore, we can show that the event V_{r,r′} = 0 is with high probability the same as the event Ṽ_{r,r′} = 0, where Ṽ_{r,r′} is the number of intersections of R(τ_r) and R′(τ_{r′}). Finally, if R(τ_r) and R′(τ_{r′}) have no intersections, then the "small worlds" distance between P and P′ is more than τ_r + τ_{r′} = 2n_0 + r + r′. Hence we have solved the problem if we can find a good approximation to the probability that Ṽ_{r,r′} = 0; this we do by showing that V_{r,r′} approximately has a mixed Poisson distribution, and by identifying the mixture distribution. We usually take r = r′ or r = r′ + 1, the latter to allow for the possibility of the number of steps in the shortest path being odd.
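The normalization (1 + 2ρ)^{−n} M(n) is easy to check by simulation; the following sketch (illustrative parameters, not the paper's computation) estimates EW_ρ by averaging independent paths of the pure birth chain with offspring law 1 + Po(2ρ).

```python
import math
import random

def poisson(lam, rng):
    """Po(lam) sample by inverse transform; adequate for small lam."""
    u, k = rng.random(), 0
    p = c = math.exp(-lam)
    while u > c:
        k += 1
        p *= lam / k
        c += p
    return k

def birth_chain(n_steps, rho, rng):
    """One path of M(n): each interval survives and spawns Po(2*rho)
    new intervals at every step, i.e. offspring law 1 + Po(2*rho)."""
    m = 1
    for _ in range(n_steps):
        m = sum(1 + poisson(2 * rho, rng) for _ in range(m))
    return m

rho, n = 0.25, 8
rng = random.Random(7)
norm = (1 + 2 * rho) ** n
w_samples = [birth_chain(n, rho, rng) / norm for _ in range(2000)]
mean_w = sum(w_samples) / len(w_samples)
print(mean_w)  # near 1: (1+2*rho)^(-n) * M(n) is a mean-one martingale
```

The sample mean is close to 1, as the martingale property requires, while the spread of the samples reflects the non-degenerate limit W_ρ.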
After this preparation, we are in a position to summarize our main results.
These are treated in more detail in the next section, in Theorem 3.9, Corollary 3.10 and Theorem 3.15. We let D denote the small worlds distance between a randomly chosen pair of points P and Q on C, as above. The following theorem approximates the distribution of D by that of another random variable D*, whose distribution is more accessible; in this theorem, ρ and the derived quantities φ_0, n_0, N_0 and x_0 all implicitly depend on L, as does the distribution of D*.
Theorem 2.1 Let ∆ denote a random variable on the integers with distribution given by P[∆ > x] = E exp{−2φ_0²(1 + 2ρ)^x W_ρ W′_ρ}, x ∈ Z, where W_ρ and W′_ρ are independent copies of the martingale limit above, and set D* := ∆ + 2n_0.
1. If ρ is large, let N_0 be such that (1 + 2ρ)^{N_0} ≤ Lρ < (1 + 2ρ)^{N_0+1}, and define x_0 and α as in Corollary 3.10 below; then ∆ concentrates almost all its mass on x_0, unless α is very close to 1.
2. If ρ → 0, the distribution of ρ∆ approaches that of the random variable T defined in [7], Corollary 3.10.
The errors in these distributional approximations are also quantified, for given choices of L and ρ(L).
This result shows that, for ρ small and x = lρ with l ∈ Z, P[ρ∆ > x] is approximated by P[T > x], expressed as an expectation involving independent NE(1) random variables W and W′. Indeed, it follows from Lemma 3.13 below that W_ρ →_D W as ρ → 0. One way of realizing a random variable T with the above distribution is to realize W and W′, and then to sample T from the conditional distribution (2.5), where G_1 := −log W and G_2 := −log W′ both have the Gumbel distribution.
With this construction, whatever the values of W and W′, and hence of G_1 and G_2, the conditional distribution of T is that of a shifted Gumbel variable, implying that T is distributed, up to an additive constant, as ½(G_1 + G_2 − G_3), where G_1, G_2 and G_3 are independent random variables with the Gumbel distribution. The cumulants of T can thus immediately be deduced from those of the Gumbel distribution, given in Gumbel [9]. Note that, in view of Corollary 3.2 below, the conditional construction (2.5) can be interpreted in terms of the processes S and S′, since W_ρ and W′_ρ are essentially determined by the early stages of the respective pure birth processes, and the extra randomness, conditional on the values of W_ρ and W′_ρ, comes from the random arrangement of the intervals on the circle C.
In the NMW heuristic, the random variable T_NMW is logistic, having distribution function e^{2x}(1 + e^{2x})^{−1}; note that this is just the distribution of ½(G_1 − G_3). Hence the heuristic effectively neglects some of the initial branching variation.
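For illustration, the two representations can be compared by simulation; the sketch below samples T as ½(G_1 + G_2 − G_3) and T_NMW as ½(G_1 − G_3), whose variances, π²/8 and π²/12 respectively, are unaffected by any additive constant in the representation.

```python
import math
import random

rng = random.Random(0)

def gumbel(rng):
    """Standard Gumbel variable via inverse transform."""
    return -math.log(-math.log(rng.random()))

N = 200_000
t_full = [0.5 * (gumbel(rng) + gumbel(rng) - gumbel(rng)) for _ in range(N)]
t_nmw = [0.5 * (gumbel(rng) - gumbel(rng)) for _ in range(N)]  # logistic

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(variance(t_full))  # ~ pi^2/8  ~ 1.23: three Gumbel summands
print(variance(t_nmw))   # ~ pi^2/12 ~ 0.82: the heuristic drops one
```

The heuristic's smaller variance is exactly the neglected initial branching variation.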

The continuous circle model: proofs
The first step in the argument outlined above is to establish a Poisson approximation theorem for the number of pairs of overlapping intervals, one in S_r and the other in S′_{r′}. The following result has been shown in [7].
Proposition 3.1 Let M intervals I_1, …, I_M with lengths t_1, …, t_M and N intervals J_1, …, J_N with lengths u_1, …, u_N be positioned uniformly and independently on C; then the number of intersecting pairs is approximately Poisson distributed, with an explicit bound on the error.
The proposition translates immediately into a useful statement about V_{r,r′}, when P′ is chosen uniformly at random, independently of all else.
Corollary 3.2 For the processes S and S′ of the previous section, the corresponding mixed Poisson approximation holds for V_{r,r′}.
Remark. If P′ is not chosen at random, but is a fixed point of C, the result of Corollary 3.2 remains essentially unchanged, provided that P and P′ are more than an arc distance of τ_r + τ_{r′} apart. The only difference is that then X_11 = 0 a.s., and that N t + M u is replaced by N t + M u − 2τ_r − 2τ_{r′}. If P and P′ are less than τ_r + τ_{r′} apart, then P[Ṽ_{r,r′} = 0] = 0. The next step is to show that P[Ṽ_{r,r′} = 0] is close to P[V_{r,r′} = 0]. We do this by directly comparing the random variables V_{r,r′} and Ṽ_{r,r′} in the joint construction. As for Corollary 3.5 in [7], the following assertion can easily be shown to hold.

Proposition 3.3 With notation as above, the difference |P[Ṽ_{r,r′} = 0] − P[V_{r,r′} = 0]| admits an explicit bound.
To apply Corollary 3.2 and Proposition 3.3, it remains to establish more detailed information about the distributions of M_r and s_r. In particular, we need to bound the first and second moments of M_r, and to approximate the quantity E(exp{−L^{−1}(N_{r′} s_r + M_r u_{r′})}). We begin with the following lemma.

Lemma 3.4
The random variable M(n) has probability generating function f^{(n)}(s), where f(s) := s e^{2ρ(s−1)} is the probability generating function of the offspring distribution, and f^{(n)} denotes the nth iterate of f. In particular, EM(n) = (1 + 2ρ)^n.
Proof: Since M(n) is a branching process with 1 + Po(2ρ) offspring distribution, the probability generating function is immediate, as are the moment calculations. The moments of M_r follow from the definition of τ_r.
[] These estimates can be directly applied in Corollary 3.2 and Proposition 3.3, and yield the following result.
Corollary 3.5 The bounds of Corollary 3.2 and Proposition 3.3 hold with the moment estimates of Lemma 3.4 substituted.
Consideration of the quantity E(exp{−L^{−1}(N_{r′} s_r + M_r u_{r′})}) now gives the immediate asymptotics of P[D > 2n_0 + r + r′], where D denotes the "small world" distance between P and P′.
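Lemma 3.4 lends itself to a quick numerical check (illustrative values): differentiating the iterated generating function at s = 1 recovers EM(n) = (1 + 2ρ)^n.

```python
import math

def f(s, rho):
    """pgf of the offspring law 1 + Po(2*rho)."""
    return s * math.exp(2 * rho * (s - 1))

def iterated_pgf(s, rho, n):
    """f^(n)(s), the pgf of M(n) by Lemma 3.4."""
    for _ in range(n):
        s = f(s, rho)
    return s

rho, n, h = 0.1, 6, 1e-6
# E M(n) is the derivative of the pgf at s = 1 (central difference estimate)
mean_est = (iterated_pgf(1 + h, rho, n) - iterated_pgf(1 - h, rho, n)) / (2 * h)
print(mean_est, (1 + 2 * rho) ** n)  # both approximately 1.2^6
```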
Corollary 3.6 As L → ∞, P[D > 2n_0 + r + r′] − E exp{−2φ_0²(1 + 2ρ)^{r+r′} W_ρ W′_ρ} → 0, where W_ρ and W′_ρ are independent copies of the limiting random variable associated with the pure birth chain M.
Proof: The conditions ensure that τ_r and τ_{r′} both tend to infinity as L → ∞, at least as fast as c log(Lρ), for some c > 0. The result then follows, uniformly for r, r′ in the given ranges.
[] Hence P[D > 2n_0 + r + r′] can be approximated in terms of the distribution of the limiting random variable W_ρ associated with the pure birth chain M.
However, in contrast to the model with time running continuously, this distribution is not always NE(1), but genuinely depends on ρ. Its properties are not so easy to derive, though moments can be calculated, and, in particular, it is also shown in Lemma 3.13 that L(W_ρ) is close to NE(1) for ρ small. We also need the following lemma, which is useful in bounding the behaviour of the upper tail of L(D).
Lemma 3.7 For all θ, ρ > 0, the Laplace transform E e^{−θW_ρ} admits an explicit upper bound.
Proof: The offspring generating function of the birth process M satisfies a simple inequality for all 0 ≤ s ≤ 1. Hence, with m = 1 + 2ρ, the iterates of f can be bounded above; the last equality follows from (8.11), p. 17 in [10], noting that the right-hand side is the Laplace transform of the NE(1) distribution. Applying (3.4) twice then yields the assertion.
The simple asymptotics of Corollary 3.6 can be sharpened. At first sight surprisingly, it turns out that it is not necessary for the times τ_r and τ_{r′} to tend to infinity, since, for values of ρ so large that n_0 is bounded, the relevant quantities can be controlled directly. Writing W̃_r := W(τ_r) and U_r := U(τ_r), Taylor's expansion, together with the fact that EW(n) = 1 for all n, yields the error terms η_1 and η_2 of (3.1) and (3.2). Using these results, we obtain the following theorem.
Theorem 3.8 If P′ is randomly chosen on C, then the approximation holds with errors η_1, η_2 given in (3.1) and (3.2), where, as before, D denotes the shortest distance between P and P′ on the shortcut graph.
[] Theorem 3.8 can be translated into a uniform distributional approximation, as follows.
Theorem 3.9 If ∆ denotes a random variable on the integers with distribution given by P[∆ > x] = E exp{−2φ_0²(1 + 2ρ)^x W_ρ W′_ρ}, x ∈ Z, (3.12) and D* = ∆ + 2n_0, then L(D) and L(D*) are close in Kolmogorov distance, with an explicit error bound.
Proof: It is easy to see that ∆, defined as above, is indeed a random variable.
The above bound tends to zero as L → ∞, provided that ρ does not grow too quickly with L. For larger ρ and for L large, it is easy to check that n_0 can be no larger than 4, so that interpoint distances are extremely short, few steps in each branching process are needed, and the closeness of L(D) and L(D*) could be justified by direct arguments. Even in the range covered by Theorem 3.9, it is clear that L(D) becomes concentrated on very few values, once ρ is large, since the factor 2φ_0²(1 + 2ρ)^x in the exponent in (3.12) is multiplied by the large factor (1 + 2ρ) if x is increased by 1. The following corollary, Corollary 3.10, makes this more precise; its proof follows immediately from Jensen's inequality, since EW_ρ = 1, and from Lemma 3.7. Thus the distribution is essentially concentrated on the single value x_0 if ρ is large and α is bounded away from 0 and 1. If, for instance, α is close to 1, then both x_0 and x_0 + 1 may carry appreciable probability.
If ρ → ρ_0 as L → ∞, then the distribution of ∆ becomes spread out over Z, converging to a non-trivial limit as L → ∞ along any subsequence such that φ_0(L, ρ) converges. Both this behaviour and that for larger ρ are quite different from the behaviour found in the continuous model of [7]. However, if ρ becomes smaller, the differences become less; we now show that, as ρ → 0, the distribution of ρ∆ approaches the limiting distribution of T obtained in [7].
The argument is based on showing that the distribution of W_ρ is close to NE(1). To do so, we employ the characterizing Poincaré equation for Galton–Watson branching processes (see Harris [10], Theorem 8.2, p. 15); if φ_ρ(θ) := E e^{−θW_ρ} is the Laplace transform of L(W_ρ), then φ_ρ(mθ) = f(φ_ρ(θ)), where f is the offspring generating function and m = 1 + 2ρ. We show that, when ρ ≈ 0, φ_ρ(θ) is close to φ_e(θ) = (1 + θ)^{−1}, the Laplace transform of the NE(1) distribution. Let G denote a suitable class of difference functions, and let H := {ψ : ψ = φ_e + g for some g ∈ G}.
Then H contains all Laplace transforms of probability distributions with mean 1 and finite variance. On H, define the operator Ψ by (Ψψ)(θ) := f(ψ(θ/m)), where f is the probability generating function of 1 + Po(2ρ), and m = 1 + 2ρ > 1. Thus the Laplace transform φ_ρ of interest to us is a fixed point of Ψ.
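The fixed point of Ψ can also be found numerically; the sketch below assumes the Poincaré form (Ψψ)(θ) = f(ψ(θ/m)) described above, starts the iteration from the NE(1) transform φ_e, and confirms both the fixed-point property and the closeness to φ_e for small ρ (illustrative parameters).

```python
import math

rho = 0.05
m = 1 + 2 * rho

def f(s):
    """pgf of the offspring law 1 + Po(2*rho)."""
    return s * math.exp(2 * rho * (s - 1))

def Psi(psi):
    """Poincare operator (Psi psi)(theta) = f(psi(theta / m))."""
    return lambda theta: f(psi(theta / m))

phi = lambda theta: 1.0 / (1.0 + theta)  # NE(1) transform as starting point
for _ in range(200):
    phi = Psi(phi)

for theta in (0.5, 1.0, 2.0):
    residual = abs(phi(theta) - f(phi(theta / m)))  # fixed-point residual
    print(theta, phi(theta), 1.0 / (1.0 + theta), residual)
```

The iterates converge geometrically to φ_ρ, and for ρ this small the fixed point differs only slightly from (1 + θ)^{−1}, in line with Lemma 3.13.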
[] Lemmas 3.11 and 3.12 together yield the following result.
Lemma 3.13 As ρ → 0, φ_ρ converges to φ_e; in particular, L(W_ρ) is close to NE(1) for ρ small.
Proof: With Lemmas 3.11 and 3.12, it follows that the distance between Ψφ_e and φ_e is small; note that indeed φ_ρ − φ_e ∈ G. Thus, since m > 1, the contraction property bounds the distance between φ_ρ and φ_e. As an immediate consequence, L(W_ρ) → NE(1) as ρ → 0. Theorem 3.14 reformulates this convergence as a pointwise comparison theorem directly relevant to the distribution functions of ∆ and T.
Theorem 3.15 As in Theorem 3.9, let ∆ be a random variable on Z with distribution given by (3.12), and let T denote a random variable on R with the limiting distribution obtained in [7]; then the distributions of ρ∆ and T are uniformly close.
Proof: We use an argument similar to that used for Theorem 3.9. For large z, we can use an upper tail bound, from which, for z > 0 and with c(ρ) := log(1 + 2ρ)/(2ρ), the upper tail of ρ∆ is controlled. Similarly, from Lemma 3.7, the tail of T is controlled. Complementing these upper tail bounds, from Theorem 3.14 and for z ∈ ρZ, we obtain the comparison (3.17). Using the facts that (1 + 2ρ)^{z/ρ} = e^{2zc(ρ)} and that c(ρ) → 1 as ρ → 0, it then follows from (3.18) and (3.15) that the two distribution functions agree closely in the bulk. For any larger values of z, the upper tail bounds give a maximum discrepancy of order O{ρ^{1/3}(1 + log(1/ρ))}, as required. Note that, in the main part of the distribution, for z of order 1, the discrepancy is actually of order ρ.
[] Numerically, instead of calculating the limiting distribution of W_ρ, we would use the approximation given in (3.11), in which the distributions of W(τ_r) and W′(τ_{r′}) can be calculated iteratively, using the generating function from Lemma 3.4. As D is centred around 2n_0 = 2⌊N/2⌋, and as r is of order at most log(Lρ)/log(1 + 2ρ), only order log(Lρ)/log(1 + 2ρ) iterations would be needed.
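The iterative calculation suggested here can be sketched as follows (illustrative code, not the paper's): the distribution of M(n) is recovered from the iterated generating function of Lemma 3.4 by evaluating it at roots of unity and inverting the discrete Fourier transform.

```python
import cmath

RHO = 0.1

def f(s):
    """pgf of the offspring law 1 + Po(2*RHO)."""
    return s * cmath.exp(2 * RHO * (s - 1))

def pmf_M(n, K=128):
    """pmf of M(n), truncated at K: evaluate the iterated pgf f^(n)
    at the K-th roots of unity, then invert the DFT."""
    vals = []
    for idx in range(K):
        s = cmath.exp(2j * cmath.pi * idx / K)
        for _ in range(n):
            s = f(s)
        vals.append(s)
    pmf = []
    for m in range(K):
        coef = sum(vals[idx] * cmath.exp(-2j * cmath.pi * idx * m / K)
                   for idx in range(K)) / K
        pmf.append(abs(coef))
    return pmf

p = pmf_M(4)
mean = sum(m * pm for m, pm in enumerate(p))
print(p[1], mean)  # P[M(4)=1] ~ e^{-0.8}; mean ~ 1.2^4
```

Here P[M(4) = 1] = e^{−0.8}, the chance that the single initial interval never branches in four steps, and the mean matches (1 + 2ρ)^4 from Lemma 3.4.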

The discrete circle model: description
Now suppose, as in the discrete circle model of Newman et al. [13], that the circle C becomes a ring lattice with Λ = Lk vertices, where each vertex is connected to all its neighbours within distance k by an undirected edge. In the notation of [13], a number of shortcuts are added between randomly chosen pairs of sites, with probability φ per connection in the lattice, of which there are Λk; thus, on average, there are Λkφ shortcuts in the graph. In contrast to the previous setting, it is natural in the discrete model to use graph distance, which implies that all edges, including shortcuts, have length 1. This turns out to make a significant difference to the results when shortcuts are very plentiful.
For ease of comparison with the previous model, which collapsed the k-neighbourhoods, we adopt a different notation. The model can be formulated as the union of a Bernoulli random graph G(Λ, σ/Λ) and the underlying ring lattice on Λ vertices. Here we write σ = ρ/k, so that the expected number of edges in G(Λ, σ/Λ) is close to the value Lρ/2 in the previous model; comparing the expected number of shortcuts with that given in [13] also relates our parameter σ to those of [13].
The model can also be realized by a dynamic construction. Choosing a point P_0 ∈ {1, …, Λ} at random, set R(0) = {P_0}. Then, at the first step (distance 1), the 'island' consisting of P_0 is increased by k points at each end, and, in addition, M ∼ Bi(Λ − 2k − 1, σ/Λ) shortcuts connect to centres of new islands. At each subsequent step, starting from the set R(n) of vertices within distance n of P_0, each island is increased by the addition of k points at either end, but with overlapping islands merged, to form a set R′(n + 1); this is then increased to R(n + 1) by choosing a Bernoulli-σ/Λ thinning of the edges joining R′(n + 1) to the rest of the lattice, the other end point of each chosen edge becoming the centre of a new island. The branching analogue of this process, which agrees with the current process until its first self-overlap occurs, has individuals, here representing the islands, of two types: newly formed type 1 islands, consisting of just one vertex, and existing type 2 islands. A type 1 island at time n becomes a type 2 island at time n + 1, and, in addition, has a Bi(Λ, σ/Λ)-distributed number of type 1 islands as 'offspring'. A type 2 island at time n stays a type 2 island at time n + 1, and has a Bi(2kΛ, σ/Λ)-distributed number of type 1 islands as offspring. Each new island starts at an independent and randomly chosen point of the circle, and at each subsequent step acquires k more vertices at either end. Writing M̄(n) := (M_1(n), M_2(n))^T for the numbers of islands of the two types at time n, where the superscript T denotes the transpose, their development over time is given by the branching recursion. The total number of intervals at time n is denoted by M_+(n) := M_1(n) + M_2(n), and the total number of vertices in these intervals by ŝ(n). As before, we use the branching process as the basic tool in our argument.
It is now a two type Galton–Watson process with mean matrix A = [σ, 1; 2kσ, 1], whose rows correspond to the parent's type and whose columns to the offspring's type. The characteristic equation of A yields the eigenvalues, the larger of which is λ = ½{(1 + σ) + √((1 − σ)² + 8kσ)}. From the equation f A = λf, we find the left eigenvectors f^{(i)}, i = 1, 2. We standardize the positive left eigenvector f^{(1)} of A, associated with the eigenvalue λ; for f^{(2)}, we choose a convenient normalization. Then, for i = 1, 2, we have E{f^{(i)} M̄(n + 1) | F(n)} = λ_i f^{(i)} M̄(n), where F(n) denotes the σ-algebra σ(M̄(0), …, M̄(n)). Thus, from (4.7), λ_i^{−n} f^{(i)} M̄(n) is a (non-zero mean) martingale, for i = 1, 2; we let W_{k,σ} be the almost sure limit of the martingale W^{(1)}(n).
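The growth rate λ can be verified numerically. In the sketch below, the mean matrix A is read off from the offspring description above (its displayed form is an assumption of this sketch); power iteration on the mean recursion recovers the dominant eigenvalue, which solves λ² − (1 + σ)λ + σ(1 − 2k) = 0.

```python
import math

k, sigma = 3, 0.4

# Mean offspring matrix: rows = parent type, columns = child type.
# Type 1: mean sigma new (type-1) islands, and it becomes one type-2 island.
# Type 2: mean 2*k*sigma new islands, and it persists as one type-2 island.
A = [[sigma, 1.0],
     [2 * k * sigma, 1.0]]

v = [1.0, 0.0]  # expected island counts, starting from one type-1 island
growth = 0.0
for _ in range(80):
    w = [v[0] * A[0][0] + v[1] * A[1][0],
         v[0] * A[0][1] + v[1] * A[1][1]]
    growth = (w[0] + w[1]) / (v[0] + v[1])
    v = w

lam = ((1 + sigma) + math.sqrt((1 - sigma) ** 2 + 8 * k * sigma)) / 2
print(growth, lam)  # power iteration reproduces the dominant eigenvalue
```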
Our main conclusions can be summarized as follows; the detailed results and their proofs are given in Theorems 5.6 and 5.9. Let ∆_d denote a random variable on the integers with distribution given by (4.11). Note that the expectation in (4.11) is taken under the initial condition (4.2); we shall later need also to consider the distribution of W_{k,σ} under other initial conditions.

The discrete circle model: proofs
We begin the detailed discussion with some moment formulae.
Lemma 5.1 The means and variances of the branching process admit explicit formulae; in particular, E M_+(n) is given in (5.1), and the second moments, for j ≤ n, in (5.5). Note that, from (4.5), we have the bound (5.7).
Proof: First, observe that EW^{(1)}(n) = 1 for all n, by the martingale property. The formulae then follow from (4.9) and (4.7).
Moreover, a further bound follows from (5.9). From this, using the analogues of the inequality above and of Corollary 3.2, we obtain the estimates (5.14) and (5.15). These can be made more explicit with the help of the bounds which follow from Lemma 5.1; together, they give the following result, in which D denotes the shortest distance between P_0 and a randomly chosen vertex P′ of C.

Lemma 5.2 With the above notation and definitions, the analogues of the bounds in Corollary 3.5 hold.
We now need to examine the exponential moment analogous to that in Corollary 3.5. To start with, from (5.1) in Lemma 5.1, we have an explicit formula for E ŝ(n), where we have used (4.5) and (4.6) to simplify, and this expression is rather close to (λ/σ)E M_+(n) as given in (5.1). This reflects the fact that ŝ(n) and (λ/σ) M_+(n) grow in essentially the same way.
The theorem can be translated into a uniform bound, similar to that of Theorem 3.9. To do so, we need to be able to control E{e^{−ψ W_{k,σ} W′_{k,σ}}} for large ψ. The following analogue of Lemma 3.7 makes this possible. To state it, we first need some notation.
For W_{k,σ} as in (4.10), let φ_{k,σ} := (φ_1, φ_2) denote the Laplace transforms φ_i(θ) := E{e^{−θW_{k,σ}} | M̄(0) = e^{(i)}} (5.23), where e^{(i)} is the i'th unit vector. Although we now need to distinguish other initial conditions for the branching process, unconditional expectations will always in what follows presuppose the initial condition M̄(0) = e^{(1)}, as before. Then, as in Harris [10], p. 45, φ_{k,σ} satisfies the Poincaré equation φ_i(λθ) = g_i(φ_1(θ), φ_2(θ)), i = 1, 2 (5.24), where g_i is the generating function of M̄(1) if M̄(0) = e^{(i)}: g_i(s_1, s_2) = Σ_{r_1,r_2} p_i(r_1, r_2) s_1^{r_1} s_2^{r_2}, where p_i(r_1, r_2) is the probability that an individual of type i has r_1 children of type 1 and r_2 children of type 2. Here, from the binomial structure, the g_i can be written down explicitly. The Laplace transforms φ_{k,σ} can be bounded as follows.
Lemma 5.5 For θ, σ > 0, the transforms φ_1(θ) and φ_2(θ) admit explicit lower bounds, and hence E{e^{−ψW_{k,σ}W′_{k,σ}}} can be bounded.
Proof: We proceed by induction, starting from a suitable pair of functions.
By the Poincaré recursion and the induction assumption, together with (4.5), the bounds propagate from n to n + 1. Taking limits as n → ∞ proves the first two assertions. The last assertion follows as in Lemma 3.7.
[] Theorem 5.6 Let ∆_d denote a random variable on the integers with distribution given by (4.11), and set D* = ∆_d + 2n_d; then L(D) and L(D*) are close, with an explicit error bound.
The error bounds contain contributions, stemming from (5.7) and the branching process, which die away only slowly when ρ is large. However, if ρ → ∞ and k = O(σ^{1−ε}) for any ε > 0, then lim inf γ(k, σ) > 0, and it becomes possible for L(D) and L(D*) to be asymptotically close in total variation. This can be deduced from the proof of the theorem by taking k ∼ L^α and σ ∼ L^{α+β}, for choices of α and β which ensure that σ² dominates ρ. Under such circumstances, the effect of two successive multiplications by σ in the branching process dominates that of a single multiplication by 2ρ at the second step, and approximately geometric growth at rate λ ∼ σ results. However, as in all situations in which ρ is a positive power of Λ, interpoint distances are asymptotically bounded, and take one or at most two values with very high probability; an analogue of Corollary 3.10 could for instance also be proved.
If ρ = kσ is small, we can again compare the distribution of W_{k,σ} with the NE(1) distribution of the limiting variable W in the Yule process (see [7]), using the fact that its Laplace transforms satisfy the Poincaré equation (5.24). Define the operator Ξ as the analogue of Ψ acting on pairs of functions, and let H be the corresponding class of pairs. Then H contains φ_{k,σ} = (φ_1, φ_2) as defined in (5.23), since taking limits in (5.3) shows that Var W_{k,σ} exists. We next show that Ξ is a contraction on H.
Lemma 5.7 The operator Ξ is a contraction on H: for all ψ, χ ∈ H, the distance between Ξψ and Ξχ is smaller, by an explicit factor, than that between ψ and χ. The proof treats the two components separately; taking the maximum of the bounds finishes the proof.
As a candidate ψ, we try the pair given in (5.28); Lemma 5.5 shows that this pair dominates φ_{k,σ}.
Lemma 5.8 For ψ given in (5.28), the required domination holds.
Proof: For θ > 0, the components can be expanded explicitly; the result then follows from Taylor's expansion and (4.5).
For the main part of the distribution, we argue as for Theorem 3.15. For the large values of z, where the bound given in (5.34) becomes useless, we can estimate the upper tails of the random variables separately. First, for x ∈ Z, an upper tail bound follows, by Lemma 5.5, from the behaviour of E{e^{−ψW_{k,σ}W′_{k,σ}}}. Combining the bounds (3.17), (3.19) and (3.20) for e^{2z} ≤ ρ^{−1/3} gives a supremum of order ρ^{1/3} for |P[ρ∆ > z] − P[T > z]|; note that z may actually be allowed to take any real value in this range, since T has bounded density.

Theorem 4.1 Let ∆_d be a random variable with the distribution given by (4.11) for any x ∈ Z, and set D* = ∆_d + 2n_d. Here, n_d and φ_d are such that λ^{n_d} = φ_d (Λσ)^{1/2} and λ^{−1} < φ_d ≤ 1. Let D denote the graph distance between a randomly chosen pair of vertices on the ring lattice C. If Λσ → ∞ and ρ = kσ remains bounded, then it follows that d_TV(L(D), L(D*)) → 0. If ρ → 0, then ρ∆_d →_D T, where T is as in Theorem 2.1.