Rumor source detection for rumor spreading on random increasing trees

In a recent paper, Shah and Zaman proposed the rumor center as an effective rumor source estimator for rumor spreading on random graphs. They proved for a very general random tree model that the detection probability remains positive as the number of nodes to which the rumor has spread tends to infinity. Moreover, they derived explicit asymptotic formulas for the detection probability of random d-regular trees and random geometric trees. In this paper, we derive asymptotic formulas for the detection probability of grown simple families of random increasing trees. These families of random trees contain important random tree models as special cases, e.g., binary search trees, recursive trees and plane-oriented recursive trees. Our results show that the detection probability varies from 0 to 1 across these families. Moreover, a brief discussion of the rumor center for unordered trees is given as well.


Introduction and Results
Rumor spreading on random trees has a long history in the biology, computer science and probability literature and has been investigated from many different angles. In a recent paper, Shah and Zaman [10,11] added a new angle by putting forth the rumor source detection problem, which asks for the correct identification of the rumor source when only information about the underlying model and the infected nodes is known. In [10,11], this problem was discussed for random d-regular trees and random geometric trees. Then, in [12], the authors generalized their approach to obtain results for very general families of random trees. Their studies, even though all of them very recent, have attracted a lot of attention and have led to many follow-up works (e.g., according to a Google Scholar search from August 14, 2014, the number of citations of the paper [11] had already reached 80).
From now on, we assume that some random tree model is fixed. After some time has elapsed, the rumor has spread to n nodes which form a tree Γ. The main idea in [10,11] was to assign a score to the nodes of Γ. The so-called rumor center is then the node which receives the highest score (where ties are either ignored or broken uniformly at random). In [10,11], the authors showed that the rumor source estimator obtained in this way is the maximum likelihood (ML) estimator if the underlying random tree model is that of random d-regular trees. However, for most other random tree models, the rumor source estimator is not the ML estimator. Nevertheless, it was shown in [12] that for very general families of random trees, the rumor source estimator is still effective in the sense that the detection probability tends to a positive value as the number of infected nodes n tends to infinity.
Precise asymptotic values for detection probabilities have so far only been found in the special cases of d-regular trees and geometric trees. It is the purpose of this work to derive detection probabilities for other classes of random trees, namely, all subclasses of simple families of random increasing trees whose random model arises from a (natural) tree evolution process. These subclasses contain d-regular trees and, e.g., the following important random tree models:
• Recursive trees: they have been proposed as a simple model for the spread of epidemics (a situation very similar to rumor spreading); see Moon [8]. We will show that they constitute the limiting case of d-regular trees as d tends to infinity.
• Plane-oriented recursive trees: they are one of the simplest models for real complex networks; see the important paper of Barabási and Albert [1].
We will give a precise mathematical definition of simple families of random increasing trees below and describe some of their properties; for more information, see Bergeron, Flajolet, and Salvy [2]. We now provide some more details in order to be able to state our results. We fix some notation. Recall that Γ denotes the tree of the nodes to which the rumor has spread. We will denote by V(Γ) the nodes of Γ with |Γ| = #V(Γ) and by E(Γ) the edges of Γ. If v ∈ V(Γ), then Γ_v will denote Γ rooted at v with an (arbitrary) embedding in the plane, where we will draw Γ in such a way that v is at the top (and the subtrees are below). If u ∈ V(Γ), then Γ_v^u will denote the subtree at the fringe of Γ_v rooted at u.
Rumor Center. In this paragraph, we will recall the definition of the rumor center from [10,11]. For v ∈ V(Γ), we define a score as follows:

R(v, Γ) = n! / ∏_{u ∈ V(Γ)} |Γ_v^u|.

This is the so-called shape functional; see for instance Fill [4]. In order to explain its meaning, we need to recall some further notation from graph theory. We call a rooted tree ordered if it comes with a fixed embedding into the plane (where in this paper, we always draw the root at the top); otherwise, the tree is called unordered. Moreover, a rooted increasing tree of n nodes is a tree whose nodes are labeled with the labels from the set {1, …, n} in such a way that every sequence of labels from the root to a leaf forms an increasing sequence. Now, we can explain the meaning of the shape functional: it gives the number of rooted ordered increasing trees which are isomorphic to Γ_v.
We next recall the definition of the rumor center from [10,11]: a rumor center of Γ is a node v ∈ V(Γ) which maximizes the score R(v, Γ).
Thus, a rumor center v of Γ is a node such that the number of rooted ordered increasing trees which are isomorphic to Γ_v is maximal. Every such increasing tree corresponds to a spreading order in which the rumor has spread from the source v. Consequently, if all spreading orders are equally likely (as is the case, e.g., for d-regular trees; see [10,11] and below), then the rumor center is the most likely rumor source, or in other words, the rumor center is the ML estimator for the rumor source.
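For concreteness, the score can be computed directly from subtree sizes, since R(v, Γ) = n!/∏_{u} |Γ_v^u| counts the increasing labelings of the shape Γ_v. The following sketch (helper names are ours; trees are given as adjacency lists) assumes this formula:

```python
from math import factorial

def subtree_sizes(adj, root):
    """Sizes of all subtrees of the tree adj (node -> neighbor list) rooted at root."""
    sizes = {}
    def dfs(u, parent):
        s = 1
        for w in adj[u]:
            if w != parent:
                s += dfs(w, u)
        sizes[u] = s
        return s
    dfs(root, None)
    return sizes

def rumor_centrality(adj, v):
    """Score R(v, Gamma) = n! / product over all u of |subtree of u in Gamma_v|."""
    sizes = subtree_sizes(adj, v)
    prod = 1
    for s in sizes.values():
        prod *= s
    return factorial(len(adj)) // prod   # always an integer: it counts trees
```

On the path 0–1–2, for instance, the middle node gets score 2 (the two leaves can be infected in either order) while each leaf gets score 1.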
It was shown in [10,11] that the rumor center has a surprisingly easy characterization. We will give two versions of this characterization. For the first, we need the following definition: a node v ∈ V(Γ) is called a local rumor center if R(v, Γ) ≥ R(u, Γ) for all neighbors u of v. Then, Shah and Zaman proved the following result in [10,11].
Theorem 1.3 (Shah and Zaman; 2010 – Version 1). Let Γ be a tree. Then, every local rumor center is a rumor center.
The second version of Shah and Zaman's result (which is in fact only a more precise version of the first one) characterizes a rumor center by graph-theoretical properties.
Theorem 1.4 (Shah and Zaman; 2010 – Version 2). Let Γ be a tree with n nodes. Then, v is a rumor center of Γ if and only if

|Γ_v^u| ≤ n/2 for all neighbors u of v.

Moreover, if all inequalities are strict, there is only one rumor center; otherwise, there are exactly two adjacent rumor centers.
The rumor source estimator is now defined as follows: if there is only one rumor center, then we choose this node; if there are two, we either ignore them or choose one of them uniformly at random.
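Theorem 1.4's criterion — every subtree hanging off a rumor center has at most n/2 nodes — turns the search for rumor centers into a purely combinatorial check. A small sketch, assuming that criterion (helper names are ours):

```python
def branch_sizes(adj, v):
    """Sizes of the subtrees hanging off v when the tree is rooted at v."""
    def count(u, parent):
        return 1 + sum(count(w, u) for w in adj[u] if w != parent)
    return [count(w, v) for w in adj[v]]

def rumor_centers(adj):
    """All nodes v whose hanging subtrees each contain at most n/2 nodes."""
    n = len(adj)
    return [v for v in adj if all(2 * s <= n for s in branch_sizes(adj, v))]
```

On the path 0–1–2–3 this returns the two adjacent middle nodes, matching the tie case of Theorem 1.4; on a star it returns only the hub.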
The appropriateness of the rumor source estimator as defined above depends on the random model. In the definitions above, we considered ordered trees. This, however, might not always be appropriate, for instance if the underlying tree model has a dynamic rather than a fixed structure (e.g., if a node can spread the rumor to an arbitrarily large number of neighbors; see the definition of recursive trees below). Then, considering unordered trees might be advantageous. For such trees, the above definition of R(v, Γ) has to be suitably modified. Unfortunately, the resulting characterization of nodes v which maximize the score becomes messier; see Section 5 of this paper for details.
(Grown) Simple Families of Increasing Trees. In this paragraph, we are going to explain the random tree models which will be used in this paper. First, consider the set of all rooted ordered increasing trees. A simple family of increasing trees consists of this set together with a sequence of weights (φ_i)_{i≥0} with φ_0 > 0 and φ_i > 0 for some i ≥ 2.
For every tree T, we define its weight as

w(T) = ∏_{v ∈ V(T)} φ_{d(v)},

where d(v) is the out-degree of v (= the number of edges of v which point away from the root). Moreover, set

τ_n = ∑_{|T| = n} w(T).

Then, a probability space on trees of size n is defined as follows: a tree T of size n has probability w(T)/τ_n. The resulting family of random trees is called a simple family of random increasing trees.
We give some prominent examples.
• d-ary trees: φ_i = binom(d, i) for 0 ≤ i ≤ d;
• Recursive trees: φ_i = 1/i! for all i ≥ 0;
• Generalized plane-oriented recursive trees: φ_i = binom(i + r − 2, i) for all i ≥ 0 (here, r > 1 is a real number).
These three families contain, e.g., random binary trees (d-ary trees with d = 2), which are equivalent to random binary search trees from computer science, and plane-oriented recursive trees (PORTs for short; these are generalized PORTs with r = 2); see the introduction and [2] for more explanation concerning the relevance of these two random tree models.
The above three families of random increasing trees are very special; see Panholzer and Prodinger [9]. More precisely, it was shown in [9] that, out of all families of random increasing trees, they are the only ones for which the random model can alternatively be obtained from a (natural) tree evolution process. Consequently, they have been nicknamed grown simple families of random increasing trees; see, e.g., Kuba and Panholzer [6].
We briefly describe the tree evolution process for the above three families.
• d-ary trees: the first node is the root and d empty leaves are attached; for the second node, one leaf is chosen uniformly at random and the node together with d empty leaves is placed there; for the third node, again one of the leaves is chosen uniformly at random, etc.
• Recursive trees: assume that a tree with n − 1 nodes has already been constructed; for the next node, choose one of the nodes uniformly at random and add the next node as its child. (Note that the tree here is unordered.)
• Generalized plane-oriented recursive trees: again assume that a tree with n − 1 nodes has already been constructed; for the next node, choose an existing node v with probability proportional to d(v) + r − 1 (d(v) is the out-degree of v) and add the next node as its child. (The tree is again unordered; however, for r = 2, this random model is equivalent to the uniform model on rooted ordered increasing trees.)

From these descriptions, it is obvious that the random model of d-ary trees is the uniform model on rooted ordered increasing d-ary trees and that the random model for PORTs (generalized PORTs with r = 2) is the uniform model on rooted ordered increasing trees (as already mentioned above). Moreover, the random model for recursive trees is the uniform model on rooted unordered increasing trees. Thus, the rumor source estimator described in the previous paragraph is an ML estimator only for the former two families of random increasing trees but not for the latter (and also not for generalized PORTs with r ≠ 2).
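The evolution processes above are straightforward to simulate. A sketch, assuming uniform attachment for recursive trees and attachment weight d(v) + r − 1 for generalized PORTs as described above (function names are ours):

```python
import random

def grow_recursive_tree(n, rng):
    """Recursive tree: node v attaches to a uniformly random node in 1..v-1."""
    parent = {1: None}
    for v in range(2, n + 1):
        parent[v] = rng.randrange(1, v)
    return parent

def grow_gen_port(n, r, rng):
    """Generalized PORT: node v attaches to an existing node u with
    probability proportional to d(u) + r - 1 (d(u) = current out-degree)."""
    parent, outdeg = {1: None}, {1: 0}
    for v in range(2, n + 1):
        nodes = list(outdeg)
        u = rng.choices(nodes, weights=[outdeg[w] + r - 1 for w in nodes])[0]
        parent[v] = u
        outdeg[u] += 1
        outdeg[v] = 0
    return parent
```

In both processes the parent of node v always has a smaller label, which is exactly the increasing-tree property.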
For later purposes, we need some more properties of the above three families of random increasing trees. Therefore, set

τ(z) = ∑_{n ≥ 1} τ_n z^n/n!   and   φ(z) = ∑_{i ≥ 0} φ_i z^i.

Then, it is straightforward to show that τ′(z) = φ(τ(z)) with τ(0) = 0.
Solving this differential equation for the above families gives the following:
• d-ary trees: τ(z) = (1 − (d − 1)z)^{−1/(d−1)} − 1;
• Recursive trees: τ(z) = log(1/(1 − z));
• Generalized PORTs: τ(z) = 1 − (1 − rz)^{1/r}.
From this, τ_n is easy to derive by standard Taylor series expansion.
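Assuming the standard closed form for d-ary trees, 1 + τ(z) = (1 − (d − 1)z)^{−1/(d−1)}, the coefficients τ_n can be extracted exactly: d = 2 must give τ_n = n! (binary increasing trees) and d = 3 the double factorials (2n − 1)!!. A sketch using exact rational arithmetic:

```python
from fractions import Fraction
from math import factorial

def dary_tau(nmax, d):
    """tau_n = n! [z^n] ((1 - (d-1)z)^(-1/(d-1)) - 1), computed exactly via
    the generalized binomial series; c tracks [u^n](1-u)^a."""
    a = Fraction(-1, d - 1)
    c = Fraction(1)
    taus = {}
    for n in range(1, nmax + 1):
        c = c * (n - 1 - a) / n            # recurrence for [u^n](1-u)^a
        taus[n] = factorial(n) * c * (d - 1) ** n
    return taus
```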

Results.
In this paragraph, we explain our results. Consider a random increasing tree with n nodes (as random model, we choose one of the three random models from the previous paragraph). We denote by C_n the event that the node obtained from the rumor source estimator is indeed the rumor source, where we use here the strategy that ties are ignored (since ties anyway occur only with asymptotic probability zero; see below). Then, we have the following result for grown simple families of random increasing trees.

Theorem 1.5.
(a) (d-ary trees) P(C_n) converges to a constant k_{d-ary} with k_{2-ary} = 0. Thus, for d ≥ 3, 0 < k_{d-ary} < 1 − ln 2.
(b) (Recursive trees) P(C_n) converges to 1 − ln 2.
(c) (Generalized PORTs) P(C_n) converges to a constant k_r, with k_r decreasing in r. Thus, for r > 1, 1 − ln 2 < k_r < 1.

For part (a), we will in fact consider a more general tree evolution process (cf. Remark 1.6): the first node is the root and d_1 empty leaves are attached; for the next node, one leaf is chosen uniformly at random and the next node together with d_2 empty leaves is placed there; for the third node, again one leaf is chosen uniformly at random, etc. Such random tree models where the root is treated differently have appeared before in the literature; see for instance [7].
For this more general random tree model, we have the following result.
Theorem 1.7. As n → ∞, P(C_n) converges to an explicit constant which can be expressed in terms of I_x(a, b), the regularized incomplete beta function.
Note that for d_1 = d_2 = d, we obtain the above result for d-ary trees. Moreover, this result also contains one of the main results from [11], namely, the case d_1 = d and d_2 = d − 1, which corresponds to d-regular trees.
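Since the limits above are expressed through I_x(a, b), a small numerical helper is handy for evaluating them; the sketch below (our own implementation, not from the paper) computes the regularized incomplete beta function by simple quadrature:

```python
from math import gamma

def reg_inc_beta(x, a, b, steps=20000):
    """I_x(a, b) = B(x; a, b) / B(a, b); the incomplete beta integral is
    computed with the midpoint rule (accurate enough here for a, b >= 1)."""
    h = x / steps
    s = sum(((i + 0.5) * h) ** (a - 1) * (1.0 - (i + 0.5) * h) ** (b - 1)
            for i in range(steps))
    return s * h * gamma(a + b) / (gamma(a) * gamma(b))
```

For instance, symmetry gives I_{1/2}(a, a) = 1/2 for any a > 0, a convenient sanity check.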
Remark 1.9. As observed in [12], Stirling's formula shows that the detection probability of d-regular trees converges, as d tends to infinity, to that of recursive trees. Hence, recursive trees are also the limiting case of d-regular trees as d tends to infinity (this is of course not surprising).
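For recursive trees, the detection probability can also be probed by direct simulation: the rumor tree grown from the source is exactly a random recursive tree, and the estimator returns the rumor center with ties ignored. A rough Monte Carlo sketch (our own function; we only assert that the estimate is bounded away from 0 and 1, consistent with a positive limit strictly below 1):

```python
import random
from collections import defaultdict

def detection_rate(n, trials, seed=0):
    """Monte Carlo estimate of P(C_n) for recursive trees: grow the rumor
    tree from source 1, then check whether the unique rumor center
    (all hanging subtrees of size <= n/2, ties ignored) is the source."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        adj = defaultdict(list)
        for v in range(2, n + 1):
            u = rng.randrange(1, v)       # uniform attachment
            adj[u].append(v)
            adj[v].append(u)
        def size(u, parent):
            return 1 + sum(size(w, u) for w in adj[u] if w != parent)
        centers = [v for v in range(1, n + 1)
                   if all(2 * size(w, v) <= n for w in adj[v])]
        if centers == [1]:                # unique center equals true source
            hits += 1
    return hits / trials
```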
Page 5/12 ecp.ejpecp.org

We conclude the introduction with a brief sketch of the paper. In the next section, we prove Theorem 1.7. In contrast to [12], this will be done by using tools from Analytic Combinatorics (in [12], the authors used Pólya urn models and tools from the theory of stochastic processes). As a consequence, we will obtain part (a) of Theorem 1.5 and Theorem 1.8. In Section 3, we will prove part (b) of Theorem 1.5. In Section 4, we will prove part (c) of Theorem 1.5. Finally, in Section 5, we will give a brief discussion of the rumor center for rooted unordered trees.

Generalized d-ary Trees
In this section, we will prove Theorem 1.7. We start by fixing some notation. First, recall the definition of the trees from Theorem 1.7 (see the paragraph preceding the theorem). The number of these trees with n nodes will be denoted by τ̃_n. Moreover, we will denote by τ_n the number of d_2-ary trees with n nodes. Then, observe that

τ̃_n = ∑_{j_1 + ⋯ + j_{d_1} = n−1} (n−1)!/(j_1! ⋯ j_{d_1}!) τ_{j_1} ⋯ τ_{j_{d_1}},

where j_1, …, j_{d_1} ≥ 0 are the sizes of the d_1 subtrees of the root and τ_0 := 1. Consequently,

τ̃′(z) = (1 + τ(z))^{d_1},

where τ(z) is as in the introduction and satisfies

1 + τ(z) = (1 − (d_2 − 1)z)^{−1/(d_2−1)},   (2.1)

so that

τ̃′(z) = (1 − (d_2 − 1)z)^{−d_1/(d_2−1)}.   (2.2)

Now, we turn to P(C_n). By Theorem 1.4, we have

P(C_n) = 1 − P(one subtree of the root has size ≥ n/2).   (2.3)

Denote by I the size of the leftmost subtree. Then,

P(I = j) = binom(n−1, j) τ_j (n−1−j)! [z^{n−1−j}] (1 + τ(z))^{d_1−1} / τ̃_n.   (2.4)

In the sequel, we need the following standard lemma from analytic combinatorics.
Lemma 2.1. Let α ∈ ℝ \ {0, −1, −2, …}. Then, as n → ∞,

[z^n] (1 − z)^{−α} = n^{α−1}/Γ(α) · (1 + ∑_{k=1}^{m} e_k(α)/n^k + O(n^{−m−1})),

where e_k(α) is a polynomial of degree 2k.
Applying this result to (2.2) gives

τ̃_n = (n−1)! (d_2 − 1)^{n−1} n^{d_1/(d_2−1) − 1}/Γ(d_1/(d_2−1)) · (1 + O(1/n)).
Similarly, applying the result to (2.1) yields

τ_n = n! (d_2 − 1)^n n^{1/(d_2−1) − 1}/Γ(1/(d_2−1)) · (1 + O(1/n)).
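The exact counts behind these expansions can be cross-checked numerically: for d_1 = d_2 = 2 the convolution over root-subtree sizes must recover τ̃_n = n! (binary increasing trees). A sketch, assuming that the labels {2, …, n} are distributed among the (possibly empty) root subtrees (function names are ours):

```python
from math import comb, factorial

def bin_conv(a, b, nmax):
    """Binomial (EGF-style) convolution: c_m = sum_k C(m, k) a_k b_{m-k}."""
    return [sum(comb(m, k) * a[k] * b[m - k] for k in range(m + 1))
            for m in range(nmax + 1)]

def dary_counts(nmax, d):
    """tau_n for d-ary increasing trees: distribute labels 2..n over the
    d (possibly empty) root subtrees (tau_0 := 1)."""
    t = [1] + [0] * nmax
    for n in range(1, nmax + 1):
        forest = [1] + [0] * (n - 1)     # identity for bin_conv
        for _ in range(d):
            forest = bin_conv(forest, t[:n], n - 1)
        t[n] = forest[n - 1]
    return t

def generalized_count(n, d1, d2):
    """Trees whose root has d1 subtrees while all other nodes have d2."""
    t = dary_counts(n - 1, d2)
    forest = [1] + [0] * (n - 1)
    for _ in range(d1):
        forest = bin_conv(forest, t, n - 1)
    return forest[n - 1]
```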
In view of (2.3), we need to compute ∑_{n/2 ≤ j ≤ n−1} P(I = j), where P(I = j) is given by (2.4). To accomplish this task, we again use Lemma 2.1 and the expansions for τ_n and τ̃_n from above. The resulting sum is then evaluated asymptotically with the help of the following lemma.
Lemma 2.2. For α > 0, the above sum converges, after suitable normalization, to an integral that can be expressed in terms of the beta function B(a, b).

Generalized Plane-oriented Recursive Trees

Then, as in the last section,

P(C_n) = 1 − P(one subtree of the root has size ≥ n/2).

Since now the subtrees are ordered, we obtain an analogous expression for P(one subtree of the root has size = j), where j ≥ n/2.
This proves the claimed limit result.The claimed properties of monotonicity and limit behavior of k r follow by simple calculus.
Remark 4.1. Alternatively to the above asymptotic derivation, one can also derive an exact expression (similarly as in the last section). To give more details, note that from (4.1), one obtains that

τ_n = n! (−1)^{n+1} r^n binom(1/r, n).
Consequently, plugging the resulting exact expression into the above formula gives the following result.
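As a quick consistency check of the formula in Remark 4.1 (our own computation, not from the paper): for r = 2 (PORTs) it must reproduce the classical counts 1, 1, 3, 15, 105, … , i.e., τ_n = (2n − 3)!! for n ≥ 2. An exact evaluation:

```python
from fractions import Fraction
from math import factorial

def port_tau(n, r):
    """tau_n = n! (-1)^(n+1) r^n binom(1/r, n), evaluated exactly
    (r is taken as an integer here so that Fraction arithmetic applies)."""
    binom = Fraction(1)
    for k in range(n):
        binom *= Fraction(1, r) - k
    binom /= factorial(n)
    return factorial(n) * (-1) ** (n + 1) * Fraction(r) ** n * binom
```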

Remark 1.6. Due to part (b), recursive trees can be seen as the limiting case of d-ary trees as d tends to infinity. Moreover, note that the detection probability increases from 0 (for d-ary trees with d = 2) all the way to 1 as one goes from d-ary trees to recursive trees to generalized PORTs. Part (a) of Theorem 1.5 will follow from a result on a more general family of random trees: the root has d_1 (possibly empty) subtrees and all other nodes have d_2 (possibly empty) subtrees. The random model of this family of trees is given by the tree evolution process described in the paragraph preceding Theorem 1.7.

where B(x; a, b) denotes the incomplete beta function. Plugging this into (2.5) and (2.5) in turn into (2.3) yields Theorem 1.7.

Proof of Theorem 1.5, part (a). Setting d_1 = d_2 = d and evaluating the expression obtained in Theorem 1.7 yields the claimed result for k_{d-ary}. Moreover, the claims concerning monotonicity and limit behavior of k_{d-ary} follow by simple calculus. Next, we consider the case of d-regular trees, where we set d_1 = d and d_2 = d − 1.

Figure 2: Every node of the tree on the left is a rumor center; the nodes v_1 and v_2 of the tree on the right are (non-adjacent) rumor centers.