Supercriticality of an annealed approximation of Boolean networks

We consider a model recently proposed by Chatterjee and Durrett as an "annealed approximation" of Boolean networks, which are a class of cellular automata on a random graph, as defined by S. Kauffman. The starting point is a random directed graph on $n$ vertices; each vertex has $r$ input vertices pointing to it. For the model of Chatterjee and Durrett, a discrete time threshold contact process is then considered on this graph: at each instant, each vertex has probability $q$ of choosing to receive input; if it does, and if at least one of its input vertices was in state 1 at the previous instant, then it is labelled with a 1; in all other cases, it is labelled with a 0. $r$ and $q$ are kept fixed and $n$ is taken to infinity. Improving a result of Chatterjee and Durrett, we show that if $qr > 1$, then the time of persistence of activity of the dynamics is exponential in $n$.


Introduction
Random Boolean networks were introduced by Stuart Kauffman in 1969 [5] as models of gene regulatory networks. A gene regulatory network is a set of genes in a cell that iteratively communicate with each other, using their RNA transcripts as messages, and this communication affects each gene's activity. They are thus information networks and control systems for the activity of the cell.
Let us define Kauffman's model. The following definition depends on three parameters: $n, r \in \mathbb{N}$ with $r < n$ and $p \in (0, 1)$ (though Kauffman only considered the case $p = 1/2$). The letters $a, b$ will denote the two possible states of a gene. Let $V_n = \{x_1, \ldots, x_n\}$ be the set of genes. For each $x \in V_n$, we independently choose:
• a set $y(x) = \{y_1(x), \ldots, y_r(x)\} \subset V_n - \{x\}$. The choice is made uniformly among all possibilities. $y(x)$ is called the influence set of $x$. We define the set of directed edges $E_n$ by $E_n = \{(y_i(x), x) : x \in V_n,\ 1 \le i \le r\}$.
• a function $f_x : \{a, b\}^r \to \{a, b\}$. The values $\{f_x(\omega) : \omega \in \{a, b\}^r\}$ are chosen independently, each equal to $a$ with probability $p$ and to $b$ with probability $1 - p$.
Having made all these random choices, we define $\Phi : \{a,b\}^{V_n} \to \{a,b\}^{V_n}$ by $[\Phi(\eta)](x) = f_x\big(\eta(y_1(x)), \ldots, \eta(y_r(x))\big)$ and, given an initial configuration $\eta_0 \in \{a,b\}^{V_n}$, we define a deterministic, discrete time dynamics $(\eta_t)_{t = 0, 1, \ldots}$ by putting $\eta_{t+1} = \Phi(\eta_t)$, $t \ge 0$. The dynamics is explained in words as follows: at each instant, and for each vertex $x$, we inspect the previous states in the influence set of $x$ and from these determine the state of $x$ using the function $f_x$.
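To make the definition concrete, here is a minimal Python sketch of Kauffman's dynamics; the function names and the encoding of the states $a, b$ as 0, 1 are ours, not part of the model's formal description.

```python
import itertools
import random

def make_network(n, r, p, rng):
    """Sample Kauffman's random structure: each vertex x gets an influence
    set y(x) of r distinct vertices (excluding x itself), and a random
    truth table f_x whose entries equal state a (encoded 0) with
    probability p and state b (encoded 1) with probability 1 - p."""
    inputs = {x: rng.sample([y for y in range(n) if y != x], r)
              for x in range(n)}
    tables = {x: {w: (0 if rng.random() < p else 1)
                  for w in itertools.product((0, 1), repeat=r)}
              for x in range(n)}
    return inputs, tables

def phi(eta, inputs, tables):
    """One synchronous step: the new state of x is f_x applied to the
    previous states of the influence set of x."""
    return tuple(tables[x][tuple(eta[y] for y in inputs[x])]
                 for x in range(len(eta)))
```

Since the state space is finite and the dynamics is deterministic, iterating `phi` from any initial configuration must eventually enter a periodic orbit, as discussed below.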
A set $\Gamma \subset \{a,b\}^{V_n}$ such that $\Phi(\Gamma) = \Gamma$ and $\Phi(\Gamma') \ne \Gamma'$ for any proper subset $\Gamma'$ of $\Gamma$ is called a periodic orbit of $\Phi$. Since the state space is finite, every initial configuration $\eta_0$ is in the domain of attraction of a periodic orbit $\Gamma$ (meaning that, for some $t_0$, $\{\eta_t : t \ge t_0\} = \Gamma$). Typical aspects of interest in random Boolean networks are the number of these orbits, their stability, their periods and the time to reach them. As thoroughly explained in [6], simulations of the model suggested the existence of two regimes, depending on the choice of parameters, in which drastically different behaviours arise. Among other important differences, in the ordered (or subcritical) regime, the lengths of the orbits grow slowly with $n$, whereas in the chaotic (or supercritical) regime, they grow rapidly with $n$.
In [2], Derrida and Pomeau proposed an "annealed approximation" of random Boolean networks; in it, the random aspects of the network (namely, the underlying graph and the rules of evolution) are updated at each time step instead of remaining fixed. The process thus obtained is a Markov chain. The simplification destroys important correlations in the system, but allowed the authors to identify (through a not fully rigorous analysis of the transition kernel) a phase transition given by the curve $2rp(1-p) = 1$, which agrees with simulations (the ordered regime corresponding to $2rp(1-p) < 1$).
In [1], Chatterjee and Durrett proposed a model which is an approximation to the activity of Boolean networks. The activity process associated to $(\eta_t)$ is the process $(\bar\eta_t)_{t = 1, 2, \ldots}$ with state space $\{0,1\}^{V_n}$ given by $$\bar\eta_t(x) = I\big(\eta_t(x) \ne \eta_{t-1}(x)\big),$$ where $I$ is the indicator function. The idea in considering $(\bar\eta_t)$ rather than $(\eta_t)$ is the possibility of identifying the phase transition in a process that is in some respects easier to study than the original process. Indeed, $(\xi_t)$, the proposed approximation to $(\bar\eta_t)$ to be defined below, has the more tractable dynamics of a threshold contact process on a random graph (in particular, the graph is sampled only once, and not re-sampled as the dynamics advances). For $(\xi_t)$, Chatterjee and Durrett proved the phase transition and identified the same critical curve as the one mentioned above, $2rp(1-p) = 1$. Their work allows for insight into this phase transition by an analogy between the flow of information in random Boolean networks and the evolution of branching processes.
Let us now define the model of [1]. We start with parameters $n, r \in \mathbb{N}$ with $r < n$ and $q \in (0, 1)$. Define the oriented random graph $G_n = (V_n, E_n)$ exactly as before. We will now define a discrete time Markov chain $(\xi_t)_{t \ge 0}$ with state space $\{0,1\}^{V_n}$ and initial configuration $\xi_0 \equiv 1$. Its transition kernel is determined by the rule that, given $\xi_t = \xi$, the coordinates $\xi_{t+1}(x)$, $x \in V_n$, are independent and each equals 1 with probability $q \cdot I\big(\xi(y_i(x)) = 1 \text{ for some } 1 \le i \le r\big)$. It will be useful to construct this Markov chain with a set of auxiliary Bernoulli random variables. Let $\{B^x_t : x \in V_n,\ t \ge 1\}$ be a family of independent Bernoulli random variables with parameter $q$; given $\xi_t \in \{0,1\}^{V_n}$, we put $$\xi_{t+1}(x) = B^x_{t+1} \cdot I\big(\xi_t(y_i(x)) = 1 \text{ for some } 1 \le i \le r\big).$$ When $B^x_t = 1$, we say that $x$ receives input at time $t$; therefore, a vertex is set to 1 if and only if it receives input at that time and at least one of its input vertices $y_1(x), \ldots, y_r(x)$ was set to 1 at the previous time. We sometimes abuse notation and identify $\xi \in \{0,1\}^{V_n}$ with $\{x \in V_n : \xi(x) = 1\}$.
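This construction is straightforward to simulate. The following sketch (function names and the cutoff `t_max` are ours) runs the chain from $\xi_0 \equiv 1$ until it hits the absorbing all-zero configuration:

```python
import random

def sample_graph(n, r, rng):
    """Each vertex x chooses r distinct input vertices y_1(x), ..., y_r(x),
    uniformly among subsets of V_n - {x}."""
    return {x: rng.sample([y for y in range(n) if y != x], r)
            for x in range(n)}

def step(xi, inputs, q, rng):
    """xi_{t+1}(x) = B^x_{t+1} * I(xi_t(y_i(x)) = 1 for some i), where the
    B's are independent Bernoulli(q) variables."""
    return [1 if rng.random() < q and any(xi[y] for y in inputs[x]) else 0
            for x in range(len(xi))]

def absorption_time(n, r, q, rng, t_max=1000):
    """Time to reach the all-zero configuration, starting from all ones;
    returns t_max if the process is still active at the cutoff."""
    inputs = sample_graph(n, r, rng)
    xi = [1] * n
    for t in range(1, t_max + 1):
        xi = step(xi, inputs, q, rng)
        if not any(xi):
            return t
    return t_max
```

In experiments with such a sketch, for $qr < 1$ the activity typically dies on a $\log n$ time scale, while for $qr > 1$ it persists far beyond any simulable horizon, in line with the results discussed below.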
In the comparison with Boolean networks, $q$ plays the role of $2p(1-p)$, which is the probability that two independent random variables with distribution $p \cdot \delta_{\{a\}} + (1-p) \cdot \delta_{\{b\}}$ are different. See [1] for a more detailed explanation of the relationship between $(\xi_t)$ and $(\bar\eta_t)$.
It is readily seen that the identically zero configuration is absorbing for the chain $(\xi_t)$ and that it is eventually reached with probability 1. In [1], the authors study the time $\tau_n$ it takes for this to occur and the typical proportion of sites that are in state 1 at times before $\tau_n$. By a simple comparison between the time dual of the model (as defined below) and a subcritical branching process, it is easy to show that, if $qr < 1$, then $\tau_n$ behaves as $\log n$; this is associated to the ordered regime of random Boolean networks. In [1], the following result is shown, characterizing the chaotic regime. Let $\rho = \rho(q, r)$ denote the probability of survival for a branching process in which individuals have probability $q$ of having $r$ children and probability $1 - q$ of having none. Let $|A|$ denote the cardinality of the set $A$. Finally, let $P_n$ denote a probability measure both for the choice of $G_n$ and for the family $\{B^x_t\}$ (they are of course taken independently).

Theorem. [1] If $q(r - 1) > 1$, then for every $\varepsilon > 0$ there exists $c > 0$ such that, as $n \to \infty$, $$P_n\left( \inf_{t \le e^{cn}} \frac{|\xi_t|}{n} \ge \rho - \varepsilon \right) \to 1.$$ Under the more general hypothesis $qr > 1$, only a weaker result was obtained: the function $e^{cn}$ in the above infimum had to be replaced by a function of the form $e^{cn^b}$, for $b, c > 0$.
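The survival probability $\rho(q, r)$ admits a simple numerical computation: for a branching process, the extinction probability $s = 1 - \rho$ is the smallest fixed point in $[0, 1]$ of the offspring generating function, here $f(s) = (1 - q) + q s^r$, and iterating $f$ from $0$ converges to that fixed point. A sketch (function name ours):

```python
def survival_probability(q, r, iterations=500):
    """rho(q, r) for the branching process in which each individual has
    r children with probability q and none with probability 1 - q.
    The extinction probability s solves s = (1 - q) + q * s**r; iterating
    the offspring generating function from s = 0 converges monotonically
    to its smallest root in [0, 1], and rho = 1 - s."""
    s = 0.0
    for _ in range(iterations):
        s = (1.0 - q) + q * s ** r
    return 1.0 - s
```

For example, with $q = 3/4$ and $r = 2$ the fixed-point equation $0.75 s^2 - s + 0.25 = 0$ has smallest root $s = 1/3$, so $\rho = 2/3$; and $\rho = 0$ whenever $qr \le 1$.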
The proof of this weaker result was established through a different method than that of the proof of the above theorem. In this paper we give a unified proof that establishes the stronger result.

Theorem 1.1. If $qr > 1$, then there exists $c > 0$ such that, for any $\varepsilon > 0$ and any sequence $(t_n)$ with $t_n \to \infty$ and $t_n \le e^{cn}$, $$P_n\left( \left| \frac{|\xi_{t_n}|}{n} - \rho \right| \le \varepsilon \right) \xrightarrow[n \to \infty]{} 1. \qquad (1.2)$$

To explain why this is to be expected and, in particular, the link with the mentioned branching process, we introduce the time dual of the process. Fix a realization of $G_n$. Define $\hat E_n$ as the set of directed edges obtained by inverting the edges of $E_n$, and let $\hat G_n = (V_n, \hat E_n)$. Note that $$\hat E_n = \{(x, y_i(x)) : x \in V_n,\ 1 \le i \le r\},$$ that is, in $\hat G_n$ each vertex "points to" $r$ vertices. Fix $T > 0$ and put $\hat B^{x,T}_t = B^x_{T-t}$. Given $A \subset V_n$, define $(\hat\xi^{A,T}_t)_{0 \le t \le T}$ by $\hat\xi^{A,T}_0 = A$ and $$\hat\xi^{A,T}_{t+1} = \bigcup_{x \in \hat\xi^{A,T}_t :\ \hat B^{x,T}_t = 1} \{y_1(x), \ldots, y_r(x)\}. \qquad (1.1)$$ When $\hat\xi^{A,T}_t(x) = 1$ and $\hat B^{x,T}_t = 1$, we say that $x$ gives birth at time $t$. Let us describe the dual dynamics in words. Given the configuration $\hat\xi_t$, we go over every vertex that is in state 1 and determine which of them give birth at time $t$; for each vertex, this happens with probability $q$ and independently. For each vertex $x$ that gives birth at time $t$, we set the vertices $y_1(x), \ldots, y_r(x)$ to 1 at time $t + 1$. Vertices that are not set to 1 by this procedure are then set to 0. We then have the duality equation $$\{\xi_T \cap A \ne \emptyset\} = \{\hat\xi^{A,T}_T \ne \emptyset\}$$ (recall that we take $\xi_0 \equiv 1$). The above equality holds since both events are equal to the event that there exist $z_0 \in A$ and $z_1, \ldots, z_T \in V_n$ such that, for each $1 \le s \le T$, $z_s \in \{y_1(z_{s-1}), \ldots, y_r(z_{s-1})\}$ and $B^{z_{s-1}}_{T-s+1} = 1$. By taking $A = \{x\}$ for each $x \in V_n$ in the duality equation, we see that, under $P_n$, $|\xi_T|$ and $|\{x : \hat\xi^{\{x\},T}_T \ne \emptyset\}|$ have the same distribution. Since we will mostly work with the dual process, we drop the superscript $T$ and assume that $\hat\xi^A_t$ is defined for all $t \ge 0$ with the evolution rule explained above. We also write $\hat\xi^x_t$ instead of $\hat\xi^{\{x\}}_t$. The convergence in Theorem 1.1 can be re-stated as $$P_n\left( \left| \frac{|\{x : \hat\xi^x_{t_n} \ne \emptyset\}|}{n} - \rho \right| \le \varepsilon \right) \xrightarrow[n \to \infty]{} 1.$$ Now, assume that $n$ is very large with respect to $r$. If $g$ is another integer that is much larger than $r$ and much smaller than $n$, then with high probability, the subgraph of $\hat G_n$ with vertex set $$\{z \in V_n : \text{there exist } k \le g \text{ and } z_1, \ldots, z_k \in V_n \text{ with } z_1 = x,\ z_k = z \text{ and } (z_i, z_{i+1}) \in \hat E_n \text{ for } 1 \le i < k\}$$ and edge set equal to the set of edges of $\hat E_n$ that start and end at vertices in the above set will simply be a directed tree of degree $r$ rooted at $x$. Conditioned on the event that this subgraph is indeed a tree, the evolution of $|\hat\xi^x_t|$ up to time $g$ will be exactly that of the branching process mentioned before Theorem 1.1. In addition, it is not difficult to see that, without any conditioning, $|\hat\xi^x_t|$ is stochastically dominated by such a process.
These remarks clarify why the model exhibits two phases in exact correspondence with those of the branching process. If the expected offspring size $qr$ is less than 1, then $\hat\xi^x_t$ dies out at least as fast as the corresponding subcritical branching process, and the primal process $(\xi_t)$ rapidly reaches the zero state. On the other hand, if $qr > 1$, the above theorem states that the system survives for a time that is exponentially large in $n$, characterizing the supercritical regime.
The structure of our proof is similar to that of [1]. First, using the comparison with the branching process and a second moment argument, we show that with probability tending to 1 as $n \to \infty$, the set of vertices $S = \{x : |\hat\xi^x_{s_n}| > k_n\}$, where $k_n = (\log n)^2$ and $s_n = (\log\log n)^2$, has size close to $\rho \cdot n$ (see Proposition 2.1). Second, in Proposition 2.2, we show that with probability tending to 1 as $n \to \infty$, the graph $\hat G_n$ is "fertile" in the following sense: for any choice of $A \subset V_n$ with $|A| \ge (\log n)^2$, the process $\hat\xi^A_t$ defined on $\hat G_n$ has probability larger than $1 - n^{-2}$ of remaining active up to time $e^{cn}$, for some fixed constant $c$. We can then use a simple union bound to argue that with high probability, for every $x \in S$, $(\hat\xi^x_t)$ remains active until time $e^{cn}$.
Our main contribution is Proposition 2.2; let us briefly explain the ideas that go into its proof. Given $A \subset V_n$, consider the sets $A_1 = y(A), A_2 = y(A_1), \ldots, A_g = y(A_{g-1})$, let $B(A, g)$ be the graph with vertex set $A \cup A_1 \cup \cdots \cup A_g$ and edge set equal to the edges of $\hat E_n$ which start and end at vertices in this set, and suppose we reveal the elements of these sets one by one. For most choices of $A$, $B(A, g)$ is just a disjoint union of $|A|$ directed trees, so that $\{|\hat\xi^A_t|\}_{0 \le t \le g}$ is exactly a branching process. However, for some choices of $A$, when revealing $A_1, \ldots, A_g$, we will see some "collisions", that is, some vertices will be found more than once. We say that $A$ is expansive if the number of collisions is not too large, so that $\{|\hat\xi^A_t|\}_{0 \le t \le g}$ is not too far from the branching process and, consequently, $|\hat\xi^A_g|$ is very likely to be larger than $|A|$ (see Lemma 2.4). We then show that there exists $c > 0$ such that, with high probability, every $A \subset V_n$ with $(\log n)^2 \le |A| \le cn$ is expansive (Lemma 2.5). It is then quite easy to put Lemmas 2.4 and 2.5 together to obtain Proposition 2.2.

Proof of Theorem 1.1
In this section, we will work exclusively with the dual process. Let us present the notation we will use. For fixed $n$, $\mathbb{P}_n$ is a probability measure under which the random graph $\hat G_n$ is defined; as explained in the Introduction, $\hat G_n$ is a directed graph in which each vertex $x$ "points to" $r$ distinct vertices $y_1(x), \ldots, y_r(x)$. $\mathcal{G}_n$ will denote the (finite) set of possible realizations of $\hat G_n$. For a fixed realization of the graph $\hat G_n$, $P_{\hat G_n}$ is a probability measure under which independent Bernoulli($q$) variables $\{\hat B^x_t : t \ge 0,\ x \in V_n\}$ are defined, and thus the family of processes $\{(\hat\xi^A_t)_{t \ge 0} : A \subset V_n\}$ is defined by the rule (1.1). Finally, $P_n$ is the annealed probability measure: under $P_n$, we first sample the graph $\hat G_n$ (with the probability measure $\mathbb{P}_n$) and then the processes $\{(\hat\xi^A_t) : A \subset V_n\}$ on $\hat G_n$ (with the probability measure $P_{\hat G_n}$).
In all results and proofs that follow, we assume that $qr > 1$. We start with two propositions that together will yield Theorem 1.1. Proposition 2.1 is proved essentially by a repetition of arguments in [1]; we include a proof for completeness.

Proposition 2.1. Let $(a_n)$ and $(b_n)$ be sequences with $a_n \to \infty$, $a_n/\log n \to 0$ and $b_n/(qr)^{a_n} \to 0$. For any $\varepsilon > 0$, we have $$P_n\left( \left| \frac{|\{x : |\hat\xi^x_{a_n}| > b_n\}|}{n} - \rho \right| > \varepsilon \right) \xrightarrow[n \to \infty]{} 0.$$

Proof. Let $(Z_t)_{t = 0, 1, \ldots}$ be the branching process with $Z_0 = 1$ and the offspring distribution that gives mass $q$ to $r$ and $1 - q$ to $0$. Also let $H = \{Z_t \ne 0\ \forall t\}$, $\rho = P(H)$ and $M_t = \frac{Z_t}{(qr)^t}$. By Theorem 5.3.9 and Exercise 5.3.12 in [3], our assumption that $qr > 1$ implies $\rho > 0$ and the fact that $M_t$ almost surely converges to a limit $M$, which is strictly positive on $H$ and identically zero on $H^c$. Let $\rho_n = P(Z_{a_n} > b_n) = P(M_{a_n} > b_n/(qr)^{a_n})$. For a set of vertices $A$ in the graph $\hat G_n$, let $y^{(0)}(A) = A$, $y^{(1)}(A) = y(A) = \{y_i(x) : x \in A,\ 1 \le i \le r\}$ and $y^{(k+1)}(A) = y(y^{(k)}(A))$ for $k \ge 0$. Given a vertex $x$ and $R \in \mathbb{N}$, we define the ball $B(x, R)$ as the subgraph of $\hat G_n$ with vertex set $\cup_{k=0}^R y^{(k)}(x)$ and all the edges of $\hat E_n$ that start and end at these vertices. Let $F(x, R)$ denote the event that $B(x, R)$ has no cycles and $F(x, y, R)$ the event that $B(x, R)$ and $B(y, R)$ have no cycles and are disjoint. We claim that $$\mathbb{P}_n(F(x_1, a_n)) \xrightarrow[n \to \infty]{} 1 \quad \text{and} \quad \mathbb{P}_n(F(x_1, x_2, a_n)) \xrightarrow[n \to \infty]{} 1.$$ We will prove only that the first limit is 1; it should be clear that a similar proof works for the second. We explore the ball $B(x_1, a_n)$ level by level: we reveal the vertices of $y(x_1)$ one by one (in any order we desire), then the vertices of $y^{(2)}(x_1)$ one by one, and so on, and say that a collision occurs if at some point before having revealed all vertices in $B(x_1, a_n)$, we reveal a vertex that had already been revealed at an earlier step; the exploration is then stopped and said to have been unsuccessful. The exploration is thus successful if and only if $F(x_1, a_n)$ occurs. Note that the maximum number of vertices revealed in the whole exploration is $\sum_{i=0}^{a_n} r^i \le r^{a_n + 2}$. Also, at any point in the exploration, there are at least $n - r$ choices for the next vertex (since for any $x \in V_n$, $y_1(x), \ldots, y_r(x)$ are necessarily all distinct and different from $x$), so the probability that the next vertex results in a collision (and thus an unsuccessful exploration) is less than $r^{a_n+2}/(n - r)$. The probability that the exploration is unsuccessful is thus less than $(r^{a_n+2})^2/(n - r)$, which tends to 0 as $n \to \infty$ since $a_n/\log n \to 0$.
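The level-by-level exploration in this proof is easy to express in code. The following sketch (our own naming) reveals the ball $B(x, R)$ in a graph where vertex $v$ points to the vertices in `inputs[v]`, stopping at the first collision; following the proof's convention, the exploration succeeds exactly when no vertex is revealed twice.

```python
def explore_ball(inputs, x, R):
    """Reveal B(x, R) level by level. Returns (success, revealed):
    success is True iff no vertex is revealed twice, i.e. the exploration
    finds a tree (the event F(x, R) in the proof above)."""
    revealed = {x}
    frontier = [x]
    for _ in range(R):
        next_frontier = []
        for v in frontier:
            for w in inputs[v]:
                if w in revealed:          # collision: unsuccessful
                    return False, revealed
                revealed.add(w)
                next_frontier.append(w)
        frontier = next_frontier
    return True, revealed
```

Running this on a uniformly sampled graph with $R = a_n$ growing slower than $\log n / \log r$ makes the collision bound of the proof concrete: at most $r^{a_n+2}$ vertices are ever revealed, so each revelation collides with probability less than $r^{a_n+2}/(n-r)$.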
We will write $k_n = (\log n)^2$. For $t > 0$, let us say that a graph $\hat G_n \in \mathcal{G}_n$ is $t$-fertile if for every $A \subset V_n$ with $|A| \ge k_n$, $$P_{\hat G_n}\big(\hat\xi^A_t = \emptyset\big) < n^{-2}. \qquad (2.7)$$ Let $H_n(t)$ denote the set of graphs in $\mathcal{G}_n$ that are $t$-fertile.

Proposition 2.2.
There exists $c > 0$ such that $\lim_{n \to \infty} \mathbb{P}_n\big(H_n(e^{cn})\big) = 1$.
Proving this result takes most of our effort.We postpone the proof and first show how the two propositions are used to establish the main theorem.
Proof of Theorem 1.1. We will use the fact that, for $0 \le s \le t$, $$\{x : \hat\xi^x_t \ne \emptyset\} \subset \{x : \hat\xi^x_s \ne \emptyset\}. \qquad (2.8)$$ Let $c$ be the constant of Proposition 2.2. Fix $\varepsilon > 0$ and a sequence $(t_n)$ as in the statement of the theorem. If $t_n \le e^{cn}$, by (2.8) we have $$P_n\left( \frac{|\{x : \hat\xi^x_{t_n} \ne \emptyset\}|}{n} > \rho + \varepsilon \right) \le P_n\left( \frac{|\{x : \hat\xi^x_{\min(t_n, s_n)} \ne \emptyset\}|}{n} > \rho + \varepsilon \right). \qquad (2.9)$$ We can now apply Proposition 2.1 with $a_n = \min(t_n, s_n)$ and $b_n \equiv 0$; the right-hand side thus vanishes as $n \to \infty$. Thus, (2.10) and (2.11) together yield (1.2).
We now need to prove Proposition 2.2; three preliminary results will be needed: Lemmas 2.3, 2.4 and 2.5.
We now give some definitions and notation.
$T_m$ is thus the disjoint union of $m$ rooted, directed trees, each with $g$ generations above the root. If we can go from $\sigma$ to $\sigma'$ by following a path of oriented edges of the tree, we say that $\sigma$ is an ancestor of $\sigma'$ and that $\sigma'$ is a descendant of $\sigma$.
In words, vertices are inspected in order; the roots are all set to 0, and any other vertex is set to 0 either if one of its ancestors has already been marked with a 1 or if its image under the map $\sigma \mapsto z_\sigma$ has never been seen before; otherwise it is set to 1. Figure 1 presents an example of the effect of the algorithm.
As will become clear in the proof of Lemma 2.4, an essential property of this construction is the fact that $\sigma \mapsto z_\sigma$ injectively maps the set $\{\sigma \in T_m : \psi(\sigma) = 0 \text{ and } \psi(\sigma') = 0 \text{ for every ancestor } \sigma' \text{ of } \sigma\}$ onto the vertex set of $B(A, g)$. Note that this property does not depend on the value of $\psi$ at any vertex $\sigma'$ such that $\psi(\sigma) = 1$ for some ancestor $\sigma$ of $\sigma'$. On the other hand, we will want to argue that with high probability there are few vertices of $T_m$ where $\psi$ is equal to 1. This is why we set the algorithm to "artificially" set $\psi$ to 0 at all vertices that descend from a vertex $\sigma$ such that $\psi(\sigma) = 1$; these should be understood as "dummy" 0's, that is, they have no counterpart in the geometry of $B(A, g)$.

Proof of Proposition 2.2. Assume that $n$ is large enough that $\delta k_n > 1$ and that $\hat G_n$ satisfies: for every $A \subset V_n$ with $k_n \le |A| \le K_n$, $A$ is expansive.
(2.18)

Let $c = c_1\kappa/2$, where $c_1$ and $\kappa$ are the constants of the two previous lemmas. We will prove that $\hat G_n$ is $e^{cn}$-fertile, that is, we will verify that (2.7) holds with $t = e^{cn}$. Together with Lemma 2.5, this will imply the result we need.

when $n$ is large enough, proving (2.7).

Figure 1: Example of the algorithm. Here $r = 2$, $g = 2$. The numbers on the arrows in the left diagram serve to distinguish $y_1(x)$ and $y_2(x)$ for each vertex $x$.

here $C$ is a constant that only depends on $r$, $g$ and $\delta$, and whose value has changed in the last inequality. Now choose $\kappa$ such that $C\kappa^\delta < 1/e$. The probability in the statement of the lemma is then less than $$\sum_{i = k_n}^{K_n} e^{-i} \le \kappa n\, e^{-(\log n)^2} \xrightarrow[n \to \infty]{} 0.$$