A Sharp Threshold for Bootstrap Percolation in a Random Hypergraph

Given a hypergraph $\mathcal{H}$, the $\mathcal{H}$-bootstrap process starts with an initial set of infected vertices of $\mathcal{H}$ and, at each step, a healthy vertex $v$ becomes infected if there exists a hyperedge of $\mathcal{H}$ in which $v$ is the unique healthy vertex. We say that the set of initially infected vertices percolates if every vertex of $\mathcal{H}$ is eventually infected. We show that this process exhibits a sharp threshold when $\mathcal{H}$ is a hypergraph obtained by randomly sampling hyperedges from an approximately $d$-regular $r$-uniform hypergraph satisfying some mild degree and codegree conditions; this confirms a conjecture of Morris. As a corollary, we obtain a sharp threshold for a variant of the graph bootstrap process for strictly $2$-balanced graphs which generalises a result of Kor\'{a}ndi, Peled and Sudakov. Our approach involves an application of the differential equations method.


Introduction
Given a hypergraph H, the H-bootstrap process begins with an initial set of infected vertices of H (a vertex that is not infected is healthy) and, in each step, a healthy vertex becomes infected if there exists a hyperedge of H in which it is the unique healthy vertex. The set of initially infected vertices is said to percolate if every vertex of H is eventually infected. This process was first studied by Balogh, Bollobás, Morris and Riordan [9] and is motivated by numerous connections to other variants of bootstrap percolation; see Subsection 1.1.
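For concreteness, the process just described can be implemented directly; the following is a minimal Python sketch (the encoding of the hypergraph as a list of vertex sets, and all names, are our own, not from the paper):

```python
def h_bootstrap(hyperedges, initially_infected):
    """Run the H-bootstrap process to completion and return the final
    infected set.  A healthy vertex becomes infected as soon as some
    hyperedge contains it as its unique healthy vertex."""
    infected = set(initially_infected)
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        for e in edges:
            healthy = e - infected
            if len(healthy) == 1:      # unique healthy vertex found
                infected |= healthy
                changed = True
    return infected

# A 3-uniform example: infecting {1, 2} percolates via {1,2,3}, then {2,3,4}.
H = [{1, 2, 3}, {2, 3, 4}]
```

In this toy encoding, a set $A$ percolates exactly when `h_bootstrap(H, A)` returns the whole vertex set.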
The focus of this paper is on estimating the critical probability of the $H$-bootstrap process, denoted $p_c(H)$, which is the infimal density at which a random subset of $V(H)$ is likely to percolate. More formally, $p_c(H) := \inf\{p \in (0,1) : \mathbb{P}(V(H)_p \text{ percolates}) \geq 1/2\}$, where, for a finite set $X$ and $p \in [0,1]$, we let $X_p$ denote a random subset of $X$ obtained by including each element with probability $p$ independently of one another. Specifically, we are interested in estimating the quantity $p_c(H_q)$, where $H$ is a "sufficiently well behaved" hypergraph and, for $q \in [0,1]$, $H_q$ denotes the hypergraph obtained from $H$ by including each hyperedge of $H$ independently with probability $q$. Our main result applies to all $r$-uniform hypergraphs satisfying some mild conditions; to state these conditions precisely we require a few standard definitions. The following theorem is a corollary of our main result (Theorem 1.3 below), but captures the main essence of the paper. In particular, it confirms (in a strong form) a conjecture of Morris [33]. Perhaps the most interesting feature of this theorem is that, for a certain range of $q$ and a broad class of hypergraphs $H$, the main term of the asymptotics of the critical probability depends only on the value of $r$ and not on the specific underlying structure of $H$. The following definition is useful for stating the full version of our result.

Definition 1.2. Given an integer $r \geq 2$ and real numbers $d, \nu > 0$ and $\rho \in [0,1]$, we say that an $r$-uniform hypergraph $H$ is $(d, \rho, \nu)$-well behaved if the following conditions hold:

Observe that the conditions in Theorem 1.1 simply amount to $H$ being $r$-uniform, $d$-regular and $(d, d^{-s}, d^{\beta})$-well behaved. We are now ready to state the main result of the paper. Here, and throughout the paper, $\log$ denotes the natural (base $e$) logarithm.

Theorem 1.3.
For fixed $r \geq 3$ and real numbers $c, \alpha, \beta, \varepsilon > 0$ there exists a positive constant $K = K(r, c, \alpha)$ such that, for $d$ sufficiently large, if $H$ is an $r$-uniform $(d, \log^{-K}(d), d^{\beta})$-well behaved hypergraph, then
• if $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, then $\mathbb{P}\big(V(H)_{c/d^{1/(r-1)}} \text{ percolates in } H_{\alpha/d^{1/(r-1)}}\big) \leq \varepsilon$;
• if $c^{r-2}\alpha > \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, then $\mathbb{P}\big(V(H)_{c/d^{1/(r-1)}} \text{ percolates in } H_{\alpha/d^{1/(r-1)}}\big) \geq 1 - \varepsilon$.
Remark 1.4. The value of $K$ is simply chosen large enough so that $\log^K(d)$ grows faster than any relevant constant power of $\log(d)$ that appears throughout the proof. We have not attempted to optimise the dependence of $K$ on $r$, $c$ and $\alpha$. Theorem 1.3 is stronger than Theorem 1.1 in two ways. Firstly, it applies to a much wider class of hypergraphs (allowing larger codegrees, neighbourhood intersections, etc.) and, secondly, it implies that the probability of percolation transitions from close to zero to close to one within a small window of the critical probability; i.e. the process exhibits a sharp threshold.

Connections to Other Bootstrap Processes.
The $H$-bootstrap process is motivated by its connection to the so-called graph bootstrap process introduced by Bollobás [15] in 1968 (under the name "weak saturation"). Given graphs $G$ and $F$, the $F$-bootstrap process on $G$ starts with an initial set of infected edges of $G$ and, at each step, a healthy edge becomes infected if there exists a copy of $F$ in $G$ in which it is the unique healthy edge. Clearly, the $F$-bootstrap process on $G$ is equivalent to the $H_{G,F}$-bootstrap process, where $H_{G,F}$ is a hypergraph in which each vertex of $H_{G,F}$ corresponds to an edge of $G$ and the hyperedges of $H_{G,F}$ are precisely the edge sets of copies of $F$ in $G$.
The original motivation behind the $F$-bootstrap process stemmed from its connections to the notion of "saturation" in extremal combinatorics. Because of this, most of the known results on the $F$-bootstrap process are extremal in nature (see, e.g., [2, 27, 28, 34–36]). Balogh, Bollobás and Morris [8] were the first to analyse the behaviour of the graph bootstrap process relative to a random initial infection. This line of research is motivated by connections between the $F$-bootstrap process and the well-studied $r$-neighbour bootstrap process, which was introduced by the physicists Chalupa, Leath and Reich [19] in the late 1970s and has found many applications to modelling real-world propagation phenomena; for more background see, e.g., [1, 3, 4, 6, 7, 17, 18, 24, 26, 34]. The central probabilistic problem for the $F$-bootstrap process in $G$ is to estimate the critical probability defined by $p_c(G, F) := \inf\{p \in (0,1) : \mathbb{P}(G_p \text{ percolates}) \geq 1/2\}$, where $G_p$ is the graph obtained from $G$ by including each edge of $G$ with probability $p$ independently of one another. Following the initial paper of Balogh, Bollobás and Morris [8], probabilistic questions regarding the $F$-bootstrap process have been studied by Gunderson, Koch and Przykucki [23], Angel and Kolesnik [5] and Kolesnik [30].
The topic of the current paper (i.e. the conjecture of Morris [33]) was initially inspired by a result of Korándi, Peled and Sudakov [31] which is essentially equivalent to the special case of Theorem 1.1 where $H = H_{K_n, K_3}$ and $\alpha = 1/2$. As an application of Theorem 1.1, we generalise their result to a wider class of graphs.
For a graph $F$ with at least two edges, the 2-density of $F$ is defined to be $d_2(F) := \frac{|E(F)|-1}{|V(F)|-2}$. A graph $F$ with at least two edges is said to be 2-balanced if $d_2(F) \geq d_2(F')$ for every proper subgraph $F'$ of $F$ with at least two edges. If the inequality is strict for every such $F'$, then we say that $F$ is strictly 2-balanced. Given a graph $F$, observe that $H_{K_n,F}$ is $|E(F)|$-uniform and $d(n,F)$-regular for some integer $d(n,F)$ such that $d(n,F) = \Theta\big(n^{|V(F)|-2}\big)$ (the constant factor is related to the number of automorphisms of $F$ which fix an edge). We will derive the following result from Theorem 1.1, in which the critical constant $\frac{r-2}{\alpha^{1/(r-2)}(r-1)^{(r-1)/(r-2)}}$ appears.
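As a quick illustration of these definitions (helper code of our own, not from the paper), one can compute $d_2(F) = (|E(F)|-1)/(|V(F)|-2)$ and verify strict 2-balancedness by brute force over proper edge subsets:

```python
from fractions import Fraction
from itertools import combinations

def d2(edges):
    """2-density (|E(F)|-1)/(|V(F)|-2) of a graph given by its edge list."""
    vertices = {v for e in edges for v in e}
    return Fraction(len(edges) - 1, len(vertices) - 2)

def strictly_2_balanced(edges):
    """Check d2(F') < d2(F) for every proper subgraph F' with >= 2 edges.
    Edges are distinct pairs of a simple graph, so any two edges span
    at least three vertices and d2 is well defined on every subset."""
    target = d2(edges)
    for k in range(2, len(edges)):          # proper edge subsets only
        for sub in combinations(edges, k):
            if d2(list(sub)) >= target:
                return False
    return True

K3 = [(1, 2), (1, 3), (2, 3)]
K4 = [(a, b) for a, b in combinations(range(4), 2)]
```

For instance, $d_2(K_3) = 2$ and $d_2(K_4) = 5/2$, and both cliques are strictly 2-balanced.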
We remark that, for $F$ strictly 2-balanced, Theorem 1.5 can be viewed as a sharp threshold for a variant of the graph bootstrap process where we first "activate" each copy of $F$ in $K_n$ with probability $\Theta\big(n^{-1/d_2(F)}\big)$ independently of one another and then, given a random initial set of infected edges of $K_n$, at each step of the process a healthy edge becomes infected if it is the unique healthy edge in an active copy of $F$.
In Section 3, we collect the probabilistic tools (i.e. concentration inequalities) that we will apply. A few preliminary lemmas will be proved in Section 4 before moving on to the heart of the proof. The proof of Theorem 1.3 is divided into four parts, which are contained in Sections 5, 6, 7 and 8. Finally, in Section 9 we use Theorem 1.1 to derive Theorem 1.5 and a generalisation of it to "strictly $k$-balanced hypergraphs" (defined in the section itself).

Outline of the Proof
Rather than attempting to apply the differential equations method to the $H_q$-bootstrap process directly (which is completely deterministic and, therefore, ill-suited to the method), we will analyse two different random processes in which the hypergraph $H_q$ is revealed iteratively and the infection spreads in a way which depends on the structure of $H_q$ unveiled so far. These processes, to be defined shortly, are equivalent to the $H_q$-bootstrap process in the sense that the final set of infected vertices is the same. The purpose of this section is to provide a fairly detailed outline of the proof of Theorem 1.3; in particular, we will describe the two random processes that we will analyse and will define (and motivate) the variables that we wish to track.$^2$ The analysis is divided into two phases. The first phase involves an application of the differential equations method and is done in essentially the same way regardless of whether $c^{r-2}\alpha$ is smaller or larger than $\frac{(r-2)^{r-2}}{(r-1)^{r-1}}$. The analysis in the second phase differs depending on which of these cases we are in. Throughout both phases, we will track a family of variables which will allow us to determine, with high probability, whether or not the initial infection percolates.
Before diving deeply into the details, we fix some parameters and notation that will be used throughout the paper. Let $r$, $c$, $\alpha$, $\beta > 0$ be fixed, let $K$ be large with respect to $r$, $c$ and $\alpha$ and, for large $d$, let $H$ be an $r$-uniform $(d, \log^{-K}(d), d^{\beta})$-well behaved hypergraph (defined in Definition 1.2). Set $N := |V(H)|$. Note that, by property (e) of Definition 1.2, we have $N \leq d^{\beta}$. Since each set of size $r$ contains exactly $\binom{r}{\ell}$ sets of size $\ell$, by the pigeonhole principle, for $1 \leq \ell \leq r-1$ we have,
$^2$ Throughout the paper, when we say that we track a random variable, it will always mean one of two things: either (a) we show that it is concentrated or (b) we show that it satisfies a certain upper or lower bound with high probability (i.e. with probability tending to $1$ as $|V(H)| \to \infty$).
2.1. The First Phase. Here we define the random hypergraph process that we will analyse during the first phase. At time $m$, we will have a hypergraph $H(m)$ formed by the unsampled hyperedges of $H$ and a set of infected vertices $I(m)$ (where $H(m)$ and $I(m)$ will be formally defined below). Both $H(m)$ and $I(m)$ depend on the outcomes of the process up to this point. Due to the nature of the $H$-bootstrap process, it should not come as a surprise that the most important variable for us to track is the cardinality of the collection of hyperedges containing a unique healthy vertex; to this end, define
$$Q(m) := \{e \in E(H(m)) : |e \setminus I(m)| = 1\},$$
the collection of open hyperedges. As a slight abuse of notation, for a collection $X(m)$ of subhypergraphs or vertices of $H(m)$ we will often write $|X(m)|$ simply as $X(m)$ (for example, we will write $Q(m)$ to mean $|Q(m)|$ and $I(m)$ to mean $|I(m)|$). In all cases, it should be clear from context whether we are referring to the collection $X(m)$ or its cardinality.
We will run the first phase process up to some time $M$ at which point, with high probability, $Q(M)$ will be either large enough or small enough to be able to determine if the infection is likely to percolate by other methods in the second phase (summarised in Subsection 2.2). Our goal in the first phase will be to show that $Q(m)$ stays close to its expected trajectory with high probability. As was described in Subsection 1.2, this will involve finding a suitable collection of random variables containing $Q(m)$ whose "one step changes" depend on other variables in the collection, and applying martingale concentration inequalities to get control over all of these variables simultaneously.
It is sometimes more convenient to think of our random variables as depending on a continuous variable $t$ rather than the discrete variable $m$. The scaling that we will use when moving between discrete and continuous settings is
(2.4) $t = t_m := m/N$, for $m \geq 0$.
Throughout the paper we will alternate between the discrete and continuous settings without further comment. In the first phase, we will only consider values of $m$ up to
(2.5) $O(N)$ if $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, and $O(N\log(d))$ if $c^{r-2}\alpha > \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$.
The fact that $m$ does not get too large during the first phase will be used in some of the heuristic discussions which follow.
At time zero, each vertex is infected with probability $p$ and, provided that $Q(m-1) \neq \emptyset$, at the $m$th step of the process a new vertex is infected with probability $q$. So, provided the supply of open hyperedges does not run out, we would expect $I(m) \approx pN + qm$. Letting $M$ be the number of steps we run the first phase for, using the Chernoff bound (Theorem 3.2) we will prove a concentration result of this type (see Lemma 6.5 of Section 6). We will use the fact that $H$ is well behaved to show that $I(m)$ behaves similarly to a random infection in which each vertex is infected independently with probability $p + qt_m$ (which we shall call a uniformly random infection of density $p + qt_m$), in the sense that $Q(m)$ is close to the value that one would expect in this case. First, let us determine the value of $Q(m)$ which we would expect if $I(m)$ were a uniformly random infection of density $p + qt_m$. If this were the case, then a particular hyperedge of $H$ would contain exactly $r-1$ infected vertices with probability $r \cdot (p+qt_m)^{r-1}\big(1-(p+qt_m)\big)$. Since $H$ is roughly $d$-regular and $p + qt_m = o(1)$, we have $|E(H)| \approx dN/r$. Also, recall that, at each step, we sample precisely one open hyperedge which is immediately discarded from the hypergraph. Thus, we would expect
$$Q(m) \approx \frac{dN}{r} \cdot r(p+qt_m)^{r-1}\big(1-(p+qt_m)\big) - m \approx \big((c+\alpha t_m)^{r-1} - t_m\big) \cdot N.$$
The main point of the first phase is to show that, up to a small error, $Q(m)$ follows this trajectory (see Lemma 2.14). Before stating this more precisely, let us discuss the motivation behind the choice of $M$. Define
$$\gamma(t) := (c + \alpha t)^{r-1} - t.$$
Observe that $\gamma'(t) = \alpha(r-1)(c+\alpha t)^{r-2} - 1$. Therefore, since $c, \alpha > 0$, we have that $\gamma'(t)$ has exactly one real root if $r$ is odd and two real roots (one positive, one negative) if $r$ is even. The rightmost root of $\gamma'(t)$ is a local minimum for $\gamma(t)$ located at
$$t_{\min} := \frac{(\alpha(r-1))^{-1/(r-2)} - c}{\alpha},$$
and the minimum value $\gamma(t_{\min})$ is negative if and only if $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$. From this and the fact that $\gamma(0) > 0$, we see that, if $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, then $\gamma(t)$ has precisely two distinct positive real roots, say $T_0$ and $T_1$ where $0 < T_0 < t_{\min} < T_1$.
Coming back to the random process, this tells us that if $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, then we expect the number of open hyperedges to become very small as $t_m$ approaches $T_0$ from the left. What we will do in this case is track our variables until $\gamma(t) < \zeta$, where $\zeta$ is a constant chosen small with respect to $r$, $c$ and $\alpha$; the value of $\zeta$ is given in Definition 7.3. At this point we initiate the second phase, in which we prove that, with high probability, percolation does not occur.
On the other hand, if $c^{r-2}\alpha > \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, then $\gamma(t_{\min}) > 0$ and $\gamma(t)$ has no positive real roots. Since $p + qt_m = o(1)$ for all values of $m$ that we consider in the first phase (see (2.11)), we expect our supply of open hyperedges not to run out (in fact, when $t_m > t_{\min}$, we expect the number of open hyperedges to be typically increasing). What we will do in this case is track the above variables until step $\frac{N\log(N)}{\alpha} = O(N\log(d))$, at which point the number of open hyperedges will be large enough that we can deduce that percolation occurs with high probability in the second phase. See Figure 1 for examples of how the trajectory of $Q(m)$ depends on the relationship between $\alpha$, $c$ and $r$.

Remark 2.9. Let us briefly discuss why we have chosen to focus on values of $p$ and $q$ of order $d^{-1/(r-1)}$. As we have argued above, if the infection at time $m$ resembles a uniform infection of density $p + qt_m$, then we expect the variable $Q(m)$ to be roughly $\big[d(p+qt_m)^{r-1} - t_m\big]N$. The nice thing about considering $p$ and $q$ of order $d^{-1/(r-1)}$ is that the expression inside the square brackets becomes a function of $t_m$ only. Thus, as long as $t_m$ is bounded by a constant, open hyperedges are both being created and discarded at a constant rate independent of $d$. It would of course be natural (and interesting) to consider more general values of $p$ and $q$, but one would likely require a different approach.$^3$

$^3$ Actually, one can apply Theorem 1.3 directly to get a result in the case that $d^{-1+o(1)} \leq q \ll d^{-1/(r-1)}$. Choose $q'$ and $q''$ so that $q'q'' = q$ and $q'' = (q'd)^{-1/(r-1)}$. Given a $d$-regular hypergraph $H$ satisfying some appropriate conditions, one can deduce that, with high probability, the random hypergraph $H_{q'}$ satisfies the conditions of Theorem 1.3 with $(1+o(1))q'd$ playing the role of $d$. Thus, we get a sharp threshold for bootstrap percolation in $(H_{q'})_{q''} = H_q$. The case $q \gg d^{-1/(r-1)}$, on the other hand, is likely to require different ideas.
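The dichotomy between the two trajectories can be checked numerically. The sketch below assumes the reconstruction $\gamma(t) = (c+\alpha t)^{r-1} - t$ (the antiderivative consistent with the stated derivative $\gamma'(t) = \alpha(r-1)(c+\alpha t)^{r-2} - 1$ and with the heuristic $Q(m) \approx \gamma(t_m)N$); all function names are ours:

```python
def gamma(t, r, c, a):
    """Reconstructed trajectory function gamma(t) = (c + a*t)**(r-1) - t."""
    return (c + a * t) ** (r - 1) - t

def gamma_min_location(r, c, a):
    """Solve gamma'(t) = a*(r-1)*(c + a*t)**(r-2) - 1 = 0 (rightmost root)."""
    return ((a * (r - 1)) ** (-1.0 / (r - 2)) - c) / a

def subcritical(r, c, a):
    """The threshold condition c**(r-2) * a < (r-2)**(r-2) / (r-1)**(r-1)."""
    return c ** (r - 2) * a < (r - 2) ** (r - 2) / (r - 1) ** (r - 1)

# For r = 3 the threshold reads c * a < 1/4: gamma dips below zero
# (giving two positive roots T0 < T1) exactly in the subcritical regime.
for c, a in [(1.0, 0.2), (1.0, 0.3)]:
    tm = gamma_min_location(3, c, a)
    print(c, a, subcritical(3, c, a), gamma(tm, 3, c, a))
```

For $(c,\alpha) = (1, 0.2)$ the minimum value of $\gamma$ is negative (subcritical), while for $(1, 0.3)$ it stays positive (supercritical).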
As we said above, the main aim of the first phase is to show that, for $0 \leq m \leq M$, the value of $Q(m)$ is within a small error term of $\gamma(t_m) \cdot N$. The relative error that we will allow ourselves in these bounds is given by a function $\delta(t)$, defined in (2.10), which grows like $(t+1)^{K/10}$. In what follows, we write an interval of the form $[(1-\delta)g(t), (1+\delta)g(t)]$ as $(1 \pm \delta)g(t)$ for brevity. To summarise, we track the process for $M$ steps, where $M$ is chosen as described above. We are now ready to formally state the bounds we will prove on $Q(m)$ in the first phase.
Lemma 2.14. With high probability the following statement holds: for all $0 \leq m \leq M$, the value of $Q(m)$ lies within the allowed error of $\gamma(t_m) \cdot N$.

One way to prove that $Q(m)$ is controlled in this way, or indeed to prove bounds for any of our variables, involves determining their expected and maximum changes (conditioned on what has previously occurred during the process) at each time step and applying a martingale concentration inequality. Before thinking in more detail about the expected change of $Q(m)$, we introduce some notation that will be helpful.
For each $v \in V(H) \setminus I(m)$ we write $Q_v(m)$ for the collection of hyperedges of $Q(m)$ whose unique healthy vertex is $v$. For $u \neq v \in V(H)$, the sets $Q_u(m)$ and $Q_v(m)$ are disjoint. Note that $Q(m) = \sum_{v \in V(H) \setminus I(m)} Q_v(m)$. Let us now think about the expected change of $Q(m)$. Firstly, which open hyperedges from $Q(m)$ are not present in $Q(m+1)$? At each step, the hyperedge $e$ we sample from $H(m)$ is deleted and is not present in $H(m+1)$. Also, with probability $q$, the unique healthy vertex $v$ of $e$ becomes infected and so all the hyperedges in $Q(m)$ whose unique healthy vertex is $v$ are no longer open. This results in a loss of $(Q_v(m) - 1)$ hyperedges (in addition to $e$). Now let us consider how we gain a new open hyperedge. This occurs when some hyperedge $e'$ is successfully sampled, and the vertex $v'$ of $e'$ that becomes infected is contained in a hyperedge $e''$ with exactly $r-2$ infected vertices (the hyperedge $e''$ will now be open).
Observe that for each vertex $v \in V(H) \setminus I(m)$, the probability that an open hyperedge containing $v$ is sampled is $Q_v(m)/Q(m)$. Given the above discussion, we can express the expectation of $Q(m+1) - Q(m)$, conditioned on $H(m)$ and $I(m)$, in terms of these quantities (see (2.16)). So to be able to determine (2.16) we can see that we would need to have control over $Y^{r-2}_v(m)$. So let us consider how a new copy of $Y^{r-2}_v$ is created at a time step. One way a member of $Y^{r-2}_v(m)$ can be created is from a pair of hyperedges $\{e_1, e_2\} \subseteq E(H)$ where: $v \in e_1 \setminus e_2$; $e_2$ is open; $e_1$ has exactly $r-3$ infected vertices and intersects $e_2$ on its unique healthy vertex; and $e_2$ is successfully sampled at time $m$. Thus, to determine the expected change of $Y^{r-2}_v(m)$, we need to also have control over this family $Z$ of pairs. And similarly, to do this there are a number of other variables that we must keep track of.
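For intuition, the first-phase dynamics can be mimicked on a small synthetic hypergraph. The following Python sketch is entirely illustrative (the random model, parameters and names are ours, and no well-behavedness conditions are enforced): at each step one open hyperedge is sampled uniformly and discarded, and with probability $q$ its unique healthy vertex becomes infected.

```python
import random

def first_phase(edges, p, q, steps, rng):
    """Sketch of the first-phase process: infect each vertex independently
    with probability p, then repeatedly sample a uniformly random open
    hyperedge, discard it and, with probability q (a "successful" sample),
    infect its unique healthy vertex.  Records the trajectory Q(m)."""
    vertices = sorted({v for e in edges for v in e})
    infected = {v for v in vertices if rng.random() < p}
    remaining = [set(e) for e in edges]
    history = []                                   # history[m] = Q(m)
    for _ in range(steps):
        open_edges = [e for e in remaining if len(e - infected) == 1]
        history.append(len(open_edges))
        if not open_edges:                         # supply of open hyperedges ran out
            break
        e = rng.choice(open_edges)
        remaining.remove(e)                        # sampled hyperedge is discarded
        if rng.random() < q:
            infected |= e - infected               # infect the unique healthy vertex
    return infected, remaining, history

rng = random.Random(0)
N, r, n_edges = 60, 3, 400
edges = [set(rng.sample(range(N), r)) for _ in range(n_edges)]
infected, remaining, Q = first_phase(edges, 0.15, 0.5, 50, rng)
```

The recorded `Q` plays the role of the trajectory of open hyperedges tracked above; on larger instances one would compare it against the heuristic $\gamma(t_m)N$.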
To summarise this train of thought: to prove Lemma 2.14 we must have control over a number of families of variables; in particular, variables of the two types described above. We briefly remark that, in our proof, we do not explicitly calculate the expected change of $Q(m)$ in the manner we have alluded to above. In fact, we show that having control over a more general family of variables will imply the required bounds on $Q(m)$ in a different way (see Lemma 4.1). However, to prove bounds on our other variables, we do calculate their expected and maximum changes. The point of performing this thought exercise on $Q(m)$ was to illustrate its interdependence with a number of other variables and to motivate the following discussion.
In order to describe formally the families of variables that we wish to track, it is helpful to introduce a few definitions. Each variable that we wish to control counts the number of "copies" of some particular subhypergraph $F \subseteq H(m)$ such that these copies of $F$ are "rooted" at a particular subset $S \subseteq V(H)$ (in the sense that these vertices are contained within the copy) and some particular vertices of these copies are infected (i.e. contained in $I(m)$). We begin by introducing some notation to describe the particular structures (which we call configurations) we are interested in counting "copies" of.

Definition 2.17. A configuration is a triple $X = (\mathcal{F}, R, D)$, where $\mathcal{F}$ is an $r$-uniform hypergraph in which every vertex is contained in at least one hyperedge and $R$ and $D$ are disjoint subsets of $V(\mathcal{F})$. The vertices of $R$ are called the roots of $X$, the vertices of $D$ are called the marked vertices of $X$, and the vertices of $V(\mathcal{F}) \setminus (D \cup R)$ are called the neutral vertices of $X$.

Now that we have a good way to describe the things we are interested in counting, we will formally define what we mean by a copy of a configuration.

Definition 2.18. For $m \geq 0$, given a configuration $X = (\mathcal{F}, R, D)$ and a set $S \subseteq V(H)$, a copy of $X$ in $H(m)$ rooted at $S$ is a subhypergraph $F$ of $H(m)$ such that there exists an isomorphism $\varphi : \mathcal{F} \to F$ with $\varphi(R) = S$ and $\varphi(D) \subseteq I(m)$. Also define $X_S(m)$ to be the collection of copies of $X$ in $H(m)$ rooted at $S$; as per our convention, we also write $X_S(m)$ for its cardinality.

Take note that in $H(m)$, a copy of a configuration $(\mathcal{F}, R, D)$ can contain elements of $I(m)$ apart from those in $\varphi(D)$. In particular, it is even possible for the set $\varphi(R)$ to contain elements of $I(m)$ (despite the fact that $R$ and $D$ are disjoint).
Before discussing specific families of configurations, let us discuss heuristically how many copies we expect there to be of some fixed configuration $X = (\mathcal{F}, R, D)$ in $H(m)$ rooted at $S \subseteq V(H)$. If $\bar{X}_S$ is the number of copies of $(\mathcal{F}, R, \emptyset)$ rooted at $S$ in $H$, then, if $I(m)$ is a uniformly random infection of density $p + qt_m$, we would expect $X_S(m) \approx \bar{X}_S \cdot (p + qt_m)^{|D|}$, as each vertex is independently infected with probability $p + qt_m$. That is, each marked vertex contributes a factor of at most $d^{-1/(r-1)} \log^{O(1)}(d)$. Now, heuristically, how do we bound $\bar{X}_S$? In our proof, for some families of configurations we will only require an upper bound, but for some we need to be more careful and also need a lower bound. All the configurations $(\mathcal{F}, R, D)$ that we are interested in tracking during the first phase will satisfy the following properties: $\mathcal{F}$ is connected and contains at most $r+1$ hyperedges, no vertex of $\mathcal{F}$ is contained in the intersection of more than two hyperedges, every root is contained in a unique hyperedge, and $|R| \geq 1$.
So suppose $X$ satisfies these conditions. To find a bound on $\bar{X}_S$, we can break $\mathcal{F}$ up into its hyperedges $e_1, \ldots, e_k$, where $|e_1 \cap R| \geq 1$ and each $e_i$ intersects $\bigcup_{\ell < i} e_\ell$, and bound the number of choices for each hyperedge using properties (c) and (d) of Definition 1.2. Let $S_1, \ldots, S_k$ be a fixed partition of $S$ such that $|S_i| = |R \cap e_i \setminus \bigcup_{\ell < i} e_\ell|$. We will bound the number of members of $\bar{X}_S$ in $H$ such that $S_i \subseteq e_i \setminus \bigcup_{\ell < i} e_\ell$. As there are $O(1)$ such partitions of $S$, the total number of members of $\bar{X}_S$ will be within a constant factor of this.
First consider the number of ways to choose $e_1$. By conditions (a), (b) and (c) of Definition 1.2, if $|R| = 1$, then the number of choices is within $(1 \pm \log^{-K}(d)) \cdot d$ and, if $|R| \geq 2$, condition (c) gives a stronger codegree bound. Similarly, we can then bound the number of ways to choose $e_2$. By our choice of hyperedge order, $e_2$ intersects $e_1$. Given a choice of $e_1$, there are $O(1)$ ways $e_2$ can intersect it. Defining $b := |e_2 \cap (R \cup e_1)|$ (by assumption on hyperedge order, $b \geq 1$), by conditions (a) and (c) of Definition 1.2 there are at most $d$ choices for $e_2$. When $2 \leq b \leq r-1$, using condition (c) of Definition 1.2 gives a stronger bound of $d^{1-\frac{b-1}{r-1}}\log^{-K}(d)$ choices for $e_2$. Given these bounds, the number of choices for $e_2$ can be thought of as being $O\big(d^{a/(r-1)}\big)$, where $a$ is the number of vertices of $e_2$ that are not in $e_1$ or $R$ (i.e. the number of "new" vertices). We can bound the number of choices for $e_3, \ldots, e_k$ analogously. A more careful version of this argument will be applied later to give the bound in Lemma 2.27.
So, heuristically, for most configurations $X$, up to a $\log^{O(1)}(d)$ factor we generally expect there to be about $d^{\,n/(r-1)}$ copies of $X$ rooted at $S$ in $H(m)$, where $n$ is the number of neutral vertices of $X$. One way of thinking about this is to imagine that each hyperedge contributes a factor of $d$, but for each vertex that is either in the intersection of two hyperedges or not neutral we lose a factor of $d^{\frac{1}{r-1}}$ (up to some powers of $\log(d)$). Alternatively, (again up to some powers of $\log(d)$) we get a factor of $d^{\frac{1}{r-1}}$ for each neutral vertex in the configuration. It will be helpful to bear this rough heuristic in mind throughout the calculations which come later.
We now introduce our most important family of configurations, the Y configurations. These are a generalisation of the two variables $Y^{r-2}_v$ and $Z$ discussed above. The control we have over these more general variables in $H(m)$ dictates the bounds we can prove on $Q(m)$ (see Lemma 4.1) and on the Y configurations in $H(m+1)$. The two further sets of variables we will discuss below (see Definitions 2.25 and 2.29) do affect how the Y configurations behave but, due to the codegree conditions on $H$ (see Definition 1.2), we can ensure that they only contribute lower order terms.
In general, the Y configurations consist of a hyperedge e containing a root and a fixed number of marked vertices, with open hyperedges that are disjoint from one another and only intersect e on their unique unmarked vertex. See Figure 2 for a visualisation of some of these configurations in the case r = 4. See also Figure 3 for some examples of copies of Y configurations in the case r = 6.
Figure 3. Some examples of copies of Y configurations rooted at $v$ in the case $r = 6$. The infected vertices are shaded, the healthy vertices are white and the central hyperedge is drawn with a thick outline. Each copy is labelled by which configuration it is a copy of. Notice that a copy of a configuration could have more infections than marked vertices in the configuration itself, as is demonstrated by the third example.
We now formally define the family of Y configurations.
Definition 2.19. For non-negative integers $i$ and $j$ such that $i + j \leq r - 1$, let $Y^{i,j}$ denote the configuration $(\mathcal{F}, R, D)$ such that $\mathcal{F}$ is a hypergraph containing a hyperedge $e$, called the central hyperedge, such that (a) $e$ contains exactly $i$ marked vertices, (b) there is a unique root and $e$ is the only hyperedge of $\mathcal{F}$ containing the root, (c) $\mathcal{F}$ has exactly $j$ non-central hyperedges, (d) for each non-central hyperedge $e'$ we have $|e \cap e'| = 1$ and the unique element of $e \cap e'$ is neutral, (e) every vertex of $V(\mathcal{F}) \setminus e$ is marked, and (f) no two non-central hyperedges intersect one another.

However, as we will see in a moment, we should be able to prove concentration for $Y^{i,j}_v(m)$ when $i + j = r - 1$. This is why we need to track all of the configurations $Y^{i,j}_v(m)$ individually and cannot just bound them in terms of $\bar{Q}$, where $\bar{Q}$ is an upper bound on $Q_v(m)$ which holds for all $v \in V(H) \setminus I(m)$ with high probability. That is, we would not be able to get a good enough bound on $\bar{Q}$ to prove bounds as tight as we would like on $Y^{i,j}_v(m)$.

Now we discuss how we expect the variables $Y^{i,j}_v(m)$ to behave. Consider first the variable $Y^{i,0}_v(m)$ for $0 \leq i \leq r-2$ and $v \in V(H)$. By properties (a) and (b) of Definition 1.2, every vertex has degree between $(1-o(1))d$ and $d$. Therefore, if $I(m)$ is a uniformly random infection of density $p + qt_m$, then we would expect $Y^{i,0}_v(m)$ to be of order $d \cdot (p + qt_m)^i$. Before stating the bounds we wish to prove on the Y configurations, it is helpful to introduce notation (used throughout the paper) for the trajectories that these variables are expected to follow. We will prove the following.
Lemma 2.24. With high probability the following statement holds.
To track the Y configurations and thus prove Lemma 2.24, we will bound the expectation of the one-step change of $Y^{i,j}_v(m)$ given $H(m)$ and $I(m)$. Let us consider how a new copy of $Y^{i,j}$ rooted at $v$ is created. Let $Y^{i,j} = (\mathcal{F}, R, D)$. For each new copy $Y$ of $Y^{i,j}$ rooted at $v$, there exists some $u \in D$ such that $Y$ is created from a copy of a configuration $X = (\mathcal{F}, R \cup \{u\}, D \setminus \{u\})$, such that the image of $u$ in $H(m)$ is the unique healthy vertex of a hyperedge that gets successfully sampled. So in fact it is created from a copy of a configuration $X' = (\mathcal{F}', R, D')$, where $\mathcal{F}'$ is the union of $\mathcal{F}$ and one new hyperedge $e'$ whose intersection with $\mathcal{F}$ contains some $u \in D$, and $D'$ is the union of $D \setminus \{u\}$ and $e' \setminus \{u\}$. See Figure 4 for some examples of this. Either $e'$ intersects $\mathcal{F}$ on precisely one vertex, or $e'$ intersects $\mathcal{F}$ on several vertices, including vertices of $D$. In the first case, as we will see in Section 6, $X'$ can be expressed as a combination of Y configurations. However, the family of Y configurations does not contain the type of intersections we get in the second case, so we need to control another family of variables which includes those with such intersections. In Section 6.3 the calculation of the expectation given $H(m)$ and $I(m)$ will be presented in full detail; for now we have motivated the introduction of our next family, the secondary configurations. These configurations are so called because they will not usually contribute to the main order term in our calculations, but still need to be controlled. See Figure 5 for some examples of these configurations.

Figure 5. Three examples of secondary configurations $X$ in the case $r = 4$. The marked vertices are shaded, the neutral vertices are white, the root is labelled with an R and the central hyperedge is drawn with a thick outline.
The family of secondary configurations contains configurations with multiple roots and more complicated intersections of hyperedges than the Y configurations. Due to the codegree conditions on H (see Definition 1.2), we are able to prove stronger upper bounds on secondary configurations than on Y configurations. Therefore we do not need to prove that the secondary configurations are concentrated. It will suffice to prove an upper bound that shows that they do not affect the main order term in our calculations for the Y configurations.
This will ensure that any way of creating Y configurations using secondary configurations (the second case above) contributes only lower order terms compared with those given by the first case above. So the dominant behaviour of each Y configuration is dictated purely by other Y configurations. This is one reason why it is important for us to have codegree conditions on $H$. In order to determine how we expect $X_S(m)$ to behave, we require the following lemma, giving an upper bound on $X_S(m)$ when $X$ is a secondary configuration with no marked vertices. This lemma will be proved in the next section and applied throughout the rest of the paper.
We now consider what the expected number of copies of any secondary configuration $X = (\mathcal{F}, R, D)$ in $H(m)$ rooted at $S$ would be, if $I(m)$ were a uniformly random infection with density $p + qt_m$. The case $D = \emptyset$ is covered by Lemma 2.27, so now we consider the case that $D \neq \emptyset$. By Remark 2.26, the configuration $\bar{X} := (\mathcal{F}, R, \emptyset)$ is also a secondary configuration and so the number of its copies is bounded by Lemma 2.27. For any such copy, if $I(m)$ were a uniformly random infection with density $p + qt_m$, then the probability that every vertex of the image of $D$ in this copy is infected would be precisely $(p + qt_m)^{|D|}$. Putting this together, if $I(m)$ were uniform, then we would expect $X_S(m)$ to be at most $(p + qt_m)^{|D|}$ times the bound of Lemma 2.27. Now we formally describe the upper bound that we will prove on $X_S(m)$ for a general secondary configuration $X = (\mathcal{F}, R, D)$ with central hyperedge $e$.
By Definition 2.25, every secondary configuration contains at least one neutral vertex. To bound the maximum change of our variables we will also need to have control over configurations that consist of a single hyperedge with no neutral vertices; so now we will define the final set of variables we wish to control. Of course, the variable $W^1_v(m)$ is the same as $Y^{r-1,0}_v(m)$, which is the same as $Q_v(m)$ if $v \notin I(m)$. For $i = 1$ and $S = \{v\}$, the degree of $v$ in $H$ is at most $d$ and so, if $I(m)$ were a uniformly random infection with density $p + qt_m$, then we would expect $W^1_S(m)$ to be at most $d \cdot (p + qt_m)^{r-1}$. For $i \geq 2$, by condition (c) of Definition 1.2, the number of hyperedges of $H(m)$ containing $S$ is at most $d^{1-\frac{i-1}{r-1}} \log^{-K}(d)$. So, if $I(m)$ were a uniformly random infection with density $p + qt_m$, then we would expect $W^i_S(m)$ to be at most $d^{1-\frac{i-1}{r-1}} \log^{-K}(d) \cdot (p + qt_m)^{r-i}$. Here, we use the fact that $K$ is large and that we are only considering values of $t$ up to $O(\log(d))$. Thus we cannot hope to prove tight concentration bounds for these variables. However, we will prove the following.
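To see why the factor $\log^{-K}(d)$ dominates here, a quick exponent check (using $p + qt = (c + \alpha t)d^{-1/(r-1)}$, and the codegree bound $d^{1-\frac{i-1}{r-1}}\log^{-K}(d)$ on the number of hyperedges containing $S$) shows that the powers of $d$ cancel exactly:

$$d^{1-\frac{i-1}{r-1}} \log^{-K}(d) \cdot (p+qt)^{r-i} = (c+\alpha t)^{r-i}\, d^{\,1-\frac{i-1}{r-1}-\frac{r-i}{r-1}}\, \log^{-K}(d) = (c+\alpha t)^{r-i} \log^{-K}(d),$$

since $1 - \frac{i-1}{r-1} - \frac{r-i}{r-1} = 1 - \frac{r-1}{r-1} = 0$. As $t = O(\log(d))$, the prefactor $(c+\alpha t)^{r-i}$ is only $\log^{O(1)}(d)$, which is swamped by $\log^{-K}(d)$ when $K$ is large.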
The parts of the paper concerning the first phase are structured as follows. In Section 4 we show (in Lemma 4.1) that Lemma 2.14 can be deduced from Proposition 2.6 and Lemmas 2.24 and 2.30. Then we prove the case m = 0 of Lemmas 2.24, 2.28 and 2.30 in Section 5 using a version (Corollary 3.6) of the Kim-Vu Inequality (Theorem 3.5). The case 1 ≤ m ≤ M of Lemmas 2.24, 2.28 and 2.30 is proved in Section 6 using Freedman's Inequality (Theorem 3.10) and the differential equations method. This concludes our discussion of the first phase of the proof of Theorem 1.3.

2.2. The Second Phase. In the second phase of the proof of Theorem 1.3 we define a different process, which involves sampling a large set of open hyperedges in each round rather than sampling one hyperedge at a time as in the first phase. As a slight abuse of notation, in the second phase we let I(0) denote I(M) and H(0) denote H(M); that is, after the first phase, we "restart the clock" from zero before running the second phase process. The second phase process is defined slightly differently depending on whether we are in the subcritical or supercritical case.
For $m \ge 1$, in round m we will sample a set of open hyperedges and use the outcomes to define I(m+1) and H(m+1). Again, we let $\mathcal{Q}(m)$ be the set of hyperedges e of H(m) such that $|e \setminus I(m)| = 1$ (the set of open hyperedges). Analogously to the first phase, for each configuration X = (F, R, D), each $S \subseteq V(H)$ with |S| = |R| and each integer $m \ge 0$, we let $\mathcal{X}_S(m)$ denote the set of copies of X in H(m) rooted at S. As before, we write $X_S(m)$ when referring to $|\mathcal{X}_S(m)|$ and Q(m) when referring to $|\mathcal{Q}(m)|$.
We now define the number of steps for which we will run the processes in the second phase. For $\lambda < 1/8$ depending only on r, c and α (λ is defined in (7.1)):

2.3. The Subcritical Case. First, let us consider the "subcritical case"; i.e. when $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$. Recall that, in this case, we track the first phase process until the number of open hyperedges is bounded above by ζN for some constant ζ chosen small with respect to r, c and α.
The Second Phase Process in the Subcritical Case. We obtain I(m + 1) and H(m + 1) in the following way. For m ≥ 0, sample every hyperedge in Q(m). We let I(m + 1) be the union of I(m) and all of the vertices in hyperedges which were successfully sampled and let H(m + 1) := H(m) \ Q(m).
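As an illustration only (a toy sketch in our own notation, not code from the paper; the name `subcritical_round` and the concrete hypergraph below are ours), one round of this sampling process can be written as:

```python
import random

def subcritical_round(edges, infected, q, rng):
    """One round of the second-phase process in the subcritical case,
    as described above: every open hyperedge (exactly one healthy
    vertex) is sampled independently with probability q; each success
    infects all vertices of that hyperedge, and every open hyperedge
    is then removed from the hypergraph."""
    open_edges = [e for e in edges if len(e - infected) == 1]
    newly_infected = set()
    for e in open_edges:
        if rng.random() < q:
            newly_infected |= e  # vertices of a successfully sampled hyperedge
    remaining = [e for e in edges if e not in open_edges]
    return remaining, infected | newly_infected
```

For example, with q = 1, the 3-uniform toy hypergraph {1,2,3}, {2,3,4}, {3,4,5} and initial infection {1, 2}, three rounds infect every vertex and exhaust the hyperedges.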
Our main result in the subcritical case is the following lemma, which immediately implies Theorem 1.3 in the case $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$.
Lemma 2.32. If $c^{r-2}\alpha < \frac{(r-2)^{r-2}}{(r-1)^{r-1}}$, then, with high probability, (i) $Q(M_2) = 0$, and

The key ingredient in the proof of Lemma 2.32 is the following bound on $\mathbb{E}(Q(m))$: Given such a hypergraph, picking a hyperedge $e' \neq e$ and deleting it gives a hypergraph that may be a copy of a Y configuration (rooted at the healthy vertex of $e \cap e'$), or may not be (because of additional overlaps between the non-central hyperedges). In order to track Q(m) in the second phase, we wish to track these sorts of configurations. The reason that we consider configurations of this type (where the hyperedge e is not present, rather than with it also included) is that we wish to express Q(m+1) in terms of Q(m), and breaking up the configuration this way allows us to do so (see, for example, (2.36) below). This motivates the introduction of another family of configurations: the Z configurations.
If one of the non-central hyperedges intersects e on more than one vertex, or two of them intersect one another, then Z consists of a copy $F'$ of a secondary configuration with one root, $r-1-i$ neutral vertices and a set of at most $j-1$ copies of $W^1$ rooted at vertices of $F'$.
We will see in Lemma 7.9 that the members of $\mathcal{Z}^{i,j}_v(m)$ that come from secondary configurations contribute only a lower-order term. Given the above discussion, conditioned on H(m) and I(m), the expected value of Q(m+1) is bounded above by an explicit expression. The proof of Lemma 2.32 comes down to proving that $Z^{r-2-j,j}_w(m)$ satisfies strong enough upper bounds (with high probability) that evaluating this expression gives the bound in (2.33).
So, to prove Lemma 2.32, it suffices to control $Z^{i,j}_v(m)$. To do this, we must also keep control over $Y^{i,j}_v(m)$, $X_S(m)$ and $W^i_S(m)$, as before. This is achieved via multiple applications of a version of the Kim-Vu Inequality (Corollary 3.6). Full details will be given in Section 7.
The Second Phase Process in the Supercritical Case. We obtain I(m+1) and H(m+1) in the following way. Each round consists of two steps. In the first step we choose some $Q'(m) \subseteq \mathcal{Q}(m)$ and sample every open hyperedge of $Q'(m)$. We let $I_0(m+1)$ be the union of I(m) and the vertices of all of the hyperedges of $Q'(m)$ that were successfully sampled, and we let $H_0(m+1) := H(m) \setminus Q'(m)$. In the second step, for $i \ge 0$, if there exists a healthy vertex v contained in sufficiently many open hyperedges, we sample a set of open hyperedges containing v; call this set $Q^i_v(m+1)$ and define $H_{i+1}(m+1) := H_i(m+1) \setminus Q^i_v(m+1)$. Let $I_{i+1}(m+1)$ be the union of $I_i(m+1)$ and the vertices of all of the hyperedges of $Q^i_v(m+1)$ that were successfully sampled. The second step ends when we reach j such that no healthy vertex of $H_j(m+1)$ is contained in sufficiently many open hyperedges. In the second step, each $Q^i_v(m)$ is large enough that, with high probability, v becomes infected when we sample every hyperedge in $Q^i_v(m)$. The proof of Theorem 1.3 relies on the following bound:

To prove this bound, we use a version of Janson's Inequality for the lower tail (Theorem 3.9). Given (2.37), after at most $O(\log\log N)$ rounds, every healthy vertex has at least $d^{1/(r-1)+1/10}$ open hyperedges containing it. Now, by the Chernoff Bound (Theorem 3.2) and the fact that $N = d^{O(1)}$ (by property (b) of Definition 1.2), with high probability every vertex is infected after only one additional round. Therefore, percolation occurs with high probability. See Section 8 for full details. This concludes our outline of the proof.

Probabilistic Tools
Here, for convenience, we collect together the probabilistic tools we apply throughout the paper. We will also formally define the probability space that we are working in.
3.1. Standard Concentration Inequalities. The following two theorems will be repeatedly applied in the rest of the paper. The first is Markov's Inequality, which is perhaps the simplest concentration inequality in probability theory.
Theorem 3.1 (Markov's Inequality). If X is a non-negative random variable and a > 0, then $\mathbb{P}(X \ge a) \le \mathbb{E}[X]/a$.

The second is a version of the Chernoff Bound taken from [20].
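As a quick illustration (our own toy check, not part of the paper), Markov's inequality can be verified directly on any finite distribution of a non-negative random variable:

```python
def markov_holds(dist, a):
    """Check Markov's inequality P(X >= a) <= E[X] / a.

    `dist` maps each non-negative value x to P(X = x), with the
    probabilities summing to 1. Returns True iff the bound holds
    at level a > 0."""
    ex = sum(x * p for x, p in dist.items())          # E[X]
    tail = sum(p for x, p in dist.items() if x >= a)  # P(X >= a)
    return tail <= ex / a

# e.g. X takes values 0, 1, 10 with probabilities 0.5, 0.3, 0.2, so E[X] = 2.3
dist = {0: 0.5, 1: 0.3, 10: 0.2}
```

The bound is tight exactly when X is supported on {0, a}, which is why Markov's inequality is typically only used as a crude first step.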

Kim-Vu and Janson Inequalities.
A central theme in the study of large deviation inequalities is that if a random variable X depends on a sequence of independent trials in which, for any outcome of the trials, changing the result of a small set of the trials does not influence the value of X too much, then X is often concentrated (see, e.g., [20,32,40] for further discussion). In our case, it is clear that the value of any of the variables that we track at time zero depends on the N independent random trials which determine whether or not each vertex of V (H) is contained in I(0). However, as it turns out, our codegree and neighbourhood similarity conditions (conditions (c) and (d) of Definition 1.2) are not strong enough to obtain good control over the worst case influence of changing a small set of trials. Fortunately for us, there exist a number of different tools for obtaining strong concentration when the worst case influence is rather large, but the typical influence is small. The tool that we use in Section 5 to prove bounds on our variables when m = 0 is a version of the Kim-Vu Inequality [29] due to Vu [39] which is particularly well suited to our situation. Other such tools include large deviation versions of Janson's Inequality (Theorem 3.9), which we apply in Section 8, and the "method of typical bounded differences" developed by Warnke [40]. Before stating the Kim-Vu inequality, we require some definitions.

Definition 3.3. Let V be a finite index set and let f be a multivariate polynomial in variables
where the maximum is taken over all multisets of indices from V with cardinality at least j.

Theorem 3.5 (Vu [39]). There exist positive constants $c_k$ and $C_k$ such that, if X is a random variable as in Definition 3.4 and $E_0 > E_1 > \cdots > E_k = 1$ are positive numbers such that

In order to apply Theorem 3.5 to some random variable of the form $f(\xi_v : v \in V)$, we require that the coefficients of f are in [0, 1]. Therefore, in practice, to apply this theorem to most of our variables we first need to rescale them, then apply the theorem, and then scale back to obtain the bounds we want on the original variable. For this reason, we prove a corollary of Theorem 3.5 which applies the theorem in precisely the form we will use throughout the paper. This should make the later calculations easier to follow.
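The rescaling manoeuvre described above has the following generic shape (a schematic template in our notation, not the statement of Corollary 3.6):

```latex
X = f(\xi_v : v \in V),\ \text{coefficients of } f \text{ in } [0,\tau]
\;\Longrightarrow\;
g := \tau^{-1} f \ \text{has coefficients in } [0,1],
\quad\text{and}\quad
\mathbb{P}\big( |X - \mathbb{E}X| \ge \tau\lambda \big)
 \;=\; \mathbb{P}\big( |g - \mathbb{E}g| \ge \lambda \big),
```

so any tail bound for $g$ obtained from Theorem 3.5 transfers to $X$ after multiplying the deviation by $\tau$.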
nonnegative coefficients and no variable in f has an exponent greater than 1. Let τ and $E_0 \ge \log^{2k+1} N$ be positive numbers such that

Then, for N sufficiently large, with probability at least $1 - N^{-20\sqrt{\log N}}$,

We show that Z is close to its expectation with high probability and use this to obtain bounds on X which hold with high probability. Using the definition of Z, from (i) we obtain that, and from (ii) we obtain that, for $1 \le j \le k$,

In particular, this implies that, when N is sufficiently large, every term of g has a coefficient which is at most one. Indeed, for a monomial $\prod_{u \in A} x_u$ appearing with a non-zero coefficient,

Therefore, we may apply Theorem 3.5 to obtain that

Now rescaling by τ gives that, with probability at least $1 - N^{-20\sqrt{\log N}}$,

We will also use Corollary 3.6 to prove bounds on our variables for the "subcritical" case during the second phase. For the "supercritical" case, we find it more convenient to apply the following lower tail version of Janson's Inequality.

Theorem 3.9 (Janson's Inequality for the Lower Tail [25]). Let G be a hypergraph and, for $p \in (0, 1)$ and $e \in E(G)$, let $I_e$ be the indicator variable for the event $e \subseteq V(G)_p$. Set $X := \sum_{e \in E(G)} I_e$, $\mu := \mathbb{E}[X]$ and $\Delta := \sum \mathbb{E}[I_e I_f]$, where the final sum is over all ordered pairs $(e, f)$ with $e \neq f$ and $e \cap f \neq \emptyset$ (so each pair is counted twice). Then, for any $\varepsilon \in [0, 1]$,
$$\mathbb{P}\big(X \le (1-\varepsilon)\mu\big) \;\le\; \exp\!\Big({-}\frac{\varepsilon^2\mu^2}{2(\mu+\Delta)}\Big).$$

3.3. Martingales and Concentration. In Section 6 we will use standard martingale concentration inequalities to prove bounds on our variables throughout the first phase for m > 0. Recall that a sequence $0 = B(0), B(1), \ldots$ of random variables is said to be a supermartingale with respect to a filtration $\mathcal{F}(0), \mathcal{F}(1), \ldots$ if, for all $m \ge 0$, we have that B(m) is $\mathcal{F}(m)$-measurable and $\mathbb{E}[B(m+1) \mid \mathcal{F}(m)] \le B(m)$. In what follows, when we say that $B(0), B(1), \ldots$ is a supermartingale, it is always with respect to the natural filtration corresponding to our process (which will be formally defined in the next subsection). A sequence

Our main tool in Section 6 is the following concentration inequality of Freedman [22]. Then, for all $a, \nu > 0$,

3.4. The Probability Space.
A natural candidate for the probability space on which to view our process is the following product space, where P(E(H)) is the collection of all subsets of E(H). For any point in Ω, the first N coordinates determine the infection at time zero, the next |E(H)| coordinates list the hyperedges sampled during the first phase process (note, though, that the first phase stops before |E(H)| hyperedges have been sampled), the next |E(H)| coordinates list the sets of hyperedges sampled during the second phase process, and the last |E(H)| coordinates determine which hyperedges of H are contained in $H_q$. One should notice that Ω contains a large number of infeasible points (i.e. points of measure zero); for example, it contains points corresponding to evolutions of the processes in which some hyperedges are sampled more than once, or in which the mth hyperedge sampled in the first phase is not chosen from $\mathcal{Q}(m-1)$, etc. We let $\Omega'$ be the subspace of Ω consisting of only those points which have positive measure.
For $m \ge 0$, let $\mathcal{F}_m$ be the σ-algebra generated by the partitioning of $\Omega'$ in which two points are in the same class if they correspond to evolutions of the processes which have the same initial infection and which are indistinguishable after $\ell$ steps of the first phase process for every $\ell$ in the range $1 \le \ell \le m$. For example, any two points of $\Omega'$ corresponding to evolutions in which the first phase process runs for fewer than m steps are in the same class if and only if they are indistinguishable at every step of the first phase. Similarly, for $m \ge 0$, let $\mathcal{F}'_m$ be the σ-algebra generated by the partitioning of $\Omega'$ in which two points are in the same class if they are indistinguishable at every step of the first phase and, for every $1 \le \ell \le m$, they are indistinguishable after the $\ell$th step of the second phase process. We will work in this probability space throughout the proof without further comment.

Preliminaries
In this section, we will prove four preliminary results. First we prove Proposition 2.6, which gives a bound on the number of infected vertices at each step of the first phase. Then we deduce that to track Q(m), it is enough to have control over the Y and W configurations and the number of infected vertices at time m. After that, we will prove Lemma 2.27, which bounds the number of copies of any secondary configuration (F, R, ∅) in H(m) rooted at S (where |S| = |R|), for any 0 ≤ m ≤ M . At the end of the section, we will prove an analytic lemma that will be used in the application of the differential equations method in Section 6.
We restate here Proposition 2.6 from Section 2, to aid the reader.
Proof. The expected number of vertices which are infected at time zero is $cN d^{-1/(r-1)}$. By the Chernoff Bound (Theorem 3.2) with $\varepsilon = 1/2$, we have that, with probability at least $1 - e^{-\Omega(N d^{-1/(r-1)})}$, there are at most $\frac{3c}{2} N d^{-1/(r-1)}$ vertices infected at time zero. At each step, one hyperedge is sampled and becomes infected with probability q.
As $I(\ell) \le I(M)$ for any $\ell \le M$, this completes the proof.
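The Chernoff step in the proof above can be sanity-checked numerically. Below is a standalone check (toy parameters of our own choosing, not the paper's) of the standard multiplicative Chernoff bound $\mathbb{P}(X \ge (1+\varepsilon)\mu) \le e^{-\varepsilon^2\mu/(2+\varepsilon)}$ for $X \sim \mathrm{Bin}(n, p)$ with mean $\mu = np$, computing the exact tail in log-space to avoid overflow of binomial coefficients:

```python
from math import exp, lgamma, log

def binom_upper_tail(n, p, k):
    """Exact P(Bin(n, p) >= k), summed term by term in log-space
    so that the huge binomial coefficients never overflow a float."""
    total = 0.0
    for i in range(k, n + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_term)
    return total

# Bin(1000, 0.1): mean mu = 100; tail at (1 + 1/2) * mu = 150.
n, p, eps = 1000, 0.1, 0.5
mu = n * p
tail = binom_upper_tail(n, p, int((1 + eps) * mu))
chernoff = exp(-eps**2 * mu / (2 + eps))  # = e^{-10}
```

Here the exact tail is several orders of magnitude below the Chernoff bound, as expected at five standard deviations above the mean.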
As mentioned in Section 2, to prove Lemma 2.14 it is sufficient to prove Lemmas 2.24 and 2.30 and Proposition 2.6. More formally:

Proof. The sum counts the number of ways of choosing such a configuration. Viewing v as the root, such a configuration is a copy of $Y^{0,1}$ rooted at v but not a copy of $Y^{1,1}$. Define $S_2$ to count the number of ways of choosing a hyperedge $e \in H(m)$ containing v and at most one infected vertex, a vertex $w \in e \setminus \{v\}$ such that if $e \cap I(m) \neq \emptyset$ then $w \in I(m)$, and a hyperedge $e' \in \mathcal{W}^1_w(m)$; this counts everything that S counts, so $S \le S_2$. The configurations counted by $S_2$ but not by S are those given by choosing a hyperedge $e \in H(m)$ containing v such that e has a unique infected vertex w, together with a hyperedge $e' \in \mathcal{W}^1_w(m)$. See Figure 8 for an illustration of the difference between what is counted in S and $S_2$ when r = 6.

Figure 8. Here r = 6. Copies of this hypergraph are counted by S only when w is healthy; copies are counted by $S_2$ whether w is infected or not. The (unlabelled) vertices shaded dark grey are infected, the unshaded vertices are healthy.

Applying hypotheses (i) and (ii) to bound
Thus, using the definition of $S_2$ and applying the bounds on $Y^{1,1}_v(m)$ given by the hypotheses of the lemma, we have

Using the bounds on $Y^{1,0}_v(m)$ and (for $w \in V(H) \setminus I(m)$) the bounds on $Q_w(m) = W^1_w(m)$ given by the hypotheses of the lemma, from (4.2) we get

Combining (4.4) and (4.3) gives

We also have by hypothesis that

Putting all this together gives the desired estimate, using that factors of the form $1 \pm x$ can be absorbed into an error of $1 \pm 3x$ for x sufficiently small, and that $I(m) = O(\log N \cdot N d^{-1/(r-1)})$ by hypothesis (iii). The result follows.
We will now present the proof of Lemma 2.27. It may be helpful to first recall the definition of a secondary configuration from Definition 2.25. We restate here the result from Section 2 to aid the reader.
Proof. Let X = (F, R, D) be a secondary configuration. First we see that, for any ordering $e_1, \ldots, e_{|E(\mathcal{F})|}$ of the hyperedges of $\mathcal{F}$, we have that
$$|V(\mathcal{F})| \;=\; |R| \;+\; \sum_{i=1}^{|E(\mathcal{F})|} \big|\, e_i \setminus (R \cup e_1 \cup \cdots \cup e_{i-1}) \,\big| .$$
To see this, count the number of vertices by first counting the vertices of R and then, for each i in turn, counting the vertices of $e_i$ which have not yet been counted. This will be used several times in the calculations below.
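For concreteness, here is a small instance of the count just described, with hypothetical labels of our own (not from the paper): take $r = 3$, roots $R = \{a, b\}$, and hyperedges $e_1 = \{a, b, x\}$, $e_2 = \{x, y, z\}$. Then:

```latex
|V(\mathcal{F})| \;=\; |R| + |e_1 \setminus R| + |e_2 \setminus (R \cup e_1)|
\;=\; 2 + |\{x\}| + |\{y, z\}| \;=\; 5 .
```

Each hyperedge contributes only its vertices that have not been seen before, so no vertex is double-counted.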
Our goal is to bound the number of copies of X in H(m) rooted at a set S ⊆ V (H) of cardinality |R|. By construction, H(m) is a subhypergraph of H(0) = H. Therefore, it suffices to upper bound the number of copies of X in H(0) = H rooted at S. We do this in the way we described in the previous section, by breaking F up into individual hyperedges and bounding the number of ways to choose each one individually, given the previous choices. We will consider a number of different cases.
First, suppose there exists a hyperedge $e_1 \in E(\mathcal{F})$ such that $|e_1 \cap R| \ge 2$. By definition of a secondary configuration, every hyperedge of a secondary configuration contains a neutral vertex, so $e_1 \not\subseteq R$ and $|e_1 \cap R|$ is between 2 and $r - 1$. Thus, by condition (c) of Definition 1.2, we obtain a bound on the number of hyperedges of H intersecting S on exactly $|e_1 \cap R|$ vertices. Note that this bound is already enough to complete the proof in the case $|E(\mathcal{F})| = 1$ since, by definition of a secondary configuration, the unique hyperedge of $\mathcal{F}$ contains at least two roots. So, in what follows, we may assume that $|E(\mathcal{F})| \ge 2$. Let $e_2$ be a hyperedge which intersects $e_1$ (which exists by definition of a secondary configuration) and, if $|E(\mathcal{F})| \ge 3$, let $e_3$ be the remaining hyperedge. The number of copies of X rooted at S is at most the number of ways to choose a hyperedge $f_1$ of H intersecting S on exactly $|e_1 \cap R|$ vertices, a hyperedge $f_2$ intersecting $S \cup f_1$ on exactly $|e_2 \cap (R \cup e_1)|$ vertices and, if $|E(\mathcal{F})| \ge 3$, a hyperedge $f_3$ intersecting $S \cup f_1 \cup f_2$ on exactly $|e_3 \cap (R \cup e_1 \cup e_2)|$ vertices. Using the bound that we have already proven for the number of ways of choosing $f_1$, we get that this is at most the desired bound, and so we are done when there exists some hyperedge $e_1$ such that $|e_1 \cap R| \ge 2$.
So, from now on, we assume that every hyperedge of $\mathcal{F}$ contains at most one root. In particular, by definition of a secondary configuration, the central hyperedge has exactly one root. Now, let $e_1$ denote the central hyperedge and suppose that there is a non-central hyperedge $e_2$ such that $e_2 \subseteq R \cup e_1$. Then, since $e_2$ contains at most one root, we must have that $|e_1 \cap e_2| = r - 1$ and that the unique vertex of $e_2 \setminus e_1$ is a root. The vertex of $e_1 \setminus e_2$ is also a root because, by definition of a secondary configuration, $e_1$ contains a root, and this root cannot be contained in $e_2$ (as every hyperedge contains at most one root). By condition (d) of Definition 1.2, we can bound the number of ways to choose two vertices x, y of S and two hyperedges $f_1$ and $f_2$ realising this pattern. If $e_1$ and $e_2$ are the only two hyperedges of $\mathcal{F}$, then $|V(\mathcal{F})| - |R| = r - 1$ and so this bound is what we wanted to prove. If $|E(\mathcal{F})| \ge 3$, then we can bound the number of ways to choose a third hyperedge to form a copy of $\mathcal{F}$; combining this with the bound on the number of ways to choose the first two hyperedges and applying (4.5) gives the desired bound. So we may assume that every non-central hyperedge contains at least one non-root vertex which is not contained in the central hyperedge. We can now conclude the proof in the case $|E(\mathcal{F})| = 2$. Indeed, let $e_1$ be the central hyperedge and $e_2$ be the non-central hyperedge. We can bound the number of copies of X as follows. By definition of a secondary configuration and the fact that $e_1$ has at most one root, we know that $|e_2 \cap (e_1 \cup R)|$ must be at least two. Also, it is at most $r - 1$ by the result of the previous paragraph. So, by condition (c) of Definition 1.2, we get an upper bound which, by (4.5) and the fact that $|e_1 \cap R| = 1$, is the desired bound. It remains to consider the case $|E(\mathcal{F})| = 3$. Let $e_1$ be the central hyperedge and let $e_2$ and $e_3$ be the other two hyperedges. Suppose that $|R| \ge 2$.
Then, since $e_1$ contains exactly one root, there must be a non-central hyperedge, say $e_2$, such that $e_2 \setminus e_1$ contains a root. We can then bound the number of copies of X accordingly; since $|e_2 \cap (R \cup e_1)|$ is at least two and at most $r - 1$, the resulting bound, together with (4.5) and the fact that $|e_1 \cap R| = 1$, finishes this case. Thus, we may assume that there is exactly one root and that it is contained in $e_1$. By definition of a secondary configuration, this implies that $e_1 \cap e_2 \cap e_3 = \emptyset$ and that $e_2$ intersects $e_3$. In particular, it implies that $|e_3 \cap (e_1 \cup e_2)| \ge 2$. We assume that $e_2$ was chosen to be the non-central hyperedge such that $|e_1 \cap e_2|$ is maximised. As above, we can bound the number of copies of X, and this gives the desired bound by condition (c) of Definition 1.2 unless $|e_2 \cap e_1| = 1$ and $|e_3 \cap (e_1 \cup e_2)| = r$ (since we already know that $|e_3 \cap (e_1 \cup e_2)| \ge 2$). So, we assume that $\mathcal{F}$ satisfies these conditions. By our choice of $e_2$, we also get that $|e_3 \cap e_1| = 1$. The last case to consider is therefore when $e_2$ and $e_3$ each intersect $e_1$ on a single vertex (where these two vertices are distinct) and $|e_2 \cap e_3| = r - 1$. In this case, the number of copies of X is bounded above by the number of ways to choose a hyperedge $f_1$ containing S, choose two vertices x, y of $f_1 \setminus S$ and then choose two hyperedges accordingly, which is what we needed since $|V(\mathcal{F})| - |R| = 2(r - 1)$ in this case. This completes the proof.
In our application of the differential equations method in Section 6, it is often useful for us to approximate certain sums by a related integral. For this, we use the following simple lemma. We remark that a very similar statement is derived in [31] using the same proof.
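The general shape of such a sum-integral comparison (a standard estimate written in our own notation, not necessarily the paper's exact lemma) for a step size $\Delta$ and $t_\ell := \ell\Delta$ is:

```latex
\Big|\, \sum_{\ell=0}^{m-1} g(t_\ell)\,\Delta \;-\; \int_{0}^{t_m} g(t)\,dt \,\Big|
\;\le\; \sum_{\ell=0}^{m-1} \int_{t_\ell}^{t_{\ell+1}} \big| g(t_\ell) - g(t) \big| \, dt
\;\le\; \frac{\Delta\, t_m}{2}\,\sup_{[0,\,t_m]} |g'| ,
```

since each inner integral is at most $\sup|g'| \cdot \Delta^2/2$ and there are $m = t_m/\Delta$ summands.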
Summing up these expressions and applying the triangle inequality, we obtain the desired bound on $|s'(t)|$.

Concentration at Time Zero
Our goal in this section is to prove Lemmas 2.24, 2.28 and 2.30 in the case m = 0. Lemma 2.14 will follow from Lemmas 2.24 and 2.30 and Proposition 2.6 via Lemma 4.1. In fact, we will actually prove the following stronger bounds in order to give ourselves some extra room in the next section.
Lemma 5.2. With probability at least $1 - N^{-10\sqrt{\log(N)}}$, the following statement holds: for every secondary configuration X = (F, R, D) and set $S \subseteq V(H)$, we have

Recall the definitions of $y_{i,0}(t)$ and $y_{i,j}(t)$ from (2.22) and (2.23). We will prove the following.
Note that we get the following concentration result for Q(0) from Lemmas 5.1, 5.3 and Proposition 2.6 via Lemma 4.1.
We will prove Lemmas 5.1, 5.2 and 5.3 by applying Corollary 3.6. Although the Y configurations are arguably the most important, we save the proof of Lemma 5.3 until last; the proofs of the first two lemmas involve simpler applications of Corollary 3.6 and hence provide a gentler introduction to the style of argument we will use throughout the section. We remark that, in the proof of Lemma 5.1, we technically do not need to rescale our random variable, and so could apply Theorem 3.5 directly. However, it is marginally simpler to apply Corollary 3.6, so this is what we shall do.
We will use the following random variables throughout the rest of the section. Given w ∈ V (H), let ξ w be the Bernoulli random variable which is equal to one if and only if w ∈ I(0). Without further ado we present the proofs of Lemmas 5.1, 5.2 and 5.3.
Proof of Lemma 5.1. Let $S \subseteq V(H)$ be a set of cardinality i. If $i = r$, then clearly $W^i_S(0) \le 1$ and so we may assume that $1 \le i \le r - 1$. Observe that $W^i_S(0)$ can be written as
$$f(\xi_w : w \in V(H)) \;:=\; \sum_{e \in E(H)\,:\, S \subseteq e} \ \prod_{w \in e \setminus S} \xi_w .$$
Observe that no variable in f has an exponent greater than 1 and the degree of f is $r - i$. We wish to apply Corollary 3.6 to obtain an upper bound on $W^i_S(0)$ which holds with high probability. In order to do this, we must bound $E_j(W^i_S(0))$ for $0 \le j \le r - i$. For a set $A \subseteq V(H) \setminus S$, by linearity of expectation and independence, $\mathbb{E}[\partial_A f]$ is the number of hyperedges of H containing $S \cup A$ multiplied by $p^{\,r-i-|A|}$. In the case that $|A| = r - i$, this expression is simply equal to 0 or 1 (depending on whether $A \cup S$ is a hyperedge of H or not). Otherwise, by conditions (a) and (c) of Definition 1.2, this expression is o(1) if $|A| + i \ge 2$ and is at most $c^{r-1}$ otherwise (i.e. if $A = \emptyset$ and $i = 1$).
This analysis gives the bounds on $E_j(W^i_S(0))$ that we require; applying Corollary 3.6 then gives that, with probability at least $1 - N^{-20\sqrt{\log N}}$, the desired upper bound on $W^i_S(0)$ holds. Before proving Lemmas 5.2 and 5.3, it is helpful to introduce the following definition and simple claim.
Definition 5.5. Let X = (F, R, D) be a configuration. Let $T_X$ be the collection of all pairs $(F', D')$, with $F'$ a subhypergraph of H and $D' \subseteq V(F')$, such that there exists an isomorphism φ from $\mathcal{F}$ to $F'$ with $\varphi(R) = S$ and $\varphi(D) = D'$.

Therefore, as required.
Observation 5.7. Let $F'$ be a copy of X = (F, R, D) rooted at S in H(0). Let us consider how $F'$ may be counted multiple times by $f(\xi_v : v \in V(H))$. This will happen precisely when there exist two witness triples (defined in the proof above) for $F'$ of the form $(F', \varphi_1, D_1)$ and $(F', \varphi_2, D_2)$ such that $D_1 \neq D_2$ (and so $\varphi_1 \neq \varphi_2$). When $|I(0) \cap F'| = |D|$, there is only one choice for the set $D'$ in a witness triple and so no such pair $(F', \varphi_1, D_1)$ and $(F', \varphi_2, D_2)$ exists. However, if $|I(0) \cap F'| > |D|$, then there may exist subsets $D_1 \neq D_2$ of $F'$ (and isomorphisms $\varphi_1$ and $\varphi_2$) such that $(F', \varphi_1, D_1)$ and $(F', \varphi_2, D_2)$ are both witness triples for X. In this case both $(F', D_1)$ and $(F', D_2)$ are in $T_X$ and $F'$ is counted multiple times by $f(\xi_v : v \in V(H))$.
So the difference between $X_S(0)$ and $f(\xi_v : v \in V(H))$ is at most O(1) times the number of copies of configurations $X' = (F, R, D')$, where $D' := D \cup \{u\}$ for some $u \in V(\mathcal{F}) \setminus (D \cup R)$.
We will now return to proving Lemmas 5.2 and 5.3. In the case $D = \emptyset$, the bound we already have is actually stronger than we need. So, from now on, we assume that $D \neq \emptyset$. By Claim 5.6, letting $T := T_X$, we have that the variable $X_S(0)$ is bounded above by a polynomial f in the variables $\xi_v$. Observe that the degree of f is |D| and no variable in f has an exponent greater than 1.
We wish to apply Corollary 3.6, with a suitable choice of τ and $E_0 := \log^{2|D|+1}(N)$, to obtain an upper bound for $X_S(0)$ which holds with high probability. As above, in order to apply Corollary 3.6, we must bound $E_j(X_S(0))$ for $0 \le j \le |D|$.
If A contains an element of S, then $\partial_A f = 0$. On the other hand, if A is a subset of $V(H) \setminus S$ of cardinality at most |D|, then $\partial_A f$ is again a polynomial of the same form and so, by linearity of expectation and independence, its expectation can be computed term by term. Recalling Remark 2.26, we see that the number of $(F', D') \in T$ with $A \subseteq D'$ is at most a constant factor times the sum of $X'_{S \cup A}(0)$ over all secondary configurations X' with $|V(\mathcal{F})|$ vertices, $|R| + |A|$ roots and zero marked vertices (the constant accounts for the choice of which |A| roots lie in $D'$). So, by Lemma 2.27, we get that the right side of (5.8) is bounded above as required. So, for $0 \le j \le |D|$, we obtain the required bound on $E_j(X_S(0))$. As X is secondary, $|D| \le 3r - 1$. So, for K large with respect to r, the hypotheses of Corollary 3.6 are satisfied. Using this and the fact that $\mathbb{E}(X_S(0)) = o(\tau)$, applying Corollary 3.6 gives the desired bound with probability at least $1 - N^{-20\sqrt{\log(N)}}$. The result follows by taking a union bound over all secondary configurations and choices of S.

5.1.
Proof of Lemma 5.3. First, note that it suffices to consider the case that i and j are not both zero, since $Y^{0,0}_v(0) = \deg(v)$ and so the bounds hold for $Y^{0,0}_v(0)$ by conditions (a) and (b) of Definition 1.2. Thus, from now on, we assume $i + j \ge 1$.
Write the configuration $Y^{i,j}$ as (F, R, D). By Claim 5.6, setting $T := T_{Y^{i,j}}$, we obtain a bounding polynomial f. Note that, by definition of $Y^{i,j}$, f has degree $i + j(r-1)$. Observe that no variable in f has an exponent greater than 1. We will prove the following.
We now show that $\widetilde{Y}^{i,j}_v$ is a good approximation for $Y^{i,j}_v(0)$, and hence that it suffices to prove Proposition 5.9.
and for all $w \in V(H)$,

The proof of Lemma 5.3 follows from Propositions 5.9 and 5.10 and Lemma 5.1 by applying the union bound.
Proof of Proposition 5.10. Fix $v \in V(H)$. By Observation 5.7, it may be the case that $\widetilde{Y}^{i,j}_v$ counts an element $F'$ of $\mathcal{Y}^{i,j}_v(0)$ more than once if it contains more than $i + (r-1)j$ elements of I(0). However, we have the following, from which the claim follows.
Using this, as j < r, from (5.11) we get the required bound. The claim follows.
It remains to prove Proposition 5.9.
Proof of Proposition 5.9. Fix $v \in V(H)$. We wish to apply Corollary 3.6, with a suitable choice of τ and $E_0 := \log^K(d)$, to obtain bounds on $\widetilde{Y}^{i,j}_v$ that hold with high probability. As before, in order to apply Corollary 3.6, we must bound $E_0(\widetilde{Y}^{i,j}_v)$ and, for $1 \le \ell \le i + (r-1)j$, the quantities $E_\ell(\widetilde{Y}^{i,j}_v)$.

Proof. If A contains v, then $\partial_A f = 0$. On the other hand, if A is a subset of $V(H) \setminus \{v\}$ of cardinality at most $i + (r-1)j$, then $\partial_A f$ takes the same form and so, by linearity of expectation and independence, its expectation can be computed term by term. Now let us bound the number of $(F', D') \in T$ with $A = \emptyset$. This is at most the number of ways to: (1) choose a hyperedge $e \in H$ such that $v \in e$. Therefore, applying (5.13) to the case $A = \emptyset$, we have the required bound. If $A \neq \emptyset$, then the number of $(F', D') \in T$ with $A \subseteq D'$ is at most the number of ways to partition A into $j+1$ sets $A_0, \ldots, A_j$ (some of which may be empty) and make the corresponding choices. Combining this with (5.13), we get the required bound. This completes the proof of Claim 5.12.
The degree of f can be crudely bounded by $r^2$. So, for K sufficiently large with respect to r, using the bounds given by Claim 5.12 and applying Corollary 3.6 gives the desired concentration with high probability. This completes the proof of Proposition 5.9.
Proposition 5.9 was the final piece of the proof of Lemma 5.3. This completes the proof of Lemma 5.3, and hence our analysis of the case m = 0 is complete.

The First Phase After Time Zero
In this section, we will use the differential equations method to prove Lemmas 2.24, 2.28 and 2.30 for 1 ≤ m ≤ M , where M is defined in (2.11).
Definition 6.1. For $0 \le m \le M$, let $B_m$ be the event (in $\Omega'$, which was defined in Subsection 3.4) that there exists $0 \le \ell \le m$ such that, for some $v \in V(H)$ or $S \subseteq V(H)$, one of the following four statements fails to hold.
(B.1) For all $0 \le i \le r-2$ and $0 \le j \le r-1-i$:

(B.2) For any secondary configuration X = (F, R, D),

Note that these are precisely the bounds we wish to prove for Lemmas 2.30, 2.28 and 2.24 and Proposition 2.6. One should think of $B_m$ as the event that one of the variables that we are tracking has strayed far from its expected trajectory at or before the mth step.
In this section, we prove the following lemma.
Observe that Lemmas 2.24, 2.28 and 2.30 all follow immediately from Lemma 6.2. By Lemma 4.1, when $\omega \notin B_M$ we obtain the bounds of Lemma 2.14, so Lemma 2.14 is implied as well. Now, given a point $\omega \in B_M$, let J be the first step at which one of the bounds fails. That is, ω corresponds to a trajectory of the process in which at least one of the variables strays far from its expectation at step J but not before. Define the events W, X, Y and Z according to which type of variable strays first. Our goal is to show that the probability of each of the events W, X, Y and Z is small, from which Lemma 6.2 will follow. Obtaining a sufficient bound on P(Z) follows directly from Proposition 2.6. We obtain the following.
Proof. The event Z is contained within the event that there exists some $0 \le \ell \le M$ such that the corresponding bound fails to hold. By Proposition 2.6, the result follows.
We devote the rest of the section to proving that each of W, X and Y occurs with probability at most $N^{-\Omega(\sqrt{\log N})}$. To prove this, we will apply the differential equations method and Theorem 3.10. The sequences of variables that we track are not themselves supermartingales or submartingales, and so we cannot apply Theorem 3.10 to them directly. Instead, we show that the difference between each variable in the sequence and its expected trajectory, plus or minus some appropriate (growing) error function, is bounded above by an η-bounded supermartingale and below by an η-bounded submartingale (in fact, we only need to bound $W^i_S(m)$ and $X_S(m)$ from above). As in many applications of the differential equations method, the trick to verifying that these sequences are indeed η-bounded sub- or supermartingales is to define them in such a way that, if none of our sequences has strayed far from its expected trajectory, then we can use this fact to prove that the properties hold and, otherwise, the properties hold for trivial reasons.
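Schematically, the construction just described takes the following shape (generic notation of our own, with $x(t)$ the predicted trajectory, $\sigma$ a scaling factor and $\mathrm{err}$ the growing error function; the indicator freezes the sequence once some tracked variable has strayed, which is what makes the supermartingale property hold "for trivial reasons" in that case):

```latex
B^{\pm}(m) \;:=\; \sum_{\ell=0}^{m-1} \mathbf{1}\,[\,\omega \notin B_\ell\,]
  \Big( \pm\big( \Delta X(\ell) - \Delta\big(x(t_\ell)\,\sigma\big) \big)
        \;-\; \Delta\,\mathrm{err}(t_\ell) \Big),
\qquad
\mathbb{E}\big[\, B^{\pm}(m+1) - B^{\pm}(m) \,\big|\, \mathcal{F}_m \,\big] \;\le\; 0,
```

where $\Delta$ denotes a one-step increment. Freedman's Inequality is then applied to $B^{\pm}$ rather than to $X$ itself.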
As in Section 5, despite the Y configurations being the most important, we first consider the W configurations, then the X configurations, then the Y configurations. This is because the proofs of the respective lemmas increase in complexity, and it is helpful for the reader to first see a simpler application of Theorem 3.10 before diving into the proof of Lemma 6.29. In the proof of Lemma 6.29 we need to be more careful than for Lemmas 6.19 and 6.6, because we prove that the Y configurations are tightly concentrated, whereas we prove only weak upper bounds on the X and W configurations.

Proof. For $\ell \ge 0$, define $E_\ell$ to be the event that $W^i_S(\ell+1) \ge W^i_S(\ell)$. We remark that it is possible for $W^i_S(\ell)$ to decrease in a step: this will happen if a copy of $W^i$ rooted at S is an open hyperedge which is successfully sampled. However, our choice of martingale will reflect the fact that we are only concerned with proving an upper bound on $W^i_S(\ell)$.
Given an event E, we let 1_E denote the indicator function of E. For ℓ ≥ 0, define A_S(ℓ), where y_{r−2,1}(t) is defined in (2.23) and γ(t) is defined in (2.8). Also, set B_S(m) := ∑_{ℓ=0}^{m−1} A_S(ℓ). Note that, by definition, B_S(0) = 0. Also, if ω ∉ B_{m−1}, then W^i_S(m) can be expressed in terms of B_S(m), the initial value W^i_S(0) and the accumulated trajectory terms. Therefore, to obtain an upper bound on W^i_S(m), it suffices to bound the three quantities on the right side of this expression. Since ε(t_ℓ) = o(1) for all ℓ ≤ M and γ(t) is bounded away from zero (by Remark 2.13), the sum can be bounded above using (2.2). Still assuming ω ∉ B_{m−1}, by the above analysis it follows that the event W(S) is contained within the event that either W^i_S(0) > log^{2r}(d) or B_S(m) > ½ log^{r^3(r−i)}(d) for some 0 ≤ m ≤ M. By Lemma 5.1, the first of these events is unlikely, so to prove Proposition 6.8 it suffices to show that B_S(m) is unlikely to be large. We will show that (6.10) holds. We wish to apply Theorem 3.10 to the sequence B_S(0), …, B_S(M). In order to do this, we must show that B_S(0), …, B_S(M) is an η-bounded supermartingale and we must also bound the sum ∑_{ℓ=0}^{m−1} Var(A_S(ℓ) | F_ℓ). Claim 6.11. B_S(0), …, B_S(M) is a supermartingale.
Proof. This is equivalent to showing that, for 0 ≤ ℓ ≤ M − 1, the expectation of A_S(ℓ) given F_ℓ is non-positive. For ω ∈ B_ℓ we have A_S(ℓ) = 0, and so it suffices to consider ω ∉ B_ℓ. That is, we can assume that none of the variables that we track has strayed at or before time ℓ. The only hyperedges e which can be counted by W^i_S(ℓ+1) but not by W^i_S(ℓ) are those which contain S and have the property that there is a unique vertex x ∈ e \ S such that x ∉ I(ℓ). Moreover, such a hyperedge e contributes to W^i_S(ℓ+1) − W^i_S(ℓ) if and only if an open hyperedge e* ≠ e containing x is successfully sampled at the (ℓ+1)th step. Let T be the set of all such pairs (e, e*). As the probability that a particular open hyperedge is successfully sampled is q/Q(ℓ) and as ω ∉ B_ℓ, using (6.3) we obtain a bound in terms of γ(t), which is defined in (2.8). Therefore it suffices to bound |T|.
If i ≥ 2, then e is a copy of a secondary configuration (F, R, D) where F is a single hyperedge, |R| = i and |D| = r − i − 1. Then, since ω ∉ B_ℓ, we can use (B.2) to bound the number of such e and (B.3) to bound the number of choices for e*, for K chosen large with respect to r. Now suppose that i = 1 and let v be the unique element of S. The number of such pairs (e, e*) with |e ∩ e*| = 1 is precisely Y^{r−2,1}_v(ℓ) which, since ω ∉ B_ℓ, is at most (1 + ε(t_ℓ)) y_{r−2,1}(t_ℓ) d^{1/(r−1)}. When |e ∩ e*| ≥ 2, we have that e ∪ e* is a copy of a secondary configuration with a single neutral vertex. So, as ω ∉ B_ℓ, by (B.2) the number of pairs (e, e*) with |e ∩ e*| ≥ 2 is at most d^{1/(r−1)} log^{−K/2}(d), as above. Therefore, when i = 1, we have

(6.14)  |T| ≤ (1 + ε(t_ℓ)) y_{r−2,1}(t_ℓ) d^{1/(r−1)} + d^{1/(r−1)} log^{−K/2}(d).
Putting together (6.12), (6.13) and (6.14) gives (6.15), which implies that the expectation of A_S(ℓ) given F_ℓ is non-positive, as desired. This completes the proof of the claim. Proof. First we bound the maximum value of |A_S(ℓ)|. Again, we can assume that ω ∉ B_ℓ as, otherwise, |A_S(ℓ)| is simply equal to zero. By the definition of A_S(ℓ), the minimum possible value of A_S(ℓ) is −a = −o(1).

Now we bound the maximum possible value of W^i_S(ℓ+1) − W^i_S(ℓ). The only way that this quantity can be positive is if some vertex, say x, becomes infected in the (ℓ+1)th step. Given that x becomes infected, the maximum value that W^i_S(ℓ+1) − W^i_S(ℓ) can achieve is precisely W^{i+1}_{S∪{x}}(ℓ). This is at most log^{r^3(r−i−1)}(d) by (B.3), since ω ∉ B_ℓ. So |A_S(ℓ)| ≤ η for 0 ≤ ℓ ≤ M and B_S(0), …, B_S(M) is η-bounded, as required.
Proof. When ω ∈ B_ℓ, we have that Var(A_S(ℓ) | F_ℓ) = 0. So now consider ω ∉ B_ℓ. Since, for a constant c and any random variable X, we have Var(X − c) = Var(X) ≤ E[X²], by the definition of A_S(ℓ) we can bound Var(A_S(ℓ) | F_ℓ). The right-hand side of (6.18) can then be rewritten, as it is precisely the expected value of 1_{E_ℓ}(W^i_S(ℓ+1) − W^i_S(ℓ)) given F_ℓ. So, by (6.15), the definition of η, γ(t) and y_{r−2,1}(t), and the fact that t_ℓ = O(log(d)), we obtain a bound on each conditional variance; summing over ℓ, since m = O(N log(d)), bounds ∑_{ℓ=0}^{m−1} Var(A_S(ℓ) | F_ℓ) as required. This completes the proof of the claim.
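The tail bound that Theorem 3.10 supplies is, schematically, of Freedman type; with η the boundedness parameter, ν a bound on the accumulated conditional variances and a the target deviation (the precise hypotheses and constants of Theorem 3.10 may differ), it reads:

```latex
% Freedman-type inequality for an eta-bounded supermartingale B(0),...,B(M)
% with increments A(l) and B(0) = 0.
\mathbb{P}\left( \exists\, m \le M :\; B(m) \ge a \ \text{and}\
\sum_{\ell=0}^{m-1} \operatorname{Var}\!\left(A(\ell) \mid \mathcal{F}_\ell\right) \le \nu \right)
\;\le\; \exp\!\left( - \frac{a^{2}}{2(\nu + \eta a)} \right).
```

With choices of ν and a as in the surrounding proofs, failure probabilities of the form N^{−Ω(√log N)} follow.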
Hence Lemma 6.6 is proved. We now turn to the X configurations; the following proposition will imply Lemma 6.19, via an application of the union bound over all choices of X and S. It follows that the event X(X, S) is contained within the event that either X_S(0) is too large, or B_S(m) is large for some 0 ≤ m ≤ M. By Lemma 5.2, the former is unlikely, so to prove Proposition 6.21 it suffices to show that B_S(m) is unlikely to be large. We will apply Theorem 3.10. In order to apply Theorem 3.10, we must show that B_S(0), …, B_S(M) is an η-bounded supermartingale and bound the accumulated conditional variance. Proof. This is equivalent to showing that, for 0 ≤ ℓ ≤ M − 1, the expectation of A_S(ℓ) given F_ℓ is non-positive. For ω ∈ B_ℓ we have A_S(ℓ) = 0, and so it suffices to consider ω ∉ B_ℓ. For each u ∈ D, let X_u denote the configuration (F, R, D \ {u}). By Remark 2.26, X_u is a secondary configuration. Every element of X_S(ℓ+1) \ X_S(ℓ) comes from an element of X^u_S(ℓ), for some u ∈ D, and an open hyperedge e containing the image of u such that e is successfully sampled. As ω ∉ B_ℓ, using (6.3) and the fact that γ(t_ℓ) is bounded below by a function of r, c and α (see Remark 2.13), the probability that any particular open hyperedge is successfully sampled is q/Q(ℓ). Also, since ω ∉ B_ℓ, we can apply (B.2) to X^u_S(ℓ) and (B.3) to the number of open hyperedges containing the image of u to bound the expected number of copies of X created in the (ℓ+1)th step, and therefore the expectation of A_S(ℓ) given F_ℓ is negative. This proves that the sequence B_S(0), …, B_S(M) is a supermartingale. Proof. We first bound the maximum value of |A_S(ℓ)|. Again, assume ω ∉ B_ℓ. By (2.1) and the definition of A_S(ℓ), we can bound the minimum possible value of A_S(ℓ). For u ∈ D, define X̃_u to be the configuration (F, R ∪ {u}, D \ {u}). By Remark 2.26 this configuration is secondary. The value of A_S(ℓ) can only be positive if some vertex, say x, becomes infected in the (ℓ+1)th step. Given that x becomes infected, the number of copies of X rooted at S created is at most ∑_{u∈D} X̃^u_{S∪{x}}(ℓ). Since ω ∉ B_ℓ, by (B.2) this is at most η. So we have |A_S(ℓ)| ≤ η for 0 ≤ ℓ ≤ M, as required.
Proof. When ω ∈ B_ℓ, we have that Var(A_S(ℓ) | F_ℓ) = 0. So now consider ω ∉ B_ℓ. As in the proof of Claim 6.17, we can bound Var(A_S(ℓ) | F_ℓ) by the expectation of the squared increment. So, by (6.25) and the definition of η, we obtain a bound on Var(A_S(ℓ) | F_ℓ), and hence on ∑_{ℓ=0}^{m−1} Var(A_S(ℓ) | F_ℓ). Set ν as in (6.28) and a := ½ log^{2|D|r^4}(d) · d. Applying Theorem 3.10 shows that B_S(m) is unlikely to be large, as required for (6.23). This completes the proof of Proposition 6.21.
Therefore, the proof of Lemma 6.19 is concluded.

Tracking the Y Configurations.
Much of the analysis in this subsection is similar to the previous two; however, we must be more careful, since we are aiming for tight concentration bounds rather than crude upper bounds. As we require both upper and lower bounds, we will be dealing with both a supermartingale and a submartingale. We wish to bound the probability that (B.1) is violated at the first time at which any of (B.1), (B.2), (B.3) and (B.4) is violated. We will prove the following.
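As a purely illustrative toy (not the paper's process), the kind of two-sided control sought here can be seen on a simple chain: mark one uniformly random item of [N] per step, track the number X(m) of unmarked items against its exact trajectory N(1 − 1/N)^m, and check that the deviation stays well inside an error envelope of order √(N log N).

```python
import math
import random

def max_deviation(N, steps, seed):
    """Run the toy chain and return the largest deviation of X(m) from
    its exact expected trajectory N * (1 - 1/N)**m over all steps m."""
    rng = random.Random(seed)
    marked = [False] * N
    x = N  # number of unmarked items
    worst = 0.0
    for m in range(1, steps + 1):
        v = rng.randrange(N)
        if not marked[v]:
            marked[v] = True
            x -= 1
        worst = max(worst, abs(x - N * (1 - 1 / N) ** m))
    return worst

N = 2000
dev = max_deviation(N, 3 * N, seed=7)
# The trajectory is tracked to within O(sqrt(N log N)), mirroring the
# growing error envelopes used for the Y configurations.
assert dev <= 6 * math.sqrt(N * math.log(N))
```

The envelope here is generous; the point is only that the tracked variable hugs its deterministic trajectory on the whole time interval, which is what the pair of sub- and supermartingales certifies in the proof.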
to be the set of all ω ∈ B_M such that the bound is violated at time m = J(ω), where J(ω) is defined in (6.4) and y_{i,j}(t) is defined in (2.23).
Observe that the bound (6.30) is precisely the bound (B.1) for our fixed choices of i, j and v.

Note that the event under consideration decomposes accordingly. Thus the following proposition will imply the lemma, via an application of the union bound over all choices of i, j and v.
The proof of the proposition relies on two claims (Claim 6.33 and Claim 6.34), which will be stated where they are needed, once the relevant variables have been defined. They will be proved after the proof of the proposition, which assumes them.
Proof of Proposition 6.31. For ℓ ≥ 0, define A^+_v(ℓ) and A^−_v(ℓ), where y_{i,j}(t) is defined in (2.23). We clarify that we are defining two different random variables A^+_v(ℓ) and A^−_v(ℓ); the superscript denotes whether we expect the variable to be typically positive or typically negative. Given this definition, we define the corresponding pair of partial-sum random variables B^+_v(m) and B^−_v(m). Thus, if ω ∉ B_{m−1}, then, by definition, Y^{i,j}_v(m) can be expressed in terms of these quantities. Our strategy is to obtain a concentration result for Y^{i,j}_v(m) by analysing each of the terms on the right side of this expression.
The proposition will follow from the next two claims.
then the following bounds hold. Indeed, by Claim 6.33, the event Y(i, j, v) is contained within the event that either the initial estimate fails or one of the following bounds holds. Combining this with Claim 6.34 completes the proof of the proposition.
We now prove Claims 6.33 and 6.34.
Proof of Claim 6.33. We will analyse each term on the right-hand side of (6.32). By Lemma 4.6 with s(t) = y_{i,j}(t), we obtain the first bound, as m = O(N log(d)).
Next, we analyse the second summation in the parentheses of (6.32). First, recall the definition of ε(t) (given in (2.10)). The function y_{i,j}(t) is a polynomial with positive leading coefficient and, by Remark 2.13, y_{i,j}(t) is bounded away from zero by a function of r, c and α for all t ∈ [0, T]. Therefore, there exists a positive constant C = C(r, c, α) such that y_{i,j}(t_ℓ) · (t_ℓ + 1)^{K/10} ≤ C · y_{i,j}(t_ℓ) · (t_ℓ + 1)^{(K/10)−1}.
Combining this with (6.36), and then applying Lemma 4.6 with K chosen sufficiently large with respect to r, we arrive at an inequality whose right side features an integral. Consider this integral. By the definition of y_{i,j}(t) (given in (2.22) and (2.23)) and the definition of γ(t) in (2.8), letting C := (r−1 choose i) · (r−1−i choose j) · (max{1, c, α})^{r^3}, the integral is at most C̃ · y_{i,j}(t_m) · (t_m + 1)^{K/10} for some positive constant C̃ depending on r, c, α and K. Moreover, by choosing K large with respect to r, c and α, we may take C̃ arbitrarily close to zero. So, provided that K is large enough, this contribution is negligible. Now let us combine (6.32), (6.35) and (6.37) with our hypothesis to give an upper bound for Y^{i,j}_v(m). By the definition of ε(t) in (2.10), we can absorb the error term log^{−3K/10}(d) · y_{i,j}(0) · d^{1−i/(r−1)} into the main error term. This gives the claimed upper bound; similarly, we get the lower bound, as required.
It remains to prove Claim 6.34.
Proof of Claim 6.34. To prove these bounds, we show that each sequence is a sub- or supermartingale satisfying certain maximum-change and variance-increment bounds, and apply Theorem 3.10. This amounts to bounding the expectation, the maximum/minimum possible values and the variance of A^±_v(ℓ). As in the previous two subsections, we will always assume ω ∉ B_ℓ; otherwise, we have A^±_v(ℓ) = 0 and so all of the required bounds hold trivially. Recall the definition of y_{i,j}(t) from (2.22) and (2.23) and the definition of γ(t) from (2.8). The expression obtained by differentiating y_{i,j}(t) will be useful in what follows. Proof of Subclaim 6.39. First, we bound the expectation of Y^{i,j}_v(ℓ+1) − Y^{i,j}_v(ℓ) given F_ℓ. To do this we must consider copies of Y^{i,j} rooted at v that are created by successfully sampling an open hyperedge at the (ℓ+1)th step, and also copies that are destroyed. Let C_{ℓ+1} be the number of copies of Y^{i,j} rooted at v which are present in the (ℓ+1)th step but not in the ℓth step. Any such copy must come from either:
(1) an element F of Y^{i−1,j+1}_v(ℓ) such that a healthy vertex of F, say x, contained only in the central hyperedge becomes infected in the (ℓ+1)th step, or
(2) an element F of Y^{i,j−1}_v(ℓ) and a copy G of Y^{r−2,0} in H(ℓ) where:
• G contains precisely r − 2 infected vertices,
• G is rooted at some non-root u ∈ F, where u is contained only in the central hyperedge of F, and
• the unique healthy non-root x of G becomes infected at the (ℓ+1)th step.
See Figure 9 for examples of (1) and Figure 10 for examples of (2) (both in the case r = 6). See also Figure 4 for other examples of these (u there plays the role of x here). In both (1) and (2), the probability that a given open hyperedge is successfully sampled is q/Q(ℓ). Let us count the number of ways that we can create a copy of Y^{i,j} rooted at v that is not present in the ℓth step via (1). We will break this into two cases: if e* is the hyperedge that is successfully sampled at the (ℓ+1)th step, then either |e* ∩ F| = 1 or |e* ∩ F| > 1. Consider the first case.
We wish to count the number of ways to choose such an F and e*. The number of ways to create a member of Y^{i,j}_v(ℓ+1) via (1) when |e* ∩ F| = 1 is therefore at most (j+1) Y^{i−1,j+1}_v(ℓ). However, this also overcounts, (j+1) times, the number of copies of Y^{i−1,j+1} rooted at v that contain at least i infected vertices in their central hyperedge, so we need to subtract off (j+1) times the number of such copies. Let us now crudely bound these. Such a copy consists of a member F of Y^{i,0}_v(ℓ) and j + 1 copies of W^1. So the number of ways to create a member of Y^{i,j}_v(ℓ+1) via (1) when |e* ∩ F| = 1 lies within an explicit error of (j+1) Y^{i−1,j+1}_v(ℓ). Now consider the case |e* ∩ F| > 1. As we will see, this gives a lower-order term. In this case, e* ∪ F is a member of a family of secondary configurations. So we can bound the number of ways of creating a member of Y^{i,j}_v(ℓ+1) via (1) in the second case by the sum of X_v(ℓ) multiplied by log^{O(1)}(d) (for the copies of W^1), over all secondary configurations X satisfying (a).
Since ω ∉ B_ℓ and |V(F*)| − |R*| − |D*| = r − i for each secondary configuration X satisfying (a), by (B.2) this is a lower-order term. Putting (6.41) and (6.42) together with (6.40) bounds the expected number of copies of Y^{i,j} rooted at v created via (1), for x sufficiently small. Counting the number of ways to create a copy of Y^{i,j} rooted at v via (2) (recall that Figure 10 provides examples of this) is equivalent to counting triples (F′, u, G′). As ω ∉ B_ℓ, by (B.1) this count lies in the desired range. Let C(F′, u, G′) be the number of ways to pick such an F′ and u, multiplied by the number of ways to pick some G = G′ ∪ {e*} ∈ Y^{r−2,1}_v(ℓ). So, as ω ∉ B_ℓ, by (B.1) and (6.44) we can estimate C(F′, u, G′). The quantity C(F′, u, G′) both fails to count some triples we wish to count and also counts some triples we do not want to count. However, we will show that both of these contributions are of lower order, so, up to an error term, the number of ways to create a copy of Y^{i,j} rooted at v via (2) is C(F′, u, G′). See Figure 11 for examples of the triples we fail to count and Figure 12 for examples of the triples we count unwantedly.
The triples which C(F′, u, G′) fails to count are those where e* intersects G′ in more than one vertex. However, for such a pair G′ and e*, we have that G′ ∪ e* is a copy in H(ℓ) of a secondary configuration X rooted at u with precisely one neutral vertex, so, as ω ∉ B_ℓ, (B.2) applies.
(Figure 12: examples of triples counted unwantedly by C(F′, u, G′). In each, |F′ ∩ G′| ≥ 2 and F′ ∪ G′ consists of a copy X of a secondary configuration (whose edges are drawn with a thick outline) and some additional copies of W^1 rooted at healthy vertices of X; the central hyperedge of X is the hyperedge containing v. In any of these cases, if the shaded hyperedge were successfully sampled, it would not turn F′ ∪ G′ into a copy of Y^{3,2}.)
As ω ∉ B_ℓ, we have X_u(ℓ) ≤ d^{1/(r−1)} log^{−2K/5}(d). Summing over all such configurations X, and using (6.44), bounds the number of triples that are not counted by C(F′, u, G′). Now consider the triples which C(F′, u, G′) counts that we do not want (Figure 12 provides examples of these). The triples we wish to exclude are precisely those where |F′ ∩ G′| ≥ 2. Here G′ could intersect either the central hyperedge or a non-central hyperedge of F′ (or both). However, for any such pair, F′ ∪ G′ ∪ e* consists of:
(a) a copy in H(ℓ) of a secondary configuration X with a unique root (v) and exactly i marked (so at least i infected) vertices in the central hyperedge, and
(b) at most j copies of W^1 (one is e* and the others are the non-central hyperedges of F′).
(The reason we have at most j is that the secondary configuration could contain 2 or 3 hyperedges.) So, similarly to the second case of (1) above, since ω ∉ B_ℓ and |V(F*)| − |R*| − |D*| = r − (i+1) for each secondary configuration X = (F*, R*, D*) satisfying (a), by (B.2) and (B.3) the number of triples we wish to exclude is of lower order. So, putting (6.45), (6.46) and (6.47) together, the number of ways to choose F′, u, G′ and e* is controlled, and combining (6.48) and (6.40) bounds the expected number of copies of Y^{i,j} rooted at v created in the second way. Therefore, by (6.43) and (6.49), we obtain (6.50). By Definition 2.18, we see that the only way in which a copy of Y^{i,j} rooted at v can be destroyed in the (ℓ+1)th step is if one of its hyperedges is sampled. Let D_{ℓ+1} be the number of such destroyed copies. As discussed above, since ω ∉ B_ℓ, all but at most log^{O(1)}(d) · d^{1−(i+1)/(r−1)} copies of Y^{i,j} in H(ℓ) rooted at v have exactly i infected vertices in the central hyperedge. A copy whose central hyperedge is open is destroyed with probability (j+1)q/Q(ℓ). However, for the vast majority of copies of Y^{i,j} rooted at v, the central hyperedge is not open; using (6.40), the probability that any such copy is destroyed can be computed precisely, and this bounds the expectation of D_{ℓ+1}. As ω ∉ B_ℓ, by (B.1), by linearity of expectation, (6.50) and (6.51), we obtain the required estimate for the expected increment, by (6.38). Therefore the expectation of A^+_v(ℓ) given F_ℓ is positive and the expectation of A^−_v(ℓ) given F_ℓ is negative; the corresponding sequences are a submartingale and a supermartingale, as required for Subclaim 6.39. Now, let us bound the maximum and minimum possible values of A^±_v(ℓ).
Proof. The maximum value of Y^{i,j}_v(ℓ+1) − Y^{i,j}_v(ℓ) occurs when the (ℓ+1)th sampling is successful and the newly infected vertex creates many new copies of Y^{i,j} rooted at v in either the first or the second way (see above). For each u ∈ D, let Ỹ_u denote the configuration (F, R ∪ {u}, D \ {u}). If some vertex, say x, becomes infected in the (ℓ+1)th step, the number of copies of Y^{i,j} created is at most ∑_{u∈D} Ỹ^u_{S∪{x}}(ℓ). Each configuration Ỹ_u consists of, firstly, a copy F of a secondary configuration with (r − 1 − i) neutral vertices and, secondly, some additional copies of W^1 rooted at vertices of F. So, as ω ∉ B_ℓ, the maximum number of copies of Y^{i,j} rooted at v that can be created in a time step is bounded. As mentioned above, a copy of Y^{i,j} rooted at v can only be destroyed in the (ℓ+1)th step if it contains an open hyperedge which is sampled. Given that some vertex x becomes infected, the maximum number of copies that can be destroyed is at most the number of hyperedges containing {v, x} and i (other) infected vertices, multiplied by log^{O(1)}(d) (for the other hyperedges, which are copies of W^1). A hyperedge containing {v, x} and i infected vertices is either a copy in H(ℓ) of W^2 rooted at {v, x} (when i = r − 2) or a copy in H(ℓ) of a secondary configuration X = (F, R, D) where F is a single hyperedge, |R| = 2 and |D| = i (when i < r − 2). So, as ω ∉ B_ℓ, by (B.2) and (B.3), the maximum number of copies that can be destroyed in a time step is bounded. As t_ℓ = O(log(d)), N ≥ d^{1/(r−1)} log^{bK}(d) and K is chosen to be large with respect to r, c and α, we obtain |A^±_v(ℓ)| ≤ η. Now we bound the variance of A^±_v(ℓ) given F_ℓ.
Proof. The calculation follows similarly to the calculations for Claim 6.17 and Claim 6.27.
When ω ∈ B_ℓ, we have Var(A^±_v(ℓ) | F_ℓ) = 0. So now consider ω ∉ B_ℓ. In the proof of Subclaim 6.53 we bounded the maximum value of the increment and, as in the proof of Subclaim 6.39, we can bound its expectation. By (6.50), (6.51) and the definition of η, the conditional variance is controlled, so the sum of Var(A^±_v(ℓ) | F_ℓ) over 0 ≤ ℓ ≤ M is at most d^{2−2i/(r−1)} log^{−9K/20}(d), as required. Set ν accordingly and set a := ½ ε(t_m) y_{i,j}(t_m) d^{1−i/(r−1)}. Theorem 3.10 then implies the required bound. This completes the proof of Claim 6.34.
This claim was the final piece in the proof of Lemma 6.29. Proving Lemma 6.29 completes the proof of Lemma 6.2 and concludes our discussion of the first phase.

The Second Phase in the Subcritical Case
In this section we will complete the proof of Theorem 1.3 in the subcritical case. It may be helpful to recall the definition of the processes we run in the second phase from Subsection 2.2 and the definition of F_m from Subsection 3.4. Recall that in the second phase we "restart the clock" from time zero, letting H(0) := H(M) and I(0) := I(M).
We begin by presenting some technical definitions of various constants that will be used throughout this section. These constants are required to state our main lemmas and are useful to simplify future calculations.
First we present some further analysis of the function γ(t) (defined in (2.8)). Recall that t_min was chosen to be the unique positive root of γ′(t) and that T_0 < t_min < T_1, where T_0, T_1 are the only two positive roots of γ(t). As γ(0) > 0, γ(T_0) = 0 and T_0 < t_min, the function γ(t) is strictly decreasing in the interval [0, t_min] and so γ′(T_0) < 0. Therefore α(r−1)(c + αT_0)^{r−2} − 1 < 0, which implies that there exists a constant 0 < λ < 1/8 such that the corresponding bound holds. Recall from (2.11) and (2.12) that we run the first-phase process until time T, where T := (1/N) min{m ≥ 0 : (1 + 4ε(t_m))γ(t_m) < ζ} and ζ will be chosen in a moment (in Definition 7.3). Using the fact that T < T_0 and that (r−1)(c + αt)^{r−2} is increasing for positive t, we obtain the following from (7.1). We are now ready to define ζ and some other useful constants.
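The derivative identity quoted above suggests γ′(t) = α(r−1)(c+αt)^{r−2} − 1. Purely for illustration (this is an assumption, not the paper's exact definition in (2.8)), take γ(t) = (c+αt)^{r−1} − t, which has this derivative; a short numerical sketch then exhibits the configuration T_0 < t_min < T_1 with γ′(T_0) < 0:

```python
# Illustrative only: gamma(t) = (c + a*t)**(r-1) - t is an assumed stand-in
# whose derivative matches the expression alpha*(r-1)*(c+alpha*t)**(r-2) - 1
# quoted in the text; the paper's gamma(t) in (2.8) may differ.
r, c, a = 3, 0.3, 0.4

def gamma(t):
    return (c + a * t) ** (r - 1) - t

def dgamma(t):
    return a * (r - 1) * (c + a * t) ** (r - 2) - 1

def bisect(f, lo, hi, iters=200):
    """Find a root of f in [lo, hi] by bisection (f must change sign)."""
    assert f(lo) * f(hi) < 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

t_min = bisect(dgamma, 0, 10)   # unique positive root of gamma'
T0 = bisect(gamma, 0, t_min)    # first positive root of gamma
T1 = bisect(gamma, t_min, 50)   # second positive root of gamma

assert gamma(0) > 0 and T0 < t_min < T1 and dgamma(T0) < 0
```

With these toy parameters, γ starts positive, decreases through T_0, bottoms out at t_min and climbs back through T_1, which is exactly the qualitative picture used in the argument above.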
Throughout the section we will also use the following function. We have now completed the tedious technicalities and are able to present the real meat of the section. Let us briefly outline how the proof will proceed. For each time m, we will define A_m (similarly to how we defined B_m for the first phase) to be the event that some bound on the number of copies of a particular configuration in H(m) fails to hold. A_0 will be formally defined in Definition 7.13 and, for 0 < m ≤ M_2, A_m will be defined in Definition 7.14.
We will use the following lemma. (Recall Definition 2.34 for the definition of Z^{i,j}.) When ω ∉ A_m and K is sufficiently large, the following statements hold for all S and v contained in V(H), including: (L.4) for every secondary configuration X = (F, R, D), the corresponding count is bounded. We will also prove that with high probability A_{M_2} does not occur.
As we will see in Definition 7.14, the event A_{M_2} contains the events A_m for 0 ≤ m < M_2. Thus Lemma 7.10 implies that, with high probability, for all S and v contained in V(H) and all 0 ≤ m ≤ M_2, the bounds (L.1)-(L.5) hold.
We now derive Lemma 2.32 from Lemmas 7.10 and 7.9, which implies Theorem 1.3 in the subcritical case. After this, we will focus on the proof of Lemmas 7.10 and 7.9.
Proof of Lemma 2.32. Our goal is to show that, with probability 1 − o(1), we have (i) Q(M_2) = ∅ and (ii) there are o(N) infected vertices. From this, it follows easily that the probability of percolation is at most ε.
For (i), by Markov's Inequality (Theorem 3.1), it suffices to show that the right side of the resulting bound is o(1). This is implied by combining our choice of M_2 (see (2.31)) with the following claim.
Proof. The proof proceeds by induction on m. First consider the base case m = 0. By the definition of T, Lemma 6.2 and Lemma 4.1, we have Q(0) ≤ ζ·N with probability at least 1 − N^{−2√log(N)}. As H has maximum degree d, we have Q(0) ≤ N·d, and so the base case holds. Now suppose that the statement of the claim holds for all 0 ≤ ℓ ≤ m; we wish to show it holds for m + 1. If Q(m) = ∅ we are done, so we may assume that Q(m) ≠ ∅. Recall that in each round of the second phase, every open hyperedge is sampled. Thus, Q(m+1) ∩ Q(m) = ∅. This implies that, for each e ∈ Q(m+1), there must be at least one vertex x of e such that x ∈ I(m+1) \ I(m); that is, x ∉ I(m) and there is a hyperedge e* ∈ Q(m) containing x which was successfully sampled in the mth step. Therefore, conditioned on F_m, the expectation of Q(m+1) is at most the product of
• the number of ways to choose a hyperedge e* ∈ Q(m) containing a vertex x ∉ I(m), and
• the sum of Z^{r−2−j,j}_x(m) q^{j+1} over all 0 ≤ j ≤ r − 2.
When ω ∉ A_m (and K is sufficiently large), by Lemma 7.9, for every w ∈ V(H) and 1 ≤ j ≤ r − 2 we have the following two bounds: the first using (L.5), and the second by (7.5) and (L.1). So, when ω ∉ A_m, we obtain the required bound on the conditional expectation of Q(m+1). As H has maximum degree d, we have Q(m) ≤ N·d. So, letting 1_E be the indicator function of the event E occurring, using Lemma 7.10 to bound P(A_m), by (7.12) and the law of iterated expectation, and then applying our induction hypothesis, we obtain the required bound. This completes the proof of the claim.
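The induction above drives E[Q(m)] down geometrically: each open hyperedge of Q(m) spawns, in expectation, strictly fewer than one open hyperedge of Q(m+1). As an illustrative toy (not the paper's process), a subcritical branching recursion with mean offspring below 1 dies out in exactly this way:

```python
import random

def run_subcritical(q0, mean_offspring, max_rounds, seed):
    """Toy stand-in for Q(m): each open hyperedge independently spawns
    Binomial(2, mean_offspring/2) new open hyperedges, so the population
    shrinks geometrically in expectation and hits zero."""
    rng = random.Random(seed)
    q = q0
    history = [q]
    for _ in range(max_rounds):
        if q == 0:
            break
        # total offspring of q individuals, each Binomial(2, p) with
        # p = mean_offspring / 2, so the mean per individual is exact
        q = sum(1 for _ in range(q) for _ in range(2)
                if rng.random() < mean_offspring / 2)
        history.append(q)
    return history

hist = run_subcritical(q0=500, mean_offspring=0.8, max_rounds=400, seed=11)
# Expectation after m rounds is 500 * 0.8**m, so extinction comes quickly.
assert hist[-1] == 0
```

The contraction factor here plays the role that ζ and the (L.1)/(L.5) bounds play in the claim: once the expected number of newly created open hyperedges per existing one drops below a constant less than 1, Q(M_2) = ∅ follows from Markov's inequality.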
To complete the proof, we show that (ii) also holds with probability 1 − o(1). By Proposition 2.6, with probability 1 − N^{−Ω(√log N)} there are o(N) infected vertices when the process terminates. This completes the proof.
The remainder of the subsection is devoted to proving Lemma 7.9 and Lemma 7.10.
7.1. Defining A_m and Proof of Lemma 7.9.
We begin by formally defining the events A_m for 0 ≤ m ≤ M_2. It will be convenient for the proof of Lemma 7.10 to define A_0 separately. Recall the definitions of λ and χ(i, j) from (7.1) and Definition 7.3, respectively. Definition 7.13. Let A_0 be the event (in Ω, which was defined in Subsection 3.4) that, for some v ∈ V(H) or S ⊆ V(H), one of the following statements fails to hold.
We remark that the case i = 0 of (L.1) in Lemma 7.9 always holds trivially, as Δ(H) ≤ d. This is reflected below in our definition of A_m. Definition 7.14. For 1 ≤ m ≤ M_2, let A_m be the event (in Ω, which was defined in Subsection 3.4) that either A_0 occurs, or there exists 1 ≤ ℓ ≤ m such that, for some v ∈ V(H) or S ⊆ V(H), one of the following statements fails to hold, including: for every secondary configuration X = (F, R, D), the corresponding count bound. We now present the proof of Lemma 7.9.
Note that (A.2) is precisely the same statement as property (L.2). Observe, in addition, that if ω ∉ A_m then clearly (L.3) and (L.4) hold. Also, if ω ∉ A_m, then using the definition of ψ (see (7.8)), together with (7.2) and the fact that (1 + ε(T))γ(T) < ζ (by (2.31)), we obtain (7.16). For 0 ≤ i ≤ r − 2 and j = 0, (7.16) implies the bound required for (A0.1). Also, if i = r − 2 and j = 1, then using (7.16) and the fact that ζ < χ(r − 2, 1) gives the required bound. Our goal now is to show that the probability of each of Y_0, Y_{>0}, W and X occurring is at most N^{−10√log N}, from which Lemma 7.10 will follow. This will be done in Propositions 7.25, 7.31, 7.33 and 7.36 to come.
Given a configuration X that we care about controlling, we wish to apply Corollary 3.6 to bound the probability of X_S(m+1) − X_S(m) being too large for each S ⊆ V(H). It will be helpful to introduce some general framework to aid us in our application of Corollary 3.6. Before stating the next lemma, we require quite a few technical definitions; let us briefly motivate them before stating them formally.
For a fixed configuration X = (F, R, D) and non-empty U ⊆ D, we will define a family X_U of configurations, such that each configuration in the family is created by changing the set U of marked vertices of X into neutral vertices and, for each u ∈ U, adding a hyperedge e_u containing u such that the vertices in e_u \ {u} are all marked. The family X_U will be helpful when bounding how many new copies of X are made in a given time step. Indeed, a new copy of X is made from a copy of a configuration in X_U when the open hyperedges rooted at the vertices of U are all successfully sampled in some time step.
We will also define a configuration X_{U,U′} created by taking a set U of marked vertices of X, turning some subset U′ ⊆ U into roots and turning U \ U′ into neutral vertices. This will be used to bound ∂_A f (where f will be an upper bound on X_S(m+1) − X_S(m) and A ⊆ V(H)) in the application of Corollary 3.6. As it turns out, we will always be able to express bounds on the number of copies of some X′ ∈ X_U, or of X_{U,U′}, in terms of copies of configurations that we are keeping control over.
For a configuration X = (F, R, D), call a hyperedge of F with r − 1 vertices in D unstable. If F contains an unstable hyperedge, then every copy of X in H(m) is destroyed in the (m+1)th time step, since at each time step every open hyperedge is sampled and deleted from our hypergraph. So, in particular, a new copy of X can only be made from some X_U such that U intersects every unstable hyperedge of F. Call such a set U fruitful.
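In concrete terms, the fruitful sets are exactly the non-empty subsets of D that hit every unstable hyperedge. A small sketch (with a hypothetical toy configuration; the names are ours, not the paper's) enumerates them directly:

```python
from itertools import chain, combinations

def fruitful_sets(D, unstable):
    """Return all non-empty U <= D that intersect every unstable hyperedge.
    D: set of marked vertices; unstable: list of sets, each giving the
    D-vertices of one unstable hyperedge of the configuration."""
    subsets = chain.from_iterable(
        combinations(sorted(D), k) for k in range(1, len(D) + 1))
    return [set(U) for U in subsets
            if all(set(U) & e for e in unstable)]

# Toy configuration: marked vertices D = {1,2,3,4}; two unstable
# hyperedges whose marked vertices are {1,2} and {2,3}.
D = {1, 2, 3, 4}
unstable = [{1, 2}, {2, 3}]
F = fruitful_sets(D, unstable)

# {2} alone hits both unstable hyperedges, but {4} hits neither.
assert {2} in F and {4} not in F
assert all(U & {1, 2} and U & {2, 3} for U in F)
```

This is just the hitting-set condition; in the proof it restricts which families X_U can contribute new copies of X in a step.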
We apologise that the following set of definitions is fairly technical, but the introduction of these concepts and Lemma 7.20 will greatly simplify and clarify the calculations to come in the proof of Lemma 7.9. See Figure 13 for an example of a configuration in X_U and a configuration X_{U,U′}.
Throughout the rest of the section we will condition on F_m and assume that ω ∉ A_m; we may not always say explicitly that we are doing so. We will use the following random variables throughout. Given w ∈ V(H), let ξ_w be the Bernoulli random variable which is equal to one if w ∈ I(m+1) \ I(m) and zero otherwise. Also, for e ∈ Q(m), we let ξ_e be the Bernoulli random variable which is equal to one if e is successfully sampled and zero otherwise. Clearly, for each v ∉ I(m), we have that ξ_v is equal to one if and only if ξ_e = 1 for some e ∈ Q_v(m). From this (and as we are conditioning on F_m), we have that ξ_u is independent of ξ_w for u, w distinct vertices of V(H) \ I(m); this is recorded in (7.19).
(Figure 13: observe that if each of the indicated hyperedges in a copy of X is successfully sampled in the (m+1)th step, then this creates a new copy of Y^{2,2} in H(m+1).)
(ii) For any set A ⊆ V(H), the stated bound holds. We remark that (ii) does give a bound on everything we need in order to apply Corollary 3.6. However, when dealing with W^1 and the Y configurations we need a more careful bound on the expected change, and this is why we have (iii).
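The quantities fed to Corollary 3.6 are expectations of partial derivatives ∂_A f of a multilinear polynomial f in independent Bernoulli variables. A self-contained sketch (toy data; the representation is ours, not the paper's) computes such a derivative and its expectation exactly:

```python
from math import prod

def partial(monomials, A):
    """Derivative of a multilinear polynomial w.r.t. the variables in A:
    keep only monomials containing all of A, then delete A's variables.
    monomials: list of (coefficient, frozenset_of_variables)."""
    A = frozenset(A)
    return [(c, vars_ - A) for (c, vars_) in monomials if A <= vars_]

def expectation(monomials, p):
    """E f(xi) for independent Bernoulli xi_v with P(xi_v = 1) = p[v]."""
    return sum(c * prod(p[v] for v in vars_) for (c, vars_) in monomials)

# Toy polynomial f = x1*x2 + 2*x2*x3 over Bernoulli variables.
f = [(1, frozenset({1, 2})), (2, frozenset({2, 3}))]
p = {1: 0.5, 2: 0.2, 3: 0.1}

assert abs(expectation(f, p) - 0.14) < 1e-12          # 0.5*0.2 + 2*0.2*0.1
df = partial(f, {2})                                   # df = x1 + 2*x3
assert abs(expectation(df, p) - 0.7) < 1e-12           # 0.5 + 2*0.1
```

Here E[∂_A f] plays the role of the E_j quantities bounded in the claims below; independence of the ξ_w over healthy vertices is what lets the expectation factor over each monomial.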
Proof of Lemma 7.20. If (L.3) holds at time m, then, as m = O(log(d)) (by (2.31) and (2.2)), (7.19) gives the first bound. Let us consider how a copy of X rooted at S which is present in the (m+1)th step, but not in the mth step, is created. Such a copy can only come from a subhypergraph F′ ⊆ H containing S that is like a copy of X missing some infections; F′ becomes a new copy of X when it gains these missing infections.
More formally, a new copy comes from a subhypergraph F′ ⊆ H(m), a non-empty fruitful set U ⊆ D and subsets D′, V′ ⊆ V(F′) satisfying conditions (1), (2) and (3); in particular, every vertex v ∈ V′ becomes infected at the (m+1)th step. Let Z(X) be the set of triples (F′, V′, D′) that satisfy (1), (2) and (3). Then the variable X_S(m+1) − X_S(m) is bounded above by f(ξ_w : w ∈ V(H)), where

(7.22)  f(x_w : w ∈ V(H)) := ∑_{(F′,V′,D′) ∈ Z(X)} ∏_{v ∈ V′} x_v.
Since each V′ has size at most |D|, f has degree at most |D|. Observing that no variable in f has an exponent greater than 1 completes the proof of (i). By definition, each V′ is a subset of V(F′) \ (S ∪ I(m)). So, if A contains an element of S ∪ I(m), then ∂_A f = 0. So suppose that A is a subset of V(H) \ (S ∪ I(m)). For any such A, we have (7.23). We can partition the set Z(X) by the cardinality of each V′: for 0 ≤ k ≤ |D|, let Z_k(X) be the corresponding set of triples, so that we can rewrite (7.23) as (7.24). Now recall Definition 7.18 and observe that any (F′, V′, D′) ∈ Z_k(X) with A ⊆ V′ is in fact a copy of X_{U,U′} rooted at S ∪ A for some U′ ⊆ U ⊆ D with |U| = |V′|, |U′| = |A| and |U| − |U′| = k. So, from (7.24) and (7.21), as the variables ξ_w are independent (because V′ \ A ⊆ V(H) \ I(m) and we are conditioning on F_m), we obtain the bound required for (ii). Now, substituting (7.21) into (7.22) gives an expression in which the random variables on the right-hand side are associated to hyperedges, not vertices. Given a fixed (F′, V′, D′) ∈ Z(X), let G′ be a subhypergraph of H containing F′; by the definition of Z(X), there exists some non-empty fruitful U ⊆ D for which G′ is a copy of some X′ ∈ X_U rooted at S in H(m).
Since, for distinct hyperedges e, e′ in H(m), ξ_e is independent of ξ_{e′}, and by considering the cardinality of U, the resulting expression can be rewritten as required for (iii).
Now that we have developed our tools, we are all set to prove Lemma 7.10. As usual, we begin with the proof of the bound on W^i_S(m), as it is the simplest case. Proposition 7.25.
We will prove that, for 0 ≤ m ≤ M_2 − 1, the following bound holds; the proposition will then follow by taking a union bound over all 0 ≤ m ≤ M_2 − 1 and all S ⊆ V(H) of size at most r − 1.
Conditioning on F_m for m ≥ 0, the events A_m and W(S, m+1) are disjoint, and so we assume that ω ∉ A_m. Fix S ⊆ V(H). We wish to apply Lemma 7.20 along with Corollary 3.6 to obtain the required bound on P(W(S, m+1)). As ω ∉ A_m, (L.3) holds at time m and we may apply Lemma 7.20. So let Υ := W^i and let W̃ := f(ξ_w : w ∈ V(H)), where f(x_w : w ∈ V(H)) is the polynomial of degree at most r − i obtained by applying Lemma 7.20 to Υ.
We wish to apply Corollary 3.6 with and E 0 := log 2r (N ) to obtain an upper bound on W̃ which holds with high probability. In order to apply Corollary 3.6 we must bound E j (W̃ | F m ) for 0 ≤ j ≤ r − i.
Proof. By Lemma 7.20 (ii), for any set A ⊆ V (H) with |A| = a, So let us evaluate this expression. Recalling Definition 7.18, we see that Υ k,a contains only one configuration (F′, R′, D′), where F′ is a set of r vertices contained in a single hyperedge with i + a roots and k neutral vertices (and therefore r − i − (k + a) marked vertices). We break into cases according to the value of a + i. First suppose a + i = r. Here k = 0 and ∂ A E(W ) ≤ 1 since H contains no hyperedge with multiplicity greater than one. Now suppose a ≥ 1 and a + i < r, so 2 ≤ a + i < r. When k ≥ 1, this configuration is secondary, and as ω /∈ A m , by (L.4) we have When k = 0, as ω /∈ A m , by (L.3) we have Using (7.28) and (7.29) to evaluate (7.27) gives that, when ω /∈ A m and a ≥ 1, the expectation of ∂ A f (ξ v : v ∈ V (H)) conditioned on F m is at most Provided that K is large enough, this is at most τ when a ≥ 1. This concludes the argument in the case when a ≥ 1.
It remains to bound E a (W̃ | F m ) when a = 0 and i < r. We need to be more careful here. By Lemma 7.20 (iii), Recall Definition 7.18 and observe that, for each k ≥ 1, there is precisely one U ∈ U with |U | = k. First consider the case i = 1. We see that when |U | = k, we have Υ U = Z r−1−k,k . So as ω /∈ A m , applying (L.5) to evaluate (7.30) gives that the expectation of W̃ given F m is at most When 1 < i < r and |U | = k, we see that each configuration in Υ U consists of: (i) a secondary configuration X = (G, R′, D′), where G consists of one hyperedge containing i roots and k neutral vertices, and in addition (ii) a collection of k unstable hyperedges rooted at neutral vertices of G that are not in the non-central hyperedge. So in this case, using (L.3) and (L.4) as ω /∈ A m gives Using this to evaluate (7.30) gives that the expectation of W̃ given F m is at most for large enough K. This concludes the proof of the claim.
Recall that E 0 := log 2r (N ). As 2r + 1 < r 3 , by Claim 7.26 we have We also have Thus applying Corollary 3.6 gives that as required.
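The elementary inequality 2r + 1 < r 3 invoked above holds for every uniformity r ≥ 2; a throwaway numeric check (ours, not part of the proof):

```python
# Sanity check of 2r + 1 < r^3 for all integers r >= 2 (at r = 2: 5 < 8).
assert all(2 * r + 1 < r ** 3 for r in range(2, 10_000))
```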
We will prove that for 0 ≤ m ≤ M 2 − 1, The proof of the proposition will then follow by taking the union bound over all choices of X, S and m. Conditioning on F m for m ≥ 0, the events A m and X (X, S, m + 1) are disjoint and so we assume that ω /∈ A m . Fix S ⊆ V (H) and a secondary configuration X = (F, R, D). We wish to apply Lemma 7.20 along with Corollary 3.6 to obtain the required bound on P(X (X, S, m + 1)). As ω /∈ A m , (L.3) holds at time m and we may apply Lemma 7.20. So let X̃ := f (ξ w : w ∈ V (H)), where f (x w : w ∈ V (H)) is the polynomial of degree at most |D| obtained by applying Lemma 7.20 to X.
We wish to apply Corollary 3.6 with and E 0 := log 2|D|+1 (d) to obtain an upper bound on X̃ which holds with high probability. We will prove the following claim.
Claim 7.32. Let ω /∈ A m . For 0 ≤ j ≤ |D|, Proof. By Lemma 7.20 (ii), for any set A ⊆ V (H) with |A| = a, So let us evaluate this expression. By Remark 2.26, every X′ ∈ X k,a is a secondary configuration. By definition, |X k,a | = O(1). So as ω /∈ A m and |D| < 3r, using (L.4) to bound each such as required.
As X is secondary, |D| ≤ 3r and Using this and the fact that E(X̃) = o(τ ) (by Claim 7.32), applying Corollary 3.6 gives that as required.
We will prove that for 0 ≤ m ≤ M 2 − 1, The proof of the proposition will then follow by taking the union bound over all choices of v, i and m. Conditioning on F m for m ≥ 0, the events A m and Y 0 (v, i, m + 1) are disjoint and so we assume that ω /∈ A m . Fix v ∈ V (H), 1 ≤ i ≤ r − 2 and set Υ := Y i = (F, R, D). We wish to apply Lemma 7.20 along with Corollary 3.6 to obtain the required bound on P(Y 0 (v, i, m + 1)). As ω /∈ A m , (L.3) holds at time m and we may apply Lemma 7.20. So let Ỹ := f (ξ w : w ∈ V (H)), where f (x w : w ∈ V (H)) is the polynomial of degree at most i obtained by applying Lemma 7.20 to Υ.
We wish to apply Corollary 3.6 with and E 0 := log r 2 +1 (N ) to obtain an upper bound on Ỹ which holds with high probability. We will prove the following claim. Also, Proof. By Lemma 7.20 (ii), for any set A ⊆ V (H) with |A| = a, So let us evaluate this expression. When |A| ≥ 1, as i ≤ r − 2, each Υ′ ∈ Υ k,a is a secondary configuration (F′, R′, D′) with |V (F′)| = r and |R′| = |A| + 1. So as ω /∈ A m and |D| < 3r, using (L.4) to bound each such Υ′ {v}∪A (m) gives that . This proves the first statement of the claim.
In the case A = ∅, we need to bound the expectation of f (ξ w : w ∈ V (H)) more carefully. By Lemma 7.20 (iii), Recall Definition 7.18 and observe that in this case, for each 1 ≤ k ≤ i, there is precisely one U ∈ U with |U | = k. Here, we see that Υ U = Z i−k,k . So as ω /∈ A m , applying (L.5) and using (7.6) to bound χ(i − k, k) (as i − k ≤ r − 3) to evaluate (7.35) gives This proves the second statement of the claim. As then applying Corollary 3.6 with the bound on E(Ỹ ) obtained in Claim 7.34 gives that as required.
We will prove that for 0 ≤ m ≤ M 2 − 1, The proposition will then follow by taking the union bound over all choices of v, i, j and m.
Conditioning on F m for m ≥ 0, the events A m and Y >0 (v, i, m + 1) are disjoint and so we assume that ω /∈ A m . Fix v ∈ V (H), 0 ≤ i ≤ r − 2, 1 ≤ j ≤ r − 1 and set Υ := Y i,j = (F, R, D). We wish to apply Lemma 7.20 along with Corollary 3.6 to obtain the required bound on P(Y >0 (v, i, j, m + 1)). As ω /∈ A m , (L.3) holds at time m and we may apply Lemma 7.20. So let Z̃ := f (ξ w : w ∈ V (H)), where f (x w : w ∈ V (H)) is the polynomial of degree at most i + (r − 1)j obtained by applying Lemma 7.20 to Υ.
As j ≥ 1, Y i,j contains an unstable hyperedge and so every copy of and E 0 := log r 6 +1 (N ) to obtain an upper bound on Y i,j v (m) which holds with high probability. We will use the following claim.
Proof. By Lemma 7.20 (ii), for any set A ⊆ V (H) with |A| = a, First let us evaluate this expression when |A| ≥ 1. For some 0 ≤ k ≤ i + (r − 1)j − a, consider Υ′ = (F′, R′, D′) ∈ Υ k,a (as |A| ≥ 1, we have |R′| ≥ 2). By definition, F′ is isomorphic to F (ignoring which vertices are roots and marked). In particular, F′ = e 0 ∪ e 1 ∪ · · · ∪ e j , where e 0 is the central hyperedge, the hyperedges e 1 , . . . , e j are pairwise disjoint and each e i intersects e 0 in a neutral vertex v i .
For each 0 ≤ ℓ ≤ j, define R ℓ := e ℓ ∩ R′ and define D ℓ := e ℓ ∩ D′. For 1 ≤ ℓ ≤ j, define the configuration Z ℓ := (e ℓ , R ℓ ∪ {v ℓ }, D ℓ ). By Definition 7.18, Υ′ is obtained from Y i,j by making a fruitful set ∅ ≠ U ⊆ D neutral and turning some subset U′ ⊆ U into roots. So as U is fruitful, |D ℓ | < r − 1 for each ℓ.
For each 1 ≤ ℓ ≤ j, the configuration Z ℓ satisfies one of the following.
(2) R ℓ = ∅ and Z ℓ contains a neutral vertex: in this case Z ℓ is a secondary configuration.
This time, let S be the set of all partitions P := (S 0 , . . . , S j ) of {v} ∪ A such that |S 0 | = |R 0 | and, for 1 ≤ ℓ ≤ j, |S ℓ | = |R ℓ |. As in the previous case, |S| = O(1). Using this notation, we can bound the number of copies of Υ′ = (F′, R′, D′) ∈ Υ k,a rooted at {v} ∪ A. As ω /∈ A m , we can use (L.4) to bound Z 0 S 0 and, for 1 ≤ ℓ ≤ j, we can use (7.39), (7.40) and (7.41) as in the previous case to bound Z ℓ S ℓ ∪{u ℓ } . This gives where, as before, the sum is taken over all distinct ordered tuples (u 1 , . . . , u j ). Now we are able to evaluate (7.38) for |A| ≥ 1. As |Υ k,a | = O(1), when K is sufficiently large, as required for the first statement of the claim.
In the case A = ∅, we need to bound the expectation of f (ξ w : w ∈ V (H)) more carefully. By Lemma 7.20 (iii), Recall the definition of Υ U from Definition 7.18. Consider Υ′ := (F′, R′, D′) ∈ Υ U for some ∅ ≠ U ⊆ D. Recall that F′ ⊇ F. Let e 0 be the central hyperedge of F and let e 1 , . . . , e j be the non-central hyperedges of F. For 0 ≤ ℓ ≤ j, define U ℓ := e ℓ ∩ U . By definition, each vertex u of U is the unique neutral vertex of an unstable hyperedge e u in F′. Without loss of generality, suppose that |U 1 | ≥ |U 2 | ≥ · · · ≥ |U j |. As (by definition) U is fruitful (i.e. it intersects every unstable hyperedge of Υ′), it must intersect every hyperedge e 1 , . . . , e j and hence |U ℓ | ≥ 1 for all 1 ≤ ℓ ≤ j.
We will now define some configurations that will help us break up Υ′ in order to bound the number of copies of it in H(m). For each 0 ≤ ℓ ≤ j, define G ℓ to be the subhypergraph of F′ containing e ℓ ∪ {e u : u ∈ U ℓ }. Define R 0 := R (= {v}) and, for 1 ≤ ℓ ≤ j, define R ℓ := e ℓ ∩ e 0 (so |R ℓ | = 1). For 0 ≤ ℓ ≤ j, define D ℓ := G ℓ ∩ D and let Y ℓ be the configuration (G ℓ , R ℓ , D ℓ ).
To summarise, when ω /∈ A m we have as required. As then applying Corollary 3.6 with the bound on E(Z̃) obtained in Claim 7.37, we get that as required.
This completes the proof of Lemma 7.9 and the proof of Theorem 1.3 in the subcritical case.

The Second Phase in the Supercritical Case
In this section we prove the "supercritical case" of Theorem 1.3 (i.e. when c r−2 α > (r−2) r−2 (r−1) r−1 ). Recall, as in the previous section, that we have "restarted the clock" and that in this case the first phase runs until time T := 1 α log N . It may be helpful to recall the exact process we run in this phase, given in Subsection 2.4. Theorem 1.3 in this case is implied by the following lemma. As mentioned above, to prove Lemma 8.1 we use a lower tail version of Janson's Inequality, Theorem 3.9. In each of our applications of Theorem 3.9, we will simply set ε = 1/2 and use the fact that ϕ(−1/2) = (1/2) log(1/2) + (1/2) ≥ 1/10. We are now ready to complete the proof of Lemma 8.1.
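The constant 1/10 comes from the function ϕ(x) = (1 + x) log(1 + x) − x appearing in standard lower-tail statements of Janson's inequality; since the exact form of Theorem 3.9 is not reproduced here, the following numeric check is only a sketch under that assumption:

```python
import math

# phi(x) = (1 + x) * log(1 + x) - x, the rate function in standard
# lower-tail versions of Janson's inequality (assumed form of Theorem 3.9).
def phi(x):
    return (1 + x) * math.log(1 + x) - x

# With eps = 1/2: phi(-1/2) = (1/2) log(1/2) + 1/2 >= 1/10, as used above.
assert abs(phi(-0.5) - (0.5 * math.log(0.5) + 0.5)) < 1e-12
assert phi(-0.5) >= 1 / 10
```

(In fact ϕ(−1/2) ≈ 0.153, so the bound 1/10 is comfortably generous.)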
For 0 ≤ m ≤ M 2 − 1 and v ∈ V (H), let S(v, m + 1) be the set of ω ∈ A M 2 such that ω /∈ A 0 , J (ω) = m + 1, v /∈ I(m + 1) and either: , or (ii) v was in an open hyperedge in some Q i v (m + 1). We remark that as J (ω) = m + 1, v cannot have been in an open hyperedge in some Q i v (ℓ) for ℓ < m + 1.
So for 0 ≤ m ≤ M 2 − 1, assume ω /∈ A m . Fix v ∈ V (H) \ I(m) and consider S(v, m + 1). First let us bound the probability that v /∈ I(m + 1) but (ii) holds. Suppose v ∈ Q i v (m + 1). By the argument above (culminating in (8.2)), with probability at least 1 − N −10 √ log N the vertex v becomes infected (and hence is not present in I(m + 1)) when every hyperedge of Q i v (m + 1) is sampled. So the probability that v /∈ I(m + 1) but (ii) holds is at most N −10 √ log N . We will now show that when ω /∈ A m , Then (8.6) will follow from this and the argument of the previous paragraph.
As ω /∈ A m , for all u ∈ V (H) \ I(m) we have Recall the definition of Q 0 v (m + 1) from the description of the second phase process in the supercritical case, given in Subsection 2.4. If ξ e is the Bernoulli random variable which is equal to 1 if and only if e ⊆ V (G) q , we have Q 0 v (m + 1) ≥ X, where X := ∑ e∈E(G(m)) ξ e .
We will apply Theorem 3.9 to prove a bound on X and hence to prove (8.6).
To bound E(X), we must first bound |E(G)|. Using the definition of Q u (m) with the previous two expressions gives completing the proof of the claim.
As E(ξ e ) = q r−1−s , Claim 8.9 implies that when m = 0 we have (8.14) (α r−1 /4) · (log N ) (r−1) 2 ≤ E(X | F m ) ≤ 2α r−1 · (log N ) (r−1) 2 , and when m > 0, we have So for any m ≥ 0, as 0 ≤ s ≤ r − 2, the lower bounds of (8.14) and (8.15) give We require one more claim before we can apply Theorem 3.9. where in the last line we used the upper bound on |E(G(0))| given by Claim 8.9. Using the upper bound of (8.14), we see that the required bound for m = 0 follows when K is taken to be large with respect to r. When m ≥ 1, again using Claim 8.9 we have Using the upper bound in (8.15), we see that the required bound holds for m > 0.
So we can apply Theorem 3.9 with ε = 1/2 to give So using the lower bounds given in (8.14) and (8.15) we have If v is contained in an open hyperedge sampled in the second step of this round, it becomes infected with probability at least 1 − N −10 √ log N (by the argument culminating in (8.2)).
As it turns out, every strictly 2-balanced graph F with at least three edges satisfies δ(F ) ≥ 2 (see the proof of Theorem 1.5 below). Thus, in this case, the assumption that δ(F ) ≥ 2 in part (i) of Proposition 9.4 is redundant. However, for k-uniform hypergraphs with k ≥ 3, this is no longer the case. For example, a "loose k-uniform cycle" is strictly k-balanced and contains vertices of degree one; see Figure 14 for an example. We first show that Proposition 9.4 implies Theorem 1.5 before proving the proposition itself.
Proof of Theorem 1.5. First, we show that d 2 (F ) ≥ 1. If not, then F contains at most |V (F )| − 2 edges, and so F is disconnected. If F contains a connected component F′ with at least two edges, then d 2 (F′) ≥ 1 > d 2 (F ), contradicting the fact that F is strictly 2-balanced. Otherwise, every component of F contains at most one edge, which implies that d 2 (F ) ≤ 1/2. In this case, we let F′ be a subgraph of F consisting of two edges and four vertices. Since F has at least three edges, we have that F′ is a proper subgraph of F . Also, d 2 (F′) = 1/2 ≥ d 2 (F ), which contradicts the fact that F is strictly 2-balanced.
By the previous paragraph, we have d 2 (F ) ≥ 1. In particular, this implies that F cannot contain a vertex v of degree one; otherwise, let F′ := F \ {v} and observe that F′ has at least two edges and that d 2 (F′) ≥ d 2 (F ). Thus, δ(F ) ≥ 2 and so we can apply part (i) of Proposition 9.4 to get that there exist β, s > 0 such that H K (k) n ,F is (d(n, F ), d(n, F ) −s , d(n, F ) β )-well behaved for n sufficiently large. The result now follows by applying Theorem 1.1 to H K (k) n ,F . We now present the proof of Proposition 9.4.
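As an aside, the 2-density comparisons used above, with d 2 (G) = (e(G) − 1)/(v(G) − 2), can be checked by brute force on a small example. The sketch below (our illustration, with hypothetical helper names; C 4 is a standard strictly 2-balanced graph) verifies that every proper subgraph of the 4-cycle with at least two edges has strictly smaller 2-density:

```python
from itertools import combinations

# 2-density d_2(G) = (e(G) - 1) / (v(G) - 2) for graphs with >= 3 vertices.
def d2(edges):
    verts = {v for e in edges for v in e}
    return (len(edges) - 1) / (len(verts) - 2)

# Brute-force check that the 4-cycle C4 is strictly 2-balanced.
C4 = [(1, 2), (2, 3), (3, 4), (4, 1)]
assert d2(C4) == 1.5
for k in range(2, len(C4)):
    for sub in combinations(C4, k):
        # every proper subgraph with >= 2 edges is strictly sparser
        assert d2(sub) < d2(C4)
```

Note also that C 4 has minimum degree 2, consistent with the remark above that strictly 2-balanced graphs with at least three edges satisfy δ(F) ≥ 2.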
Proof of Proposition 9.4. First suppose that F is strictly k-balanced and that δ(F) ≥ 2. As H is d-regular, conditions (a) and (b) of Definition 1.2 hold for H. Also, |V (H)| = (n choose k) and so, as d = Θ(n |V (F )|−k ), we have Therefore, condition (e) of Definition 1.2 holds.
Next, we show that condition (c) of Definition 1.2 is satisfied. For 2 ≤ ℓ ≤ r − 1, let S be a set of ℓ vertices of H and let F′ be the subhypergraph of K  which implies condition (c) of Definition 1.2 is satisfied.
Next we show that if δ(F) ≥ 2, then condition (d) of Definition 1.2 holds. Let e 1 and e 2 be two distinct hyperedges of K (k) n and let v 1 be a vertex of e 1 which is not contained in e 2 . Suppose that F′ is a copy of F in K (k) n containing e 2 such that (F′ \ {e 2 }) ∪ {e 1 } is also a copy of F. Then, as δ(F) ≥ 2, we have that F′ also contains v 1 . However, the number of copies of F in K since F has at least two hyperedges. Therefore, H satisfies condition (d) of Definition 1.2. This completes the proof of part (i). Now suppose that F is not strictly k-balanced and let F′ be a subgraph of F with at least 2 hyperedges such that d k (F′) ≥ d k (F), and define ℓ := |E(F′)|. Let e be a hyperedge of H (by definition, the corresponding hyperedges of K (k) n induce a copy of F). So we can pick S to be a set of vertices of e such that the corresponding hyperedges of K As we have proved, we can apply Theorem 1.1 to H K (k) n ,F if and only if F is strictly k-balanced and δ(F) ≥ 2. However, not only are these properties of F required to apply the theorem, but if they are violated, then the critical probability may take on a different value; in particular, it may be lower than the value stated by our theorem.
Let us heuristically discuss why this is so. First consider the hypothesis that F is strictly k-balanced. By Proposition 9.4 (ii), if F is not strictly k-balanced, then there exists 2 ≤ ℓ ≤ r − 1 such that every hyperedge contains a set S of ℓ vertices such that deg(S) = Ω(d 1−(ℓ−1)/(r−1) ). In particular, the bounds on the ℓ-codegrees are not sufficient to ensure that the secondary configurations remain a lower order term compared to the Y configurations. For example, consider the secondary configuration X = (G, R, D) where |V (G)| = 2r − ℓ, E(G) = {e 1 , e 2 }, e 1 ∩ e 2 = {v 1 , v 2 , . . . , v ℓ }, |R| = 1, R ⊆ e 1 \ e 2 and D = V (G) \ (R ∪ {v 1 }). Letting ℓ be as in Proposition 9.4 (ii), we would expect which is the same order as Y r−2,1 v (0). So right from the start of the process the Y configurations, and hence the number of open hyperedges, will grow faster than they would if the infection were uniform. Given this, we would expect the critical probability to be lower.
Similarly, if δ(F) = 1, then by Proposition 9.4 (iii), for every vertex e 1 ∈ V (H) there exists some e 2 such that |N H (e 1 ) ∩ N H (e 2 )| is large. We needed condition (d) of Definition 1.2 to bound the secondary configurations with no infected vertices. As in the previous paragraph, if this bound is relaxed, then the secondary configurations are no longer forced to be a lower order term compared to the Y configurations, and we expect this to cause the number of open hyperedges to grow faster than it would if the infection were uniform.