The duration of a supercritical SIR epidemic on a configuration model

We consider the spread of a supercritical stochastic SIR (Susceptible, Infectious, Recovered) epidemic on a configuration model random graph. We mainly focus on the final stages of a large outbreak and provide limit results for the duration of the entire epidemic, while we allow for non-exponential distributions of the infectious period and for both finite and infinite variance of the asymptotic degree distribution in the graph. Our analysis relies on the analysis of some subcritical continuous time branching processes and on ideas from first passage percolation. As an application we investigate the effect of vaccination with an all-or-nothing vaccine on the duration of the epidemic. We show that if vaccination fails to prevent the epidemic, it often – but not always – increases the duration of the epidemic.

edge represents that two individuals have a relationship that makes it possible for the disease to transmit from one to the other.
Much is already known for (variants of) epidemics on random graphs, e.g. about the final size of the epidemic (the fraction of the population infected during the epidemic) and the probability of a large or major outbreak (to be defined below) [10,4]. In this paper we focus on the random duration of an epidemic on a configuration model graph. The duration of an epidemic is especially relevant for animal diseases. When outbreaks of those diseases occur, trade bans are often imposed on imports from affected countries. So, from an economic perspective, it might be more important to reduce the duration of an epidemic than to reduce the number of animals killed by it.
For uniformly mixing populations Barbour [6] provides rigorous results on the duration of (Markov) SIR (Susceptible, Infectious, Recovered; see Section 2.3 for a definition) epidemics and Britton [10] sketches some results about the duration of an epidemic in a uniformly mixing population. A corollary of their results is that, if a major outbreak occurs in a population of size n, the time until the epidemic goes extinct divided by log n converges to an explicit constant as n → ∞. Here the duration of an epidemic is the time until the final recovery in the population, which corresponds to the time of strong extinction defined below.
We consider SIR epidemics on configuration model graphs in the large population limit. Configuration model graphs are random graphs with specified vertex degrees (see Section 2.2, or for a detailed description see [14,16]). In this graph each individual/vertex has his or her given degree (number of neighbours). The edges are created in such a way that the graph is uniform among all possible multigraphs with the given degree sequence.
We only consider major outbreaks of the epidemic. Formally we say that an outbreak is major if more than log n individuals get infected, where n is again the number of individuals in the population. It can be shown that this is (as n → ∞) equivalent to assuming that the number of ultimately infected individuals is of the same order as the population size. The beginning (until a small but non-negligible fraction of the population is infected) and the middle part (until a small but non-negligible fraction of the ultimately infected individuals still has to be infected) of a major outbreak on a configuration model have been studied before (e.g. in [7,12,25,20]). Volz [25] studied a deterministic model for the spread of an SIR epidemic through a network using a set of differential equations, keeping track of the probability that a vertex of given degree avoids infection as a function of time. Under some moment conditions his results were made rigorous by Decreusefond et al. [12]. Using a different mathematical approach Barbour and Reinert [7] study (among other things) a stochastic model for the spread of SIR epidemics on a configuration model with bounded degrees and minor conditions on the infectious period distribution. Their approach is tailored for finding the distribution of the time at which a typical individual in the population gets infected, but is not directly suitable for finding the time of the last infection or of the recovery of the last infected individual. Janson et al. [20] study the spread of Markov SIR epidemics on quite general configuration models and their analysis heavily relies on the memoryless infectious period. In none of the papers mentioned in this paragraph is the time until the end of the epidemic studied.
[…] of that uniformly chosen vertex in an SI epidemic (i.e. an SIR epidemic with infinite infectious period). In this setting the question regarding the time until the last infection in the epidemic corresponds to the flooding time of the giant component of the random graph [1].
In the analysis of first passage percolation on random graphs in [9,8] growing "balls" around vertices are explored and the time at which the balls touch provides precise results on the distance between the center vertices of those balls. These methods are well suited for obtaining the asymptotic distribution of the distance between two vertices, but are less fit for finding flooding times and diameters (however, see [1]).
As written above, we focus on the duration of the entire epidemic, and in particular on the final stages of the epidemic. We use two definitions of the end of the epidemic: i) the time at which there are no infectious individuals in the population anymore, which we call strong extinction and ii) the first time at which there are no more infectious individuals with susceptible neighbours in the population, which we call weak extinction. We allow for quite general infectious period distributions (see Theorems 2.3 and 2.4 below), and do not have to restrict ourselves to infinite infectious periods as is the case in the first passage percolation literature. Furthermore, we pose milder conditions on the degree distribution of the configuration model than Barbour and Reinert [7], who also allow for relatively general infectious period distributions. Our approach is to use the results of [7], which are obtained through methods similar to those used in first passage percolation, to obtain the time until a typical vertex gets infected and then use subcritical branching processes to approximate the time between the infection of a typical vertex and the end of the epidemic. We show that, under some mild conditions, the time until (weak or strong) extinction of the epidemic divided by log n converges in probability to a specified constant. We note that our result is weaker in nature than the results of [7,9,8], where asymptotic distributions of the difference between, rather than the quotient of, infection times/distances of uniformly chosen vertices and their typical times/distances are provided (see also Section 8). However, as stated, we allow for more general distributions of the infectious period and degree distributions.
Finally, we briefly analyse the impact of vaccinating the entire population with an all-or-nothing vaccine. This vaccine either makes an individual completely immune or has no effect at all, independently and with the same probability for each individual. This vaccination strategy is asymptotically equivalent to vaccinating a uniformly chosen fraction of the population with a perfect vaccine, i.e. a vaccine which gives complete immunity.

Outline of paper
The paper is structured as follows. In Section 2 we formally define the model and provide the main theorems of the paper. In Section 3 we discuss the impact of vaccination on the duration of the epidemic, using the results of Section 2 and some heuristics. In Section 4 we present some techniques for analysing epidemics on graphs. Furthermore, we summarise results on continuous time branching processes that we need in the proofs of the main theorems. In Section 5 heuristics are given for the main theorems, while in Sections 6 and 7 these theorems are proved rigorously. In the proofs the durations of the initial and final phase of the epidemic are analysed separately. We conclude the paper with some remarks on possible sharpening of the results and on some caveats in applying the results in real life settings.

Basic notation
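The following basic notation and definitions are used throughout this paper (see also e.g. [21, Section 1.2]). For $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}_{\ge 0}$ and $x \to \infty$ we write
$f(x) = O(g(x))$ if $\limsup_{x \to \infty} |f(x)|/g(x) < \infty$,
$f(x) = o(g(x))$ if $\lim_{x \to \infty} |f(x)|/g(x) = 0$,
$f(x) = \Theta(g(x))$ if $f(x) = O(g(x))$ and $g(x) = O(|f(x)|)$.
Also for $f : \mathbb{R} \to \mathbb{R}$, we write $f(x-) = \lim_{y \uparrow x} f(y)$.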
All random processes and random variables that we consider are defined on a rich enough probability space $(\Omega, \mathcal{F}, P)$, which we do not further specify. The population size is always denoted by n. In this paper, asymptotic results and limits are for n → ∞, unless explicitly stated otherwise. We say that an event occurs with high probability (w.h.p.) if the probability of the event converges to 1. Furthermore, $\xrightarrow{a.s.}$ denotes almost sure convergence, $\xrightarrow{P}$ denotes convergence in probability, and $\xrightarrow{d}$ denotes convergence in distribution.
We denote the set of strictly positive integers by $\mathbb{N}$ and write $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. Furthermore, $\mathbb{N}_{\le x} = [1, x] \cap \mathbb{N}$. The sets $\mathbb{N}_{\ge x}$, $\mathbb{N}_{<x}$ and $\mathbb{N}_{>x}$ are defined similarly. Throughout, the cardinality of a set X is denoted by |X|.

Construction of the random graph and assumptions on the degree distribution
The epidemic spreads on a random graph $G^{(n)} = (V^{(n)}, E^{(n)})$. The set $V^{(n)}$ consists of n vertices that represent the individuals, and the edge set $E^{(n)}$ represents connections/relationships between individuals through which infection might transmit. For $v \in V^{(n)}$, the degree of vertex v (i.e. the number of edges adjacent to vertex v) is denoted by $d_v$. We assume that $d_v \in \mathbb{N}$, since vertices of degree 0 will not be infected anyway. $G^{(n)}$ is generated through a configuration model with given degree sequence $\{d_v\}_{v \in V^{(n)}}$.
The graph is constructed by assigning $d_v$ half-edges (edges with only one endpoint assigned to a vertex) to vertex v for $v \in V^{(n)}$ and pairing those half-edges uniformly at random. By this construction every vertex has the right degree, although it is possible that there is more than one edge between a pair of vertices (parallel edges) or that an edge connects a vertex to itself (a self-loop). In the graph, parallel edges are counted separately in the degree and a self-loop adds two to the degree of a vertex. Define
$\ell^{(n)} = \sum_{v \in V^{(n)}} d_v$. (2.1)
Observe that $\ell^{(n)}$ is even, since every edge in $E^{(n)}$ adds 2 to the total degree of the graph. We make the following assumptions.
Assumption 2.1. There exists an $\mathbb{N}$-valued random variable D such that […]
(A4) If $k \in \mathbb{N}$ is such that $P(D \ge k) = 0$, then there exists $n_0 \in \mathbb{N}$, such that for all $n \in \mathbb{N}_{\ge n_0}$, it holds that $\sum_{v \in V^{(n)}} \mathbb{1}(d_v \ge k) = 0$.
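The pairing of half-edges described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the construction used in the proofs: the function name `configuration_model` and the example degree sequence are assumptions made for this example. Shuffling the list of half-edges and pairing consecutive entries yields a uniformly random pairing, so parallel edges and self-loops may occur.

```python
import random


def configuration_model(degrees, rng=None):
    """Pair half-edges uniformly at random and return the edge list.

    degrees[v] is the degree d_v of vertex v. The result is a multigraph:
    parallel edges and self-loops are possible.
    """
    rng = rng or random.Random()
    if sum(degrees) % 2 != 0:
        raise ValueError("the total degree must be even")
    # One list entry per half-edge, labelled by the vertex it belongs to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    # Consecutive entries of the shuffled list form the pairs.
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]


# Hypothetical degree sequence, used only to illustrate the call.
example_degrees = [3, 2, 2, 1]
example_edges = configuration_model(example_degrees, random.Random(1))
```

Counting, for every vertex, the number of edge endpoints it appears in recovers the prescribed degree sequence, with a self-loop contributing two to the degree of its vertex.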
Assumption (A4) is introduced for technical purposes in the proofs. We expect that this condition is not needed for the results to be true. This assumption ensures that D provides in some sense enough information on the highest degree vertices of a large but finite graph. Assumption (A4) obviously holds if D has unbounded support or if the degrees of vertices in $V^{(n)}$ are i.i.d. and distributed as D.
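The "size biased" random variable $\tilde D$ is defined through
$P(\tilde D = k) = k\,P(D = k)/E[D]$ for $k \in \mathbb{N}$.
Let $D^{(n)}$ be a random variable with the same distribution as the degree of a vertex chosen uniformly at random from the graph. That is,
$P(D^{(n)} = k) = n^{-1} \sum_{v \in V^{(n)}} \mathbb{1}(d_v = k)$ for $k \in \mathbb{N}$.
Note that $\tilde D^{(n)}$, the size biased version of $D^{(n)}$, is distributed as the degree of a vertex adjacent to a uniformly chosen edge from the graph. By (A1) and (A2), […] For the epidemic process on the graph, we merge parallel edges and ignore self-loops.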

The SIR epidemic
We consider an SIR (Susceptible, Infectious, Recovered) epidemic on G (n) . We say that a vertex is susceptible, infectious or recovered if the individual it represents is in this "infection state". Neighbours in the population contact each other according to independent homogeneous Poisson processes with rate β, and if the contact is between a susceptible and an infectious vertex, then the susceptible one becomes immediately infectious itself. Infectious vertices stay so for a random period distributed as the random variable L, which is [0, ∞]-valued. All infectious periods and Poisson processes are independent of each other. A contact by an infectious vertex is called an infectious contact, whether or not the "contactee" is susceptible. Throughout we assume that at time 0, there is one infectious individual, which is chosen uniformly at random from the population and all other individuals are susceptible. This assumption is purely for ease of exposition. It is straightforward to extend our analysis and results to some other initial conditions (see Section 8).
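The dynamics described above can be illustrated with a small event-driven simulation. The sketch below is an assumption-laden illustration rather than the construction used in the proofs: the graph is passed as adjacency lists, `draw_infectious_period` is a hypothetical callback returning a draw of L, and only the first contact from an infectious vertex to each neighbour is generated, since later contacts along the same edge cannot change the outcome.

```python
import heapq
import random


def simulate_sir(neighbours, beta, draw_infectious_period, rng):
    """Simulate one SIR epidemic on a graph given as adjacency lists.

    Returns (number of ever infected vertices, time of the last infection,
    time of the last recovery, i.e. of strong extinction).
    """
    n = len(neighbours)
    start = rng.randrange(n)          # uniformly chosen initial infective
    infected_at = {start: 0.0}
    events = []                       # heap of (contact time, contacted vertex)
    strong_extinction = 0.0

    def infect(v, t):
        nonlocal strong_extinction
        period = draw_infectious_period(rng)
        strong_extinction = max(strong_extinction, t + period)
        for w in neighbours[v]:
            # The first contact from v to w happens an Exp(beta) time after
            # the infection of v; it only counts if v is still infectious.
            tau = rng.expovariate(beta)
            if tau < period:
                heapq.heappush(events, (t + tau, w))

    infect(start, 0.0)
    last_infection = 0.0
    while events:
        t, w = heapq.heappop(events)
        if w not in infected_at:      # w is still susceptible at time t
            infected_at[w] = t
            last_infection = t
            infect(w, t)
    return len(infected_at), last_infection, strong_extinction


# Hypothetical usage: a ring of 20 vertices, exponential infectious periods.
ring = [[(i - 1) % 20, (i + 1) % 20] for i in range(20)]
result = simulate_sir(ring, 1.0, lambda r: r.expovariate(1.0), random.Random(1))
```

The returned time of the last recovery corresponds to strong extinction; tracking weak extinction would additionally require bookkeeping of infectious-susceptible pairs, which is omitted here.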
The probability, ψ say, that an infected vertex makes an infectious contact with a given neighbour (and infects it if that neighbour is still susceptible) is given by
$\psi = \int_0^\infty (1 - e^{-\beta t})\,L(dt) = \int_0^\infty \beta e^{-\beta t} P(L > t)\,dt$, (2.3)
where we used partial integration and the shorthand $L(dt) = P(L \in dt)$. We denote the sets of susceptible, infectious and recovered individuals at time t by $S^{(n)}(t)$, $I^{(n)}(t)$ and $R^{(n)}(t)$ respectively. We say that the epidemic goes strongly extinct or ends before time t if $|I^{(n)}(t)| = 0$. Lastly, we let $X^{(n)}(t)$ be the set of pairs of neighbours of which one is susceptible and the other infectious. We say that the epidemic is weakly extinct at time t if $|X^{(n)}(t)| = 0$.
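As a numerical illustration, the transmission probability can be approximated by a simple quadrature, assuming the representation $\psi = \int_0^\infty \beta e^{-\beta t} P(L > t)\,dt$. The function name `psi`, the truncation point `upper` and the parameter values are assumptions made for this sketch. For an exponentially distributed infectious period with rate µ the integral equals β/(β + µ), and for a fixed infectious period of length ℓ it equals $1 - e^{-\beta \ell}$, which gives two simple checks.

```python
import math


def psi(beta, survival, upper=200.0, steps=200_000):
    """Midpoint-rule approximation of int_0^upper beta*exp(-beta*t)*P(L>t) dt.

    survival(t) should return P(L > t); the integral is truncated at `upper`,
    which is harmless here because of the factor exp(-beta*t).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += beta * math.exp(-beta * t) * survival(t)
    return total * h


beta, mu = 0.99, 0.01  # illustrative parameter values

# Exponential infectious period with rate mu: closed form beta / (beta + mu).
psi_exponential = psi(beta, lambda t: math.exp(-mu * t))

# Fixed infectious period of length 2: closed form 1 - exp(-2 * beta).
psi_fixed = psi(beta, lambda t: 1.0 if t < 2.0 else 0.0)
```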
Throughout we use continuous time branching processes [19, Ch. 6] to approximate the epidemic process. We rely on theory for those processes for which there exists a number α (called the Malthusian parameter, or real-time growth rate) which satisfies $\int_0^\infty e^{-\alpha t}\mu(dt) = 1$, where $\mu(s) = \int_0^s \mu(dt)$ is the expected number of births of children of a particle up to time s, i.e. $\{\mu(s); s \ge 0\}$ defines the mean offspring measure of the branching process.
Below we define and justify a branching process approximation for the early stages of an SIR epidemic. The approximating branching process has mean offspring measure
$\mu'(dt) = E[\tilde D - 1]\,\beta e^{-\beta t} P(L > t)\,dt$. (2.5)
Here $E[\tilde D - 1]$ can be interpreted as the expected number of susceptible neighbours that a "typical" individual infected in the early stages has at the moment he or she gets infected. Following the terminology from epidemiology, we define the basic reproduction number $R_0$ as the expected total number of children of a particle in the branching process:
$R_0 = \int_0^\infty \mu'(dt) = E[\tilde D - 1] \int_0^\infty \beta e^{-\beta t} P(L > t)\,dt = E[\tilde D - 1]\,\psi$. (2.6)
Here we used (2.3) for the last identity. If $R_0 > 1$ the epidemic is supercritical and α exists and is strictly positive. If on the other hand $R_0 < 1$, the process is subcritical and α might exist and if it does, α is strictly negative. If $R_0 = 1$ the epidemic is critical and the corresponding α trivially equals 0.
In epidemic literature R 0 is arguably the most studied quantity (e.g. [13]). It is usually defined as the average number of secondary infections caused by a typical infected individual in the early stages of an epidemic which started in a fully susceptible population. This definition is consistent with (2.6).
The main contribution of this paper is the observation that the final stage of the epidemic can also be approximated by a branching process. The mean offspring measure of this branching process is […] (2.7) This mean offspring measure can be understood intuitively by considering the graph of vertices that have not been infected before the final stage of the epidemic and the edges that connect those vertices. In Section 5.2 we show heuristically that the probability that a vertex which is part of this "remaining graph" has degree k in $G^{(n)}$ is proportional to $p_k (1 - \psi + \psi \tilde q^*)^k$, where $\tilde q^*$ is the asymptotic probability that a neighbour (v say) of a uniformly chosen vertex from $G^{(n)}$ avoids being infected by any of its other neighbours and $1 - \psi + \psi \tilde q^* = \tilde q^* + (1 - \tilde q^*)(1 - \psi)$ is the probability that a given neighbour of v does not transmit the infection to v (either by avoiding being infected itself, or by not transmitting the infection if infected). Some straightforward arguments (see Section 5.2) then show that the expected number of susceptible neighbours that a "typical" individual infected in the final stage of the epidemic has at the moment he or she gets infected is given by $E[(\tilde D - 1)(1 - \psi + \psi \tilde q^*)^{\tilde D - 1}]$.

The main results
In this subsection we state the main results of the paper. Heuristics for these results will be given in Section 5 and rigorous proofs are provided in Sections 6 and 7. We consider an SIR epidemic on the configuration model graph $G^{(n)} = (V^{(n)}, E^{(n)})$ with degrees satisfying Assumptions 2.1. The infectious periods are distributed as L, and neighbours contact each other according to independent Poisson processes with intensity β. Throughout we condition on a major outbreak, which we denote by $M^{(n)}$ and define as an outbreak in which more than log n vertices get infected, i.e.
$M^{(n)} = \{|S^{(n)}(0) \setminus S^{(n)}(\infty)| > \log n\}$.
In this definition the log n term can be replaced by any increasing function which goes to infinity but is o(n). It can be proved that |S (n) (0) \ S (n) (∞)| = Θ(n) on M (n) w.h.p. This can be shown in a similar way as the corresponding result in [5,Thm. 3.5].
Define the time until strong extinction of an epidemic in a population of size n by
$T^*(n) = \inf\{t \ge 0;\ |I^{(n)}(t)| = 0\}$. (2.10)
We also consider the time of weak extinction $T^\dagger(n)$, i.e. the time after which no further infections are possible, because there are no more infected vertices with susceptible neighbours. That is,
$T^\dagger(n) = \inf\{t \ge 0;\ |X^{(n)}(t)| = 0\}$. (2.11)
[…] (2.12)
where for $x \in \mathbb{R}$
$g'(x) = \int_0^\infty e^{-xt} \mu'(dt)$ (2.13)
and $\mu'(dt) = E[\tilde D - 1]\,\beta e^{-\beta t} P(L > t)\,dt$ as defined in (2.5). Note that $x \mapsto g'(x)$ is continuous and decreasing and that $g'(0) = R_0 > 1$ and $g'(x) \to 0$ as $x \to \infty$. Therefore, there is a unique $\alpha' > 0$ such that $g'(\alpha') = 1$ and $\alpha'$ is well defined. This $\alpha'$ corresponds to the real time growth rate of the epidemic in its early stages (see Lemma 4.1 below). If […] and therefore the random graph has w.h.p. a giant component for both of the cases in which we define $\alpha'$. Define
$\alpha^* = \inf\{x \in \mathbb{R};\ g^*(x) < 1\}$, (2.14)
where for $x \in \mathbb{R}$
$g^*(x) = \int_0^\infty e^{-xt} \mu^*(dt)$, (2.15)
and $\mu^*(dt)$ is defined in (2.7). From Claim 7.1 below it follows that […] and thus that $\mu^*(dt)$ is well defined. It is easy to see that either […] or $g'(x) \to \infty$ as $x \to -\infty$. In the same way as for $x \mapsto g'(x)$, note that $x \mapsto g^*(x)$ is continuous and decreasing on $(\alpha^\dagger, \infty)$ and that $g^*(x) \to 0$ as $x \to \infty$. Therefore, $\alpha^*$ is well defined. This $\alpha^*$ corresponds to the real-time decline rate of an epidemic during its final stages (see Lemma 4.3 below).
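Because the transform of the early-stage mean offspring measure is continuous and decreasing, exceeds 1 at 0 and tends to 0 at infinity, its unique root of value 1 can be located by bisection. The sketch below illustrates this under the assumption of an exponentially distributed infectious period with rate `gamma`, in which case the transform is `m * beta / (x + beta + gamma)` with `m` standing for $E[\tilde D - 1]$ and the root is `m * beta - beta - gamma`. The function name and all numerical values are illustrative assumptions.

```python
def malthusian_parameter(g, tol=1e-12):
    """Return the unique x > 0 with g(x) = 1.

    Assumes g is continuous and decreasing on [0, infinity), g(0) > 1 and
    g(x) -> 0 as x -> infinity (the supercritical case).
    """
    if g(0.0) <= 1.0:
        raise ValueError("need g(0) > 1")
    lo, hi = 0.0, 1.0
    while g(hi) > 1.0:      # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:    # standard bisection on a decreasing function
        mid = 0.5 * (lo + hi)
        if g(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values: m plays the role of E[D~ - 1], gamma is the recovery rate.
m, beta, gamma = 3.0, 1.0, 0.5


def g_early(x):
    # Closed form of int_0^inf exp(-x t) m beta exp(-beta t) exp(-gamma t) dt.
    return m * beta / (x + beta + gamma)


alpha_early = malthusian_parameter(g_early)  # closed form: m*beta - beta - gamma
```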
For further use, define […] which can be interpreted as the expected number of infections caused by a typical infected vertex during the final stages of the epidemic.
By standard theory on supercritical branching processes [19], we obtain $\tilde q^* \in (0, 1)$, because $\tilde q^*$ is the extinction probability of a supercritical branching process of which the number of children of a particle is mixed binomially distributed, with "number of trials" distribution $\tilde D - 1$ and "success probability" ψ, and therefore with offspring mean $R_0 > 1$ [4]. By Lemma 2.2 below the branching process defined through $\mu^*(dt)$ is subcritical.
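The extinction probability described above can be computed numerically as the smallest fixed point of the offspring generating function, which for a mixed binomial offspring law with $\tilde D - 1$ trials and success probability ψ is $s \mapsto E[(1 - \psi + \psi s)^{\tilde D - 1}]$. The following sketch iterates this map starting from 0, a standard technique for Galton-Watson processes. The function names, the finite degree distribution and the value of ψ are assumptions chosen so that the answer can be checked by hand: with all degrees equal to 3 and ψ = 3/4, the fixed point equation $s = (1/4 + 3s/4)^2$ has roots 1/9 and 1.

```python
def size_biased(p):
    """Map a degree law {k: P(D = k)} to the law {k: k P(D = k) / E[D]}."""
    mean = sum(k * pk for k, pk in p.items())
    return {k: k * pk / mean for k, pk in p.items() if k > 0}


def extinction_probability(p, psi, iterations=1000):
    """Smallest fixed point of s -> E[(1 - psi + psi*s)^(D~ - 1)].

    Iterating the generating function from 0 converges monotonically to the
    extinction probability of the branching process.
    """
    biased = size_biased(p)
    q = 0.0
    for _ in range(iterations):
        q = sum(w * (1.0 - psi + psi * q) ** (k - 1) for k, w in biased.items())
    return q


# Illustrative case: every vertex has degree 3 and psi = 0.75, so that the
# offspring law is Binomial(2, 0.75) with mean 1.5 > 1.
q_example = extinction_probability({3: 1.0}, 0.75)
```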
Before stating the main theorem, we provide the following lemma, the proof of which is provided in Section 7.
The first main theorem is on the time until strong extinction.
[…] if and only if
$\int_0^\infty e^{(|\alpha^*| - \eta)t} L(dt) < \infty$ for all $\eta \in (0, |\alpha^*|)$. (2.17)
Using this main theorem, we obtain in a straightforward fashion a result for the time until weak extinction as well (see the proof in Section 7.3). […] So, condition (2.17) is equivalent to
$\int_0^\infty e^{(|\alpha^*| - \eta)t} P(L > t)\,dt < \infty$ for all $\eta \in (0, |\alpha^*|)$. (2.18)
This condition guarantees that w.h.p. none of the individuals infected during the epidemic will stay infectious for a time longer than $\log n / |\alpha^*|$. Condition (2.18) gives that Theorem 2.3 provides the (scaled) duration of the epidemic if $P(L > t)$ decays faster than exponentially, but not if $P(L > t)$ decays slower than exponentially or exponentially with rate less than $|\alpha^*|$.

Remark 2.7.
In order to understand the definition of $\alpha^*$ in (2.14) it is good to study $g^*(x)$. This function is strictly decreasing, and we may define […] Recalling the definition of $\mu^*(\cdot)$, we see that for all $x > -\beta$ […] Together this has the following implications. […]
Remark 2.8. Intuition from first passage percolation (e.g. [9,8]) and research on the epidemic curve [20,7,6] suggests that (possibly with some extra conditions on the distributions of the infectious period and degrees) […] log n might converge in distribution to a non-degenerate, a.s. finite random variable. We did not try to prove this or identify which extra conditions would be necessary for such a proof.
In order to prove Theorem 2.3 we use some lemmas. Let […] (2.20) where $\tilde q^*$ is defined through (2.8). Copying the steps of the corresponding result for random intersection graphs as provided in [5, Thm. 3.4] (see also Section 5.2 below for a heuristic branching process interpretation of $q^*$ and Lemma 2.9), we obtain […]

Vaccination
In this section we briefly discuss the effect of vaccination on the duration of an epidemic. We give heuristics on the effect of vaccinating everybody in the population with an all-or-nothing vaccine in uniformly mixing populations and on configuration model graphs. We assume that the vaccination takes place before the outbreak starts. We only consider the case where the vaccination is not enough to make the epidemic process subcritical, i.e. the effective R 0 stays strictly larger than 1.
Recall that with an all-or-nothing vaccine, a vaccinated individual will not be affected by the vaccine (say with probability c ∈ (0, 1]) or will be immune to the infection (with probability 1 − c), independently of the effect of the vaccine on other individuals. In what follows, we decorate quantities associated with the epidemic in such a vaccinated population with a subscript c. Note that c does not stand for (vaccine) coverage, as is sometimes the case in epidemiological literature. We choose to parametrise vaccination in such a way that the epidemic spreads more (and thus the effectiveness of the vaccine decreases) if the parameter increases.
To analyse the spread of the SIR epidemic after the all-or-nothing vaccine is administered, we label the vertices of the random graph as "fully susceptible" or "not susceptible" (depending on the impact of the vaccine on the vertex) before the half-edges are paired. Half-edges receive the same label as the vertex they belong to. Not susceptible vertices (apart from possibly the initially infected vertex) do not contribute to the spread of the disease, while the fully susceptible vertices spread the disease to their fully susceptible neighbours as if they were not vaccinated at all. Therefore, we can model the epidemic in the vaccinated population as if the epidemic is spreading on the graph consisting of the fully susceptible vertices and the fully susceptible half-edges that are paired with other fully susceptible half-edges. So, the all-or-nothing vaccine effectively changes the (fully susceptible) population size to $n_c$, a Bin(n, c) distributed random variable. The limiting degree distribution of the fully susceptible vertices (say $D_c$) becomes a mixed binomial random variable with "number of trials distribution" D and "success probability" c, because each fully susceptible half-edge is paired with a uniformly chosen other half-edge, and the asymptotic fraction of half-edges that is fully susceptible is c. Below we use the trivial observation that $\log(n_c)/\log n \xrightarrow{P} 1$. Furthermore, if $\tilde D_c$ is the size biased version of $D_c$, then […] So, $\tilde D_c - 1$ is mixed binomial with "number of trials distribution" $\tilde D - 1$ and "success probability" c. Using this, and the binomial theorem we deduce that for x ∈ (0, 1), […] Note that if x = 1, then the left and right hand side both give the probability that $\tilde D_c = 2$. […]
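The statement that size biasing the thinned degree and thinning the size biased degree lead to the same law can be checked exactly for a finite degree distribution. The sketch below is only a numerical sanity check; the helper names `thin`, `size_biased` and `shift`, the degree distribution `p` and the value of `c` are assumptions made for the illustration.

```python
from math import comb


def thin(p, c):
    """Law of a mixed binomial: trials distributed as p, success probability c."""
    out = {}
    for k, pk in p.items():
        for j in range(k + 1):
            out[j] = out.get(j, 0.0) + pk * comb(k, j) * c**j * (1 - c) ** (k - j)
    return out


def size_biased(p):
    """Map {k: P(D = k)} to {k: k P(D = k) / E[D]}; mass at 0 disappears."""
    mean = sum(k * pk for k, pk in p.items())
    return {k: k * pk / mean for k, pk in p.items() if k > 0}


def shift(p, by):
    """Law of the random variable plus the constant `by`."""
    return {k + by: pk for k, pk in p.items()}


p = {1: 0.2, 2: 0.5, 5: 0.3}  # illustrative degree distribution
c = 0.7                       # illustrative probability that the vaccine fails

# Law of D~_c - 1: thin the degrees first, then size bias, then subtract 1.
left = shift(size_biased(thin(p, c)), -1)
# Mixed binomial with "number of trials" D~ - 1 and "success probability" c.
right = thin(shift(size_biased(p), -1), c)

max_difference = max(abs(left.get(k, 0.0) - right.get(k, 0.0))
                     for k in set(left) | set(right))
```

The two laws agree up to floating point error, in line with the identity $j \binom{k}{j} = k \binom{k-1}{j-1}$ that underlies the claim.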
[…] we obtain that $T^*_c(n)$, the time until the end of the epidemic after vaccination, satisfies for all ε > 0 […] where $\alpha'_c$ and $\alpha^*_c$ satisfy […] We now discuss the effect of vaccination in a uniformly mixing population (the second example) and in two different configuration model random graphs (the first and third example). In Examples 3.1 and 3.2, $(\alpha'_c)^{-1} + |\alpha^*_c|^{-1}$ decreases as c increases (i.e. the duration of a large outbreak increases as the vaccine becomes more effective), whereas in Example 3.3 the duration of a large outbreak is smaller in a vaccinated population than in an unvaccinated population.

So, filling this in in (3.2) and (3.3) we obtain […]
We assume that the disease is supercritical even after vaccination. So, […] and by Lemma 2. […]
By their definitions in (3.1), $\alpha'_c$ and $|\alpha^*_c|$ are given through […] and it follows that $\alpha'_c$ is increasing in c and $|\alpha^*_c|$ […] which is strictly negative by (3.6) and (3.7). So, both $\alpha'_c$ and $|\alpha^*_c|$ are increasing in c, which implies that $(\alpha'_c)^{-1} + |\alpha^*_c|^{-1}$, and thus the limiting duration of a large outbreak, is decreasing in c (and increasing in the efficacy of the vaccine).
Example 3.2. In a uniformly mixing population, all pairs of individuals contact each other independently of each other at rate β. In order for the model to be interesting and the expected number of contacts per individual to stay constant as n → ∞, we assume β = β′/n. The uniformly mixing population does not satisfy the conditions of our paper, but we can take several approaches to still analyse the uniformly mixing population. One is to derive the branching process approximations used in this paper also for the uniformly mixing population, and use that in Theorem 2.3 the quantities $\alpha'_c$ and $\alpha^*_c$ are the Malthusian parameters for those branching processes.
We, however, use the previous example of analysing an epidemic on a configuration model with a Poisson degree distribution, where β = β′/λ, and where the expected degree λ goes to infinity. It is easily checked that the epidemic generated graph (see e.g. [5]) of the epidemic on the configuration model converges locally to that of the uniformly mixing population.
Note that for all t > 0 we have that as λ → ∞, then $e^{-(\beta'/\lambda)t} \to 1$ and […] We further observe that for […] is the principal branch of the Lambert W function, which is a continuous function of x [11]. Together with (3.11) this implies that […] Filling in β = β′/λ in (3.8) and (3.9) and taking the limit λ → ∞ gives that $\alpha'_c$ and $\alpha^*_c$ […] Assume that c is such that the epidemic is still supercritical after vaccination. That is, assume that inequality (3.6) and thus (3.7) still hold. It is again immediate that $\alpha'_c$ is increasing in c and $|\alpha^*_c|$ is decreasing in $c\tilde q^*_c$. Filling in λψ = β′E[L] in (3.10) we obtain […] which is strictly negative by (3.6) and (3.7). As in Example 3.1 this implies that increasing the efficacy of the vaccination, without making the epidemic subcritical, increases the asymptotic duration of the epidemic.

Example 3.3.
For this example we use the following intuition. Unvaccinated vertices of very high degree are very likely to be infected during the early stages of an epidemic, even if a fraction of their neighbours are vaccinated. Therefore, those vertices will hardly play a role in the duration of the final phase of the epidemic. Vertices that initially have one unvaccinated neighbour cannot be infected and then pass the disease on to other individuals, because the unvaccinated neighbour must be their infector. So, vertices with one unvaccinated neighbour that are still susceptible after the intermediate phase of the epidemic will shorten the final stage of the epidemic. If the infectious period is exponentially distributed, then vertices with two unvaccinated neighbours in the final stages of the epidemic do not lengthen the duration of the epidemic if they get infected, because the neighbour who is not the infector might have been infected before, in which case the number of infectious-susceptible pairs decreases by the infection, or that neighbour is still susceptible, in which case the number of infectious-susceptible pairs stays the same. So, as an example we consider a population in which vertices may have very large degree or have degree 1 or 2, and such that after vaccination of a small proportion of the population the "effective degree distribution" has more mass on 1.
Using this intuition we consider the model with […] Furthermore, let $t_0 \gg 0$ and assume that β = 99/100 and $P(L > t) = e^{-\mu t}$ for $t < t_0$ and $P(L > t_0) = 0$ with µ = 1/100, that is, L is exponentially distributed with a cut-off at $t_0$. This cut-off is needed for $\int_0^\infty t e^{-\alpha^* t} L(dt)$ to be finite. With the above parameters, in the limit $t_0 \to \infty$, we obtain ψ = 99/100. Without vaccination $1/\alpha' + 1/|\alpha^*| = 2.04$, while with 1% of the population vaccinated, i.e. with c = 0.99, we obtain […] That is, vaccinating 1% of the population does not necessarily prevent a large outbreak, and if a large outbreak occurs it ends faster.

The epidemic on the graph

Construction of the graph together with the epidemic
For the proof of the main theorems we rely on the following explicit step-by-step simultaneous construction of the graph $G^{(n)}$ and the epidemic process or, more precisely, on the simultaneous construction of the cluster of vertices of $G^{(n)}$ which are ultimately recovered and the epidemic process. In this construction we see contacts as asymmetric: the times at which v contacts v′ are not necessarily the same as the times at which v′ contacts v, but contacts in both directions occur according to independent Poisson processes with intensity β. Only when an infectious vertex contacts a susceptible neighbour (and not when a susceptible vertex contacts an infectious neighbour) does the susceptible vertex become infectious. Since in both the directed and undirected interpretation of contacts, contacts from an infected to a susceptible neighbour occur at intensity β, the spread of the epidemic is unaltered.
Label the vertices in $V^{(n)}$ by 1, 2, …, n, such that $(d_1, d_2, \dots, d_n)$ is a non-decreasing degree sequence satisfying Assumptions 2.1. Let $s^{(n)}$ be the set containing $\ell^{(n)}$ elements, corresponding to the half-edges used in the construction of $G^{(n)}$, and let $x^{(n)} = ((x_k, x'_k); k \in \mathbb{N})$ be an infinite sequence of (2 dimensional) elements of $s^{(n)}$, where the elements are chosen independently with replacement and uniformly at random. Further define i.i.d. random variables $\tau_{v,j}$, indexed by the half-edges $(v, j) \in s^{(n)}$, which are exponentially distributed with expectation 1/β. We may interpret $\tau_{v,j}$ as the first time after its infection (if this happens) that v makes a contact along the half-edge (v, j).
[…] infectious during the epidemic, we interpret $L_v$ as the infectious period of vertex v. Otherwise, $L_v$ has no epidemiological interpretation. Furthermore, let $x_0$ be the initially infected vertex, which is chosen uniformly at random from the population. All random variables defined in this paragraph are independent of each other.
For reasons that will become clear in the proof of Theorem 2.4, we define
$L'_v = \min\big(L_v, \max_{j \in \mathbb{N}_{\le d_v}} \tau_{v,j}\big)$. (4.1)
So $L'_v$ time units after infection, v has either recovered or has made contacts to all of its neighbours (of which some might have been infected before). This implies that $L'_v$ time units after infection v is no longer the infectious vertex in an infectious-susceptible pair. In particular, if we say that for $v \in V^{(n)}$, vertex v recovers $L'_v$ instead of $L_v$ time units after v got infected, the spread of the epidemic is unaltered.
We define the following process of partitions of the set of half-edges and vertices, in which the half-edges are paired at the moment a contact involving an infectious vertex is made.
In this process $S^{(n)}(t)$, $I^{(n)}(t)$ and $R^{(n)}(t)$ are respectively the sets of susceptible, infectious and recovered vertices at time t. The set $E^{(n)}$ […] be the time that v gets infected, which corresponds to the time at which the first half-edge belonging to vertex v is added to $\{E^{(n)}_P(t); t \ge 0\}$. Throughout the process the sequence $x^{(n)}$ is explored element by element and $x^{(n)}(t)$ is the set of elements of $x^{(n)}$ explored before or at time t.
The construction of {X (n) (t); t ≥ 0} is as follows.
• Start of construction: Choose the initial infected vertex x 0 uniformly at random. So, which can be interpreted as the first time after time t that a step in the process {X (n) (t); t ≥ 0} occurs, by either a recovery of an infected vertex or a pairing of two half-edges and possibly the infection of a vertex. Because the distribution of the "τ random variables" does not have any atoms, the infection times of (infected) vertices are almost surely different and at t + (t) almost surely only one event occurs.
R (t + (t)) and u ∈ R(t + (t)). If t + (t) = σ(u) + τ u,j for some (u, j) ∈ E  consider (x k+1 , x k+1 ), which is the half-edge (u, j) "wants to" be paired with if it is still possible. The half-edge (x k+1 , x k+1 ) is considered explored from time t + (t) on, i.e. (x k+1 , x k+1 ) ∈ x (n) (t + (t)). We distinguish between the following cases for further changes in X (n) (t) at time t + (t) = σ(u) + τ u,j . - none of the other half-edges and none of the vertices changes.
, then take the same steps as above with (x k+1 , x k+1 ) replaced by (x k+2 , x k+2 ) and so on, while treating all considered half-edges as explored.
• Continue the above construction until I (n) (t) = ∅. That is, until there are no infectious vertices left.
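The construction above can be illustrated with a small event-driven simulation. The following is a minimal sketch under simplifying assumptions, not the construction used in the proofs: the function name `simulate_sir`, the argument `sample_period` and the rejection loop are illustrative choices. Drawing uniformly from all half-edges until an unpaired one (other than the contacting half-edge itself) is found plays the role of exploring the sequence x (n) element by element.

```python
import heapq
import random


def simulate_sir(degrees, beta, sample_period, rng):
    """Return (time of the last recovery, number of vertices ever infected).

    degrees[v] is the degree d_v, beta the contact rate per half-edge and
    sample_period(rng) draws one infectious period L_v (assumed interface).
    """
    n = len(degrees)
    half_edges = [(v, j) for v in range(n) for j in range(degrees[v])]
    unpaired = set(half_edges)
    susceptible = [True] * n
    events = []  # heap of (time, kind, vertex, j); kind 0 = recovery, 1 = contact
    n_infected = 0

    def infect(v, t):
        nonlocal n_infected
        susceptible[v] = False
        n_infected += 1
        period = sample_period(rng)
        heapq.heappush(events, (t + period, 0, v, -1))
        for j in range(degrees[v]):
            tau = rng.expovariate(beta)  # tau_{v,j}, expectation 1/beta
            if tau <= period:  # contacts after recovery never happen
                heapq.heappush(events, (t + tau, 1, v, j))

    infect(rng.randrange(n), 0.0)  # initial vertex chosen uniformly
    last_recovery = 0.0
    while events:
        t, kind, u, j = heapq.heappop(events)
        if kind == 0:
            last_recovery = t
            continue
        # Skip if (u, j) was paired earlier or no other half-edge is left.
        if (u, j) not in unpaired or len(unpaired) < 2:
            continue
        while True:  # rejection sampling over uniformly drawn half-edges
            partner = rng.choice(half_edges)
            if partner in unpaired and partner != (u, j):
                break
        unpaired.discard((u, j))
        unpaired.discard(partner)
        if susceptible[partner[0]]:
            infect(partner[0], t)
    return last_recovery, n_infected
```

For example, `simulate_sir([3] * 1000, 1.0, lambda r: r.expovariate(1.0), random.Random(0))` runs one epidemic on a 3-regular configuration model with exponentially distributed infectious periods; the first returned value is the time of the final recovery.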

Branching processes theory background
Throughout the manuscript we use several continuous time branching processes. In this section we summarise some of the results we use in the analysis of the duration of the epidemic. Some of the branching processes that we use are two stage branching processes in the sense that the reproduction law for the ancestor is different from that of the other particles in the process. In the exposition below we use a single stage branching process, but extending the results to two stage branching processes is straightforward. For further theory we refer to [19,Chapter 6] and [16,Chapter 3].
We decorate particles in the branching process with a lifetime, distributed as some [0, ∞]-valued random variable Λ and we assume that P(ξ(Λ) = ξ(∞)) = 1. Let Z(t) be the number of particles in the branching process at time t and Z tot (t) the number of particles born in the branching process up to and including time t. Furthermore, let Z(t; a) be the number of particles alive at time t and of age at most a, i.e. born after time t − a. The following Lemma follows immediately from Theorems 2.1 and 2.4 of [18] and Theorem 5.4 of [22].
where W and W are a.s. finite random variables satisfying If in addition We need the following corollary of (4.3) and (4.4) in this lemma.
Proof. We only provide the proof of (log k) −1T k a.s.
can be proved in an identical way.
Note that because {ξ(t); t ≥ 0} has neither atoms nor multiple points at the same then (4.3) and (4.4) cannot both be true, which finishes the proof.
To approximate the final phase of an epidemic we use a subcritical branching process. For these branching processes equation (2.4) does not necessarily have a solution.
However if it has, then we may obtain some useful results. First note that α < 0. Let the life-length of particles be distributed as Λ. From Theorem 6.2 of [19], we immediately obtain then e |α|t P(Z(t) > 0|Z(0) = 1) converges to a strictly positive and finite limit.
Below we use the following Corollary of this Lemma. Proof. It is enough to prove that for every δ ∈ (0, 1) and as k → ∞, are independent branching processes distributed as the subcritical branching process satisfying Z j (0) = 1. So, and we obtain that So, we obtain which converges to 0, since (1 − ck −1 ) k → e −c as k → ∞ and the proof of the corollary is complete.
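The logarithmic scaling behind this corollary can be checked numerically in a special case. The sketch below assumes a subcritical linear birth-death process (birth rate `birth`, death rate `death`, so that |α| = death − birth), for which the survival probability is known in closed form; this is an illustrative stand-in and not the general branching process treated in the text. It computes the median of the time until k independent copies have all gone extinct and compares it with log k / |α|.

```python
import math


def survival_prob(t, birth, death):
    """P(Z(t) > 0 | Z(0) = 1) for a linear birth-death process, birth < death."""
    a = death - birth  # |alpha|
    decay = math.exp(-a * t)
    return a * decay / (death - birth * decay)


def prob_all_extinct(t, k, birth, death):
    """Probability that k independent copies are all extinct at time t."""
    p = survival_prob(t, birth, death)
    if p >= 1.0:
        return 0.0
    return math.exp(k * math.log1p(-p))


def median_extinction_time(k, birth, death):
    """Bisection for the time t at which prob_all_extinct equals 1/2."""
    lo, hi = 0.0, 1.0
    while prob_all_extinct(hi, k, birth, death) < 0.5:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if prob_all_extinct(mid, k, birth, death) < 0.5:
            lo = mid
        else:
            hi = mid
    return hi
```

In this example `math.exp(a * t) * survival_prob(t, birth, death)` tends to `a / death` as t grows, a strictly positive and finite limit, and `median_extinction_time(k, birth, death) / math.log(k)` approaches `1 / a` slowly, with a correction of order 1 / log k.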

Heuristics
In this subsection we provide some heuristic arguments for Theorem 2.3. If a large outbreak occurs, the epidemic can be subdivided into three phases, which can be roughly described as follows. Let ε > 0 be small. In the initial phase the number of susceptible vertices decreases from n − 1 to (1 − ε)n. In the intermediate phase the number of susceptible vertices decreases from (1 − ε)n to (q * + ε)n. The final phase of the epidemic lasts from the moment that the number of susceptible vertices is (q * + ε)n until there are no more infectious vertices in the population.

The initial and intermediate phase of the epidemic
The primary intuition for the initial phase is that |I(t)|, the number of infectious vertices at time t, and |I(t)| + |R(t)|, the number of vertices infected before time t, are well approximated by a branching process with mean measure given by (2.5) as long as $n^{-1}|S(t)| > 1 - \varepsilon$ for ε > 0 but small. The result of Lemma 2.10 then follows by applying Corollary 4.2 with k = n.
To justify the use of (2.5), assume that the degree of a vertex uniformly taken from the population of size n has exactly the same distribution function as D. Then a newly infected vertex has the size biased degree distribution $\tilde{D}$, because of size biasing effects (see e.g. [14]). Apart from one (the infector) all of the neighbours of this newly infected vertex are susceptible with high probability. A newly infected vertex stays infectious for a random time distributed as L. Neighbours contact each other with intensity β, and if the contact is between a susceptible and an infectious vertex then the susceptible one becomes infected, which can be interpreted as being a child of his or her infector in the approximating branching process. So in an approximating branching process we obtain expression (2.5): $E[\tilde{D} - 1]$ is the expected number of susceptible neighbours of a newly infected vertex, $\beta e^{-\beta t}$ is the density of the time since infection of the first contact with a given neighbour, while P(L > t) is the probability that the vertex is still infectious at this time of first contact. The Malthusian parameter of this approximating branching process is therefore given by (2.12).
In the intermediate phase of the epidemic, |S(t)|, |I(t)|, and the number of infectious-susceptible neighbour pairs are all Θ(n). This implies that changes in $n^{-1}|S(t)|$ occur at a Θ(1) rate and the intermediate phase has duration Θ(1).
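The Malthusian parameter of the approximating branching process can be computed numerically from the three ingredients described above. The sketch below assumes the standard characterisation that α solves $\int_0^\infty e^{-\alpha t}\mu(dt) = 1$ with μ(dt) equal to (expected number of susceptible neighbours) · βe^{−βt} P(L > t) dt; the function names, the Simpson quadrature and the truncation point `t_max` are illustrative choices, and the quadrature is only accurate for smooth survival functions whose tail is negligible beyond `t_max`.

```python
import math


def offspring_transform(alpha, mean_excess_degree, beta, survival,
                        t_max=40.0, steps=4000):
    """Approximate int_0^inf e^{-alpha t} mu(dt) by composite Simpson on [0, t_max].

    survival(t) = P(L > t); steps must be even.
    """
    h = t_max / steps

    def integrand(t):
        return (mean_excess_degree * beta
                * math.exp(-(alpha + beta) * t) * survival(t))

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def malthusian_parameter(mean_excess_degree, beta, survival):
    """Bisection for the positive root alpha in the supercritical case."""
    if offspring_transform(0.0, mean_excess_degree, beta, survival) <= 1.0:
        raise ValueError("not supercritical: the transform at 0 is at most 1")
    # Since P(L > t) <= 1, the transform is at most m*beta/(alpha + beta),
    # so the root lies below (m - 1) * beta.
    lo, hi = 0.0, (mean_excess_degree - 1.0) * beta
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if offspring_transform(mid, mean_excess_degree, beta, survival) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For an exponentially distributed infectious period with rate γ the integral equals mβ/(α + β + γ), so the root is α = (m − 1)β − γ, which gives a convenient check of the numerical routine.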
Our proof of Lemma 2.10, however makes use of the fact that the initial and intermediate phase of the epidemic are, with some extra conditions on D and L, studied by Barbour and Reinert in [7]. They study the evolution of |S( √ n} is the time when √ n vertices are infected or recovered. As a corollary of the results of [7] it follows that for T γ (n) defined as in Lemma 2.10, T γ (n) − (α ) −1 log n converges in distribution as n → ∞. We avoid the extra conditions of [7] at the cost of only being able to study the convergence of T γ (n)/(log n).

The final phase of the epidemic
In order to describe the end of the epidemic more work is required. We use that for 1 − q * − γ > 0 but small, the time interval between T γ (n) and T * (n), none of the quantities n −1 |S (n) (t)| and n −1 |E    neighbours which are still susceptible of newly infected vertices are constant during this final phase. In particular, the degree distribution of a vertex infected during the final phase of the epidemic should be well approximated by the size biased degree distribution of ultimately susceptible vertices, while the fraction of susceptible neighbours of a newly infected vertex in this phase of the epidemic should be well approximated by the fraction of susceptible neighbours of ultimately susceptible vertices. We now find those quantities.
Let D * be a random variable, such that the degree of a uniformly chosen ultimately susceptible vertex converges in distribution to D * as n → ∞. And let p * ss be the probability that a given neighbour of an ultimately susceptible vertex is ultimately susceptible itself. Below we show that p * ss is indeed well defined, and whether a given neighbour of an ultimately susceptible vertex is susceptible is independent of the degree of that vertex.
The end of the epidemic is then described by offspring measure Combining the above with (5.1) and Corollary 4.4 then gives Lemma 2.11.

Degree distribution of ultimately susceptible individuals
In this section we use ideas from [3,4,5]. Although we do not use them explicitly, these ideas are related to susceptibility sets and could likely be expressed in those terms here as well. The arguments of this section are self-contained.
It is important to note that in the epidemic process the event that a vertex is ultimately recovered does not depend on its infectious period, even when infectious periods are random. This fact helps us to derive the probability of a vertex being ultimately susceptible and of degree k (as in [2]), which then yields the degree distribution of the ultimately susceptible individuals.
Assume that a large outbreak occurs, which happens with the same probability as the survival of the branching process approximating the early spread of the epidemic (see e.g. [4]). Recall that there is only one initially infectious individual. So, as n → ∞, the probability that a uniformly chosen vertex is the initial infectious vertex converges to 0. Therefore, the probability that a uniformly chosen vertex v is ultimately susceptible (i.e. it escapes the epidemic) is given by
$$\xi = \sum_{k=0}^{\infty} \xi_k p_k,$$
where $\xi_k$ is the probability that a vertex of degree k does not acquire the infection from any of its neighbours until the end of the epidemic. We denote a neighbour of vertex v by u. Recall that 1 − ψ is the probability that u does not contact v during its infectious period, if u would become infected. Let $\tilde{q}^*$ denote the probability that u escapes the epidemic (we determine $\tilde{q}^*$ later). Then $\xi_k$ is given by
$$\xi_k = \big(1 - \psi + \psi \tilde{q}^*\big)^k.$$
Similarly, $\tilde{q}^*$, the probability that u escapes infection by all of its neighbours other than v, is given by
$$\tilde{q}^* = \sum_{k=0}^{\infty} \tilde{\xi}_k \tilde{p}_k, \tag{5.5}$$
where $\tilde{\xi}_k$ is the probability that a degree k vertex does not acquire the infection from k − 1 given neighbouring vertices and is defined as
$$\tilde{\xi}_k = \big(1 - \psi + \psi \tilde{q}^*\big)^{k-1}. \tag{5.6}$$
Here we consider only k − 1 of the k neighbours of u because we assume that u does not acquire infection from v. Equations (5.5) and (5.6) give that $\tilde{q}^*$ is a solution of
$$\tilde{q}^* = \sum_{k=0}^{\infty} \tilde{p}_k \big(1 - \psi + \psi \tilde{q}^*\big)^{k-1}.$$
In this heuristic argument we claim without proof that $\tilde{q}^*$ is the smallest solution of this identity. So we have an implicit expression for the probability $\tilde{q}^*$ that neighbour u escapes the epidemic. Moreover, from (5.3) we obtain the probability that a vertex of degree k escapes the epidemic. From this we deduce that the probability that an ultimately susceptible individual has degree k (say $p^*_k$) is given by
$$p^*_k = \frac{p_k \xi_k}{\xi},$$
where ξ is a normalising constant and is defined in (5.4). The size biased distribution of the ultimately susceptible individuals is given through
$$\tilde{p}^*_k = \frac{k\, p^*_k}{\sum_{j=0}^{\infty} j\, p^*_j}.$$
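The quantities of this heuristic are easy to evaluate numerically for a given degree distribution. The sketch below is an illustration under stated assumptions: the degree distribution is passed as a finite list `p` with `p[k]` = P(D = k), the escape probability of a neighbour is found by iterating the fixed point map from 0 (the map is increasing, so the iterates converge to the smallest solution), and the function name `escape_probabilities` is hypothetical.

```python
import math


def escape_probabilities(p, psi, tol=1e-14, max_iter=100000):
    """Return (q_tilde, q_star, p_star) for degree distribution p and psi.

    q_tilde: probability that a neighbour escapes the epidemic,
    q_star:  probability that a uniformly chosen vertex escapes,
    p_star:  degree distribution of ultimately susceptible vertices.
    """
    mean_degree = sum(k * pk for k, pk in enumerate(p))
    # Size biased degree distribution: probability of degree k is k p_k / E[D].
    p_tilde = [k * pk / mean_degree for k, pk in enumerate(p)]
    q_tilde = 0.0
    for _ in range(max_iter):
        base = 1.0 - psi + psi * q_tilde
        new = sum(ptk * base ** (k - 1)
                  for k, ptk in enumerate(p_tilde) if k >= 1)
        converged = abs(new - q_tilde) < tol
        q_tilde = new
        if converged:
            break
    base = 1.0 - psi + psi * q_tilde
    xi = [base ** k for k in range(len(p))]  # escape probability given degree k
    q_star = sum(pk * xk for pk, xk in zip(p, xi))
    p_star = [pk * xk / q_star for pk, xk in zip(p, xi)]
    return q_tilde, q_star, p_star
```

For a Poisson degree distribution with mean λ the size biased degree minus one is again Poisson with mean λ, so the fixed point equation reduces to q = exp(−λψ(1 − q)) and the two escape probabilities coincide, which can be used as a sanity check.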

Fraction of ultimately susceptible neighbours of an ultimately susceptible vertex
Let v be an arbitrary vertex of degree k and u one of its neighbours. We compute the fraction of neighbours of an ultimately susceptible individual which are also ultimately susceptible as the following conditional probability:
$$p^*_{ss}(k) = P(u \text{ is ultimately susceptible} \mid v \text{ is ultimately susceptible}) = \frac{P(v \text{ and } u \text{ are ultimately susceptible})}{P(v \text{ is ultimately susceptible})} = \frac{\tilde{q}^* \tilde{\xi}_k}{\xi_k} = \frac{\tilde{q}^*}{1 - \psi + \psi \tilde{q}^*}. \tag{5.10}$$
Note that this probability is independent of the degree k of vertex v and therefore we can write $p^*_{ss}(k) = p^*_{ss}$. To understand (5.10), recall that $\tilde{q}^*$ is the probability that the initially susceptible neighbour u escapes the infection from all its neighbouring vertices, apart from possibly v, $\tilde{\xi}_k$ is the probability that v escapes infection from all of its neighbours, apart from possibly u, and $\xi_k$ is the unconditional probability that vertex v does not acquire the infection until the end of the epidemic.

Proof of Lemma 2.10
We split the proof into two lemmas which trivially imply Lemma 2.10. Note that E[D 2 ] = ∞ implies α = ∞ and the equivalent of Lemma 6.2 is meaningless.
Lemma 6.1 still holds in that case.
Proof of Lemma 6.1. Assume first that D (n) has uniformly bounded support, that is, there exists K > 0 such that P(D (n) > K) = 0 for all n ∈ N. Furthermore, assume that there exists L max ∈ (0, ∞) such that P(L > L max ) = 0, i.e. we assume that L has bounded support. Under those assumptions the conditions of [7, Thm. 3.3] are satisfied. Note that in the notation of [7], λ is the Malthusian parameter (α in our notation) and N is the population size (n in our notation). It is easily deduced from equation (3.11) and the definition of $\tau_N$ on page 27 of [7] that $\tau_N / \log N \xrightarrow{P} 1/(2\lambda)$ on M (n) . Finally, the expression $\hat{s}_l(u)$ in [7] is independent of N for all l ∈ {1, 2, · · · , K}. Translating the notation of [7, Thm. 3.3] to our notation we obtain as an immediate corollary that for every γ ∈ (0, 1 − q * ) and every δ > 0,
$$\frac{1}{n}\Big|S^{(n)}\big((\alpha^{-1} + \delta)\log n\big)\Big|\,\mathbb{1}(M^{(n)}) < q^* + \gamma \quad \text{w.h.p.}$$
To obtain the results without the extra conditions, let K = K(δ) be a large constant satisfying some properties specified later. Mark (before the pairing) all half-edges belonging to vertices with degree strictly larger than K. By assumptions (A1) and (A2) one can make the fraction of half-edges that are marked arbitrarily small by choosing K and n large enough. The next step is to pair all half-edges (ignoring whether they are marked or unmarked) uniformly at random as before. Then delete all edges which contain at least one marked half-edge. If a fraction δ 1 = δ 1 (K) of the half-edges is marked then the remaining degree distribution of the graph is a Mixed Binomial distribution with random "number of trials parameter" D (n) 1(D (n) ≤ K) and "probability parameter" 1 − δ 1 . Let D In particular, we obtain that In addition we consider an epidemic on the newly created (thinned) graph with infectious period distributed as min(L, L max ) = L1(L < L max ) + L max 1(L ≥ L max ).
So, in the new model we have deleted some edges and shortened some infectious periods, which means that the epidemic spreads faster in the original model than in the new model.
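The thinning step can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, the uniform pairing is realised by shuffling the list of half-edges and pairing consecutive entries, and an odd leftover half-edge is simply left unpaired.

```python
import random


def thin_pairing(degrees, K, rng):
    """Pair all half-edges uniformly at random, then delete every edge that
    contains a half-edge of a vertex with degree strictly larger than K.

    Returns the list of degrees in the thinned graph.
    """
    # One entry per half-edge, labelled by the vertex it belongs to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    thinned = [0] * len(degrees)
    for a, b in zip(half_edges[0::2], half_edges[1::2]):
        if degrees[a] <= K and degrees[b] <= K:  # neither half-edge is marked
            thinned[a] += 1
            thinned[b] += 1
    return thinned


def truncate_period(period, l_max):
    """Shortened infectious period: L 1(L < L_max) + L_max 1(L >= L_max)."""
    return min(period, l_max)
```

Every vertex with degree above K ends up isolated, and no degree increases, which is the monotonicity used to compare the original epidemic with the one on the thinned graph.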
For this new epidemic we deduce from (2.12) that the Malthusian parameter is the while lim x→∞ f (x, L max ) = 0 for all L max ∈ (0, ∞]. It follows that the solution of (6.1) converges to α as K → ∞ and L max → ∞. In particular, for every δ > 0, there exists K 0 < ∞ and L 0 < ∞ such that for all K > K 0 and L max > L 0 , the x ∈ R solving (6.1) satisfies 1/x < 1/α + δ/2. So, by choosing L max and K large enough (but finite), we are in the realm of [7, Thm. 3.3] and for the corresponding model we obtain that for every γ ∈ (0, 1 − q * ) and δ > 0 with high probability it holds that, 1 n S (n) 1 α + δ/2 + δ/2 log n < q * + γ , which finishes the proof of Lemma 6.1.
Proof of Lemma 6.2. In order to prove the lemma we prove the following stronger statement: The number of vertices affected by the epidemic up to time $\frac{\log n}{\alpha + \delta}$ satisfies $n - \big|S^{(n)}\big(\frac{\log n}{\alpha + \delta}\big)\big| = o(n)$ with high probability for all δ > 0. Now, for δ 1 > 0, let As before, because R 0 > 1, we know that α 1 exists and is positive for all δ 1 ≥ 0 and is continuous and increasing in δ 1 on [0, ∞). In particular, for every δ > 0, we can and do choose δ 1 > 0 such that α 1 (δ 1 ) < α + δ/2. We use the notation of Section 4.1, where the vertices in V (n) are labelled such that the degree sequence d 1 , d 2 , · · · , d n is non-decreasing. We also use that 2 (n) = O(n) by assumption (A3) and the assumption $E[D^2] < \infty$ (or equivalently $E[\tilde{D}] < \infty$).
That is, D (n) (x; i) is the degree distribution of the vertices not chosen in the first i elements of x.
Note that for all i ≤ i 0 = 1 n , the random variable D (n) (x; i) is stochastically dominated by D (n) ( 1 ), which is defined through So, D (n) ( 1 ) is the degree distribution of a vertex chosen uniformly from the n − i 0 vertices with highest degrees, which is stochastically increasing in i 0 . LetD (n) ( 1 ) be the size biased variant of D (n) ( 1 ). It follows that Note that (apart from possibly x 0 ), in the construction of {X (n) (t); t ≥ 0} the degree of a vertex added to V (n) \ S (n) (t) is stochastically smaller thanD (n) ( 1 ), as long as t < t 0 , where t 0 = t 0 ( 1 ) = max{t > 0; |x (n) (t)| ≤ i 0 }. That is, up to we explore the i 0 -th vertex, the number of vertices in V (n) \ S (n) (t) is less than the number of particles in a branching process with offspring measure Denote the number of particles in this branching process at time t (t ≥ 0) by Z (n) (t).
In particular, note that For notational convenience we continue using z below.
Proof of Lemma 2.2. Recall the definition of α † from (2.19). By Claim 7.1 µ * (dt) is well defined. In Remark 2.7 we argued that α * ≥ α † ; α † < 0 and the function g * (x) is continuous and strictly decreasing for x > α † . This implies that g * (x) is continuous and strictly decreasing on the non empty interval (−α † , 0) and α * ∈ [−α † , 0) if and only if To show that R * 0 < 1, observe that the function is convex and analytic on x ∈ [0, 1] and has derivative Furthermore, by the definition ofq * (see (2.8)) and the convexity of g(·),q * and 1 are the only two solutions of the equation g(x) − x = 0 in [0, 1]. We recall thatq * < 1. Because g(x) − x is convex, we know that the function g(x) − x has to be negative between its two zeros (i.e. betweenq * and 1). This, together with d dx g(x)| x=q * = R * 0 (by (2.16)) implies that R * 0 < 1, which finishes the proof.

Time until the end of the epidemic
In this section we use the construction of the epidemic generated graph as presented in Section 4.1. We restrict ourselves to major outbreaks. Our approach is to define a random time t 1 = t (n) 1 , when the fraction of susceptible vertices among all vertices is larger than, but close to, its asymptotic value and sandwich (w.h.p.) the process 1 } between two branching processes and then find the time until those branching processes go extinct.
Let {X (n) (t); t ≥ 0} be as in Section 4.1. In our analysis below we consider |E (n) For all n, k ∈ N define the constantd Observe that by definitiond ≥ P(D (n ) = k) for all n ∈ N ≥n and for all k ∈ N.
(n) k may be strictly larger than 1 and there is no reason to assume that ∞ k=1d (n) k converges to 1 as n → ∞.
where the infimum of an empty set is ∞. Let t (n) and define the event A

|E
For all ∈ (0, ψ(1 −q * )), it holds that P(A (n) ( )|M (n) ) → 1 and there exists c 1 > 0, such that P(|S (n) (t The proof of this lemma is provided in Appendix A. Now we are almost ready to prove Lemma 2.11. In the proof we consider who infected whom, and since individuals can be infected only once, this leads to a tree representation EJP 26 (2021), paper 112. of the infection process: the infection tree. For u, v ∈ V (n) , if v is infected by u then u is the infector of v and we write u = ζ(v). We say that vertex u is an ancestor of v if there is j ∈ N and there are vertices u = v 0 , v 1 , · · · , v j = v ∈ V (n) such that for i ∈ N ≤j , v i−1 = ζ(v i ). To be complete we say that v is an ancestor of itself.
Let v be a vertex infected at time σ(v). Then define {J ; v is an ancestor of u}.
, be the set of vertices infected after time t, of which the infector is infected before time t, i.e.
In the language of [9], V Note that in Lemma 7.5 condition (2.17) is not needed, since (2.17) is only needed to guarantee that vertices do not stay infectious for too long and as such only needed for stochastic upper bounds on the duration of the epidemic.
Proof of Lemma 7.4. We divide the proof in the following steps 1. Show that there exists with high probability a constant γ > 0 such that for v ∈ V (n) * (T γ (n)) and for δ ∈ (0, |α * |), we can construct a branching process which dominates {J
Recall the notation from Section 4.1 Let v be a vertex infected at time t. Then v has a random degree with distribution defined through π (n) ≥k (t). One of the d v half-edges attached to v is paired at time t, while the other d v − 1 are still unpaired at time t. Let L v be the infectious period of v and without loss of generality we can assume that v was infected through half-edge (v, d v ). So, τ v,1 , τ v,2 , · · · τ v,dv−1 are the independent exponentially distributed random variables with expectation 1/β assigned to the different unpaired half-edges of v.
contact is made along the half-edge (and the half-edge is paired) and the contact made at time t + τ v,i is with a susceptible with probability κ (n) (t + τ v,i −). By (7.10), (7.11) and (7.12) we thus obtain that for all v ∈ V  where Y + k ( ) is a Bernoulli random variable with success probability κ + ( ), τ 1 , τ 2 , · · · are exponential distributed random variables with expectation 1/β and all defined random variables are independent.
So, the new branching process also has Malthusian parameter α * .
The mean offspring measure of the branching process with reproduction process {ξ (n),+ (t); t ≥ 0} is then defined through where κ + ( ) is defined in (7.11) and .
By α + > α * ≥ α † and the second inequality in (7.18) we obtain by the definition of α † that ∞ 0 e −α + t µ * (dt) < ∞ and thus that Finally, It follows from Claim 7.1 that the quotient of the expectations is finite, while the integral is finite by condition (iii). So condition (iv) is met.
2. Show that there exists γ > 0 and δ > 0 such that the dominated branching process satisfy the conditions of Lemma 4.3.
3. Show that there exist c 1 > 0 such that 4. Show that for every δ ∈ (0, 1), there exist γ > 0, such that Step 1: Let > 0 be small and chosen appropriately later. Recall the definitions (7.8) and (7.9). For t (n) 1 ( ) as in Lemma 7.2, and t > t (7.21) where the last equality is the definition of κ − ( ). Similarly, for t > t  (7.22) where again the last equality serves as a definition. Further, let P(D − ( ) > 1/ ) = 0. So,D − ( )1(A (n) ( )) is stochastically dominated by the random variable defined through π (n) ≥k (t) for t > t (n) 1 ( ). As in the proof of Lemma 2.10, if v is a vertex infected at time t, then v has degree distribution defined through π (n) ≥k (t). One of the d v half-edges attached to v is paired at time t, while the other d v − 1 are still unpaired at time t. Again as in the proof of Lemma 2.10, let L v be the infectious period of v and let τ v,1 , τ v,2 , · · · τ v,dv−1 be independent exponentially distributed random variables with expectation 1/β assigned to the different unpaired half-edges of v. If τ v,i ≤ L v then a contact made by v at time t + τ v,i is with a susceptible with probability κ (n) (t).
Let L = min(L, 1/ ), be a random variable representing a life length, which is distributed as L cut off at length 1/ . By (7.21) and (7.22) we obtain that for all v ∈ V  where Y − k ( ) is a Bernoulli random variable with success probability κ − ( ), τ 1 , τ 2 , · · · are exponentially distributed random variables with expectation 1/β and all defined random variables are independent.
This concludes Step 1 of the proof.
Step 2: In this step we wish to show that there exists > 0 such that We note first that since µ − (·) has mass on a bounded interval, (i) and (iii) are trivially satisfied. Similarly, because L has bounded support (ii) is also satisfied. Finally, It follows from Claim 7.1 that the quotient of expectations is finite, while the integral is trivially finite. So (iv) is met.
Recall that {J v (t); t ≥ 0} dominates a branching process with mean offspring measure {µ − (t); t ≥ 0}. Consider a sequence of i.i.d. copies of this branching process indexed by v ∈ |V (n) * (T γ (n))| and let Z − ,v (t) be the number of alive particles in the copy indexed by v at time t. So, |I (n) (t)| is stochastically larger than v∈V (n) * . By the independence of the branching processes we then obtain that For the second inequality we used that {Z − ,k (t) = 0} is increasing in t. Using the above gives us for all c 1 ∈ (0, 1) ). For sufficiently large n we have log c1n |α − |+δ/3 ≥ log n |α − |+δ/2 and thus we obtain By step 3 we also know that there exists c 1 ∈ (0, 1) such that P(|V (n) * (T γ (n))| ≤ c 1 n | M (n) ) → 0.
So, from (7.27) we obtain P |I (n) T γ (n) + log n By |α − | < |α * | + δ/2 we then obtain This in turn leads to and the proof is complete.

Proof of Theorem 2.4
In this section we show how Theorem 2.4 follows relatively straightforwardly from the proof of Theorem 2.3.
We prove it by showing that for every η > 0, with high probability no vertex in the population is infectious while having at least one susceptible neighbour for a period of at least $(|\alpha^*| - \eta)^{-1} \log n$. Furthermore, we show that for all vertices infected after time T γ (n) (as defined in (2.21)) condition (2.18) is satisfied if we replace the infectious period L by the shortened period defined in (4.1). Then we can use the proof of Lemma 7.4 with this replacement for L, while Lemma 7.5 holds irrespective of the distribution of L.
Our next step is to observe that the epidemic spread does not change if for all v ∈ V (n) we say that v recovers L v instead of L v time units after the infection time σ(v). In the first step of the proof of Lemma 7.4 we can then replace L by a random variable L , with a distribution defined through We further use that by Claim 7.1 E D (n),+ ( ) < ∞.
We may apply Lemma 7.4 with L replaced by L and check whether condition (2.18) holds: which is indeed finite by (7.29).

Concluding remarks
In this manuscript we obtain asymptotic results on both the time of strong extinction T * (n) and the time until weak extinction T † (n) of an SIR epidemic on a configuration model random graph with n vertices. In these concluding remarks we only consider the time of strong extinction; results for weak extinction are similar. We show that, conditioned on a large outbreak and under some further mild conditions, T * (n)/ log n converges in probability to $\alpha^{-1} + |\alpha^*|^{-1}$, where α and α * are the Malthusian parameters of branching processes that approximate respectively the early phase and the final phase of the epidemic. As opposed to much theory on epidemics on networks, we do not have to assume that the infectious period is exponentially distributed or that the asymptotic degree distribution has finite variance. We expect that, conditioned on a large outbreak and under some further conditions, $T^*(n) - (\alpha^{-1} + |\alpha^*|^{-1}) \log n$ converges in distribution to a random variable with finite mean and variance. We also expect that showing this might require quite some extra work. In particular, in the proof of Lemma 7.4 we use that we have useful bounds for the infection times of vertices that are infected after time T γ (n) (see (2.21)), but of which the infector was infected before that time. If we want to obtain convergence results for $T^*(n) - (\alpha^{-1} + |\alpha^*|^{-1}) \log n$, we need more detailed knowledge of the asymptotic distribution of those infection times, which we anticipate is possible but hard to obtain if the infectious periods are not exponentially distributed. Furthermore, we need to be more precise in coupling the initial and final phase of the epidemic with branching processes, even if the infectious periods are exponentially distributed. For the early phase we can use the results from [7] or [20] if we adopt the assumptions underlying the results of those papers.
To obtain similar results for the final phase of the epidemic we need to work with subcritical branching processes and epidemics, which we consider beyond the scope of this paper. We expect that if the infectious periods of vertices satisfy some extra conditions (e.g. that they are almost surely bounded) and if the conditions of [7] on the degree distribution are met, then for every deterministic function f (n) that converges to 0. We leave exploring what conditions are exactly needed to future work. We also consider the impact of vaccination. We show that if vaccination is not sufficient to prevent a large outbreak, it will often (but not always) increase the duration of an SIR epidemic in a large enough population. We note that this result might be of no direct value in real world applications, because we only show that with insufficient vaccination the time until strong extinction divided by log n converges to a larger constant than if there is no vaccination at all. Because the growth of log n is slow, it is likely that the lower order terms for the duration are still important for considerably large n, and perhaps even for population sizes exceeding the global population.
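To illustrate how slowly the leading term grows, the following sketch evaluates $(\alpha^{-1} + |\alpha^*|^{-1}) \log n$ for a range of population sizes. The function name and the parameter values (α = 1 and α* = −0.5) are hypothetical and chosen for illustration only; they are not derived from any model in this paper.

```python
import math


def leading_order_duration(n, alpha, alpha_star):
    """Leading term (1/alpha + 1/|alpha_star|) * log(n) of the duration."""
    return (1.0 / alpha + 1.0 / abs(alpha_star)) * math.log(n)


if __name__ == "__main__":
    # Hypothetical Malthusian parameters, for illustration only.
    for exponent in (3, 6, 9):
        n = 10 ** exponent
        print(n, leading_order_duration(n, 1.0, -0.5))
```

Going from a population of a thousand to a population of a billion only triples the leading term, so lower order terms of constant size can remain comparable to it over this entire range.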
We assume throughout the manuscript that the epidemic is initiated by a single, uniformly chosen infected vertex. In our proofs we do not need this assumption and our theorems are still valid if the sum of the degrees of the initially infectious vertices converges in distribution to some almost surely finite random variable, say D int . This is because we condition on a large outbreak in all of our results, and the Malthusian parameter of the approximating branching process is not dependent on how many particles there are in the first generation of the branching process. If D int and D have the same support then our results are immediate since the number of infections and the times of infections by initial infectious vertices can be coupled, such that they correspond with positive probability, and all our results involve convergence in probability of the duration of an epidemic.

A Proof of Lemma 7.2
In this appendix we prove Lemma 7.2. We repeat some definitions and the statement of the Lemma.
We also defined the event A (n) 2 ( ), which is the event that the following holds.
Our strategy is now as follows. First we show that P(K 1 < |x (n) (∞)| < K 2 |M (n) ) → 1 (A.1) and that there exists c 1 > 0, such that P |S (n) (t (K 1 ))| − |S (n) (∞)| > c 1 n|M (n) → 1. and that (i) the number of half-edges that belong to vertices in V (n) that have none of their half-edges among the first K 2 elements of x (n) exceeds with high probability E[(z − )D] (n), (ii) the number of half-edges that are themselves or are paired with half-edges among the first K 2 elements of x (n) is with high probability less than (n) − 1 − (z − ) 2 (n), (iii) for every k ∈ N, the number of half-edges that belong to vertices of degree at least k in V (n) that have none of their half-edges among the first K 2 elements of x (n) is with high probability at least ∞ j=k njP(D (n) = j)(z − ) j . Together this proves the Lemma. Because the elements of x (n) are i.i.d. and uniform among all (n) half-edges, we have by well-known properties of the Poisson distribution (see e.g. [24, p. 317]) that the number of times a given half-edge is among the first K 1 (resp. K 2 ) elements of x (n) is Poisson distributed with expectation | log(z + /2)| (resp. Poisson distributed with expectation | log(z − /2)|) and independent for different half-edges. This implies that the events that different half-edges are not among the first K 1 elements of x (n) are independent and have probability e −| log(z+ /2)| = z + /2. Similarly, the events that different half-edges are not among the first K 2 elements of x (n) are independent and have probability z − /2. So, the probability that none of the
half-edges belonging to a uniformly chosen vertex is part of the first K 1 elements of x is given by ∞ k=0 P(D (n) = k)(z + /2) k → ∞ k=0 P(D = k)(z + /2) k , and the probability that none of the half-edges belonging to a uniformly chosen vertex is part of the first K 2 elements of x is given by In a similar fashion we obtain that the probability that a uniformly chosen half-edge belongs to a vertex of which none of the half-edges is part of the first K 1 elements of x (n) is given by ∞ k=0 P(D (n) = k)(z + /2) k and the probability that a uniformly chosen half-edge belongs to a vertex of which none of the half-edges is part of the first K 2 elements of x (n) is given by ∞ k=0 P(D (n) = k)(z − /2) k .
Using the same law of large numbers argument as above we obtain that the fraction of half-edges belonging to vertices of which none of the half-edges is part of the first K 1 elements of x (n) converges in probability to ∞ k=0 P(D = k)(z + /2) k , which is strictly less than E[(z + )D]. Similarly, the fraction of half-edges belonging to vertices of which none of the half-edges is part of the first K 2 elements of x (n) converges in probability Now we turn our attention to |E (n) P (t)|. In G (n) , all half-edges are paired uniformly at random. For a half-edge not to be part of E (n) P (t), neither the half-edge itself nor its partner should be part of x (n) (t). Again the probability that two given half-edges are not among the first K 1 (resp. K 2 ) elements of x (n) is (z + /2) 2 (resp. (z − /2) 2 ). So using (A.1) and the above law of large numbers again we obtain (n) − |E   (j ∈ N) are independent binomially distributed random variable with parameters nP(D (n) = j) and (z + /2) j .
We want to show that P(t (K 1 ) > t  where we have used D (n) → D and that ∞ j=k j 2 (z + ) j → 0 as k → ∞. The quotient in the right hand side of (A.7) is independent of n and by the assumption P(D ≥ k) > 0 it is also finite. So, In other words, for all k 0 ∈ N and all 1 > 0 there exists n 0 ∈ N such that for all n > n 0  0 ( ) = k ∈ N >k0 ; nP(D (n) = k)(z + 3 /4) k ≤ 1 . We should prove that for every 1 > 0, there exists k 0 ∈ N such that  Let k 0 be such that (z+ ) k 0 (z+ /2) k 0 > 7. Further assume that k 0 was chosen such that k 0 > k 0 .
To prove that