Limit distributions of branching Markov chains

We study branching Markov chains on a countable state space (space of types) $\mathscr{X}$, with the focus on the qualitative aspects of the limit behaviour of the evolving empirical population distributions. No conditions are imposed on the multitype offspring distributions at the points of $\mathscr{X}$ other than that they have the same average and satisfy a uniform $L \log L$ moment condition. We show that the arising population martingale is uniformly integrable. Convergence of population averages of the branching chain is then put in connection with stationary spaces of the associated ordinary Markov chain on $\mathscr{X}$ (assumed to be irreducible and transient). This is applied, in particular, to the boundaries of appropriate compactifications of $\mathscr{X}$. The final part addresses the general interplay between the measure theoretic boundaries of the branching chain and of the associated ordinary chain.


Introduction
There is a large body of literature devoted to the quantitative aspects of branching random walks on the additive group of real numbers and to the behaviour of the associated martingales; see, e.g., Shi [Shi15] and the references therein. For more general state spaces that are rich enough to have a non-trivial topological boundary at infinity (for instance, infinite trees), it is natural to ask about the limit behaviour of the branching populations in geometric terms. Non-trivial limit sets of random population sequences were first exhibited by Liggett [Lig96] for branching random walks on regular trees. This was pursued further for branching random walks on free groups by Hueter-Lalley [HL00], on more general free products by Candellero-Gilch-Müller [CGM12], and very recently on hyperbolic groups by Sidoravicius-Wang-Xiang [SWX20]. See also Benjamini-Müller [BM12, Section 4.1] for a number of conjectures concerning the trace and limit sets of branching random walks. Some answers were given by Candellero-Roberts [CR15] and Hutchcroft [Hut20].
We are looking at branching random walks from a different and apparently novel angle. We are interested in the random limit boundary measures arising from the empirical distributions of sample populations. Unlike with the limit sets, the very existence of the limit measures is already a non-trivial problem. We consider and solve this problem in two different setups: the topological one (when the boundary of the state space is provided by a certain compactification) and the measure-theoretical one (when we are dealing with the Poisson or exit boundary of the underlying Markov chain on the state space).
Date: May 2022. 2020 Mathematics Subject Classification. 60J10; 60J80; 60J50, 31C20. The second author was supported by Austrian Science Fund project FWF: P31889-N35 during a visit at University of Ottawa in 2019.
Definition 0.1. Let $X$ be a countable set, the state space.
(a) A population on $X$ is a finitely supported function $m : X \to \mathbb{Z}_+$, also viewed as a multiset, so that $x \in m$ means that $m(x) > 0$, and $m(x)$ is the number of particles (members of the population) situated at $x \in X$. Thus, the same location can be shared by several particles. We emphasise the difference between a population $m \in \mathcal{M} = \mathbb{Z}_+[X]$ (which is a multiset) and its support $\operatorname{supp} m = \{x \in X : m(x) > 0\}$, which is a plain subset of $X$.
(b) A branching Markov chain is a time homogeneous Markov chain on the space of populations $\mathcal{M}$ whose transitions $m \rightsquigarrow m'$ are determined by a family of branching probability distributions $\Pi_x$, $x \in X$, in the following way: each particle of the population $m$ is replaced with a population independently sampled from the distribution $\Pi_x$ determined by the position $x$ of the particle; the result of this procedure is the population $m'$. That is, for $y \in X$, $m'(y)$ is the sum of the random offspring numbers which each $x \in m$ places at position $y$.
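The transition rule of Definition 0.1(b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: populations are stored as `collections.Counter` objects (multisets), and the sampler `binary_nearest_neighbour` is a hypothetical branching distribution on $X = \mathbb{Z}$ chosen for the example, not one taken from the text.

```python
import random
from collections import Counter

def branching_step(m, sample_offspring, rng):
    """One transition m -> m': every particle at x is replaced by an
    independently sampled offspring population (a Counter over states)."""
    new = Counter()
    for x, multiplicity in m.items():
        for _ in range(multiplicity):      # each particle branches on its own
            new.update(sample_offspring(x, rng))
    return new

def binary_nearest_neighbour(x, rng):
    """Hypothetical Pi_x on X = Z: two children, each placed at x-1 or x+1."""
    return Counter(x + rng.choice((-1, 1)) for _ in range(2))

rng = random.Random(0)
m0 = Counter({0: 1})                        # a single particle at the origin
m1 = branching_step(m0, binary_nearest_neighbour, rng)
m2 = branching_step(m1, binary_nearest_neighbour, rng)
```

Here `m1[y]` is the number of children placed at `y`, in line with the last sentence of the definition; the support of a population is simply `set(m)`.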
The elements of the state space $X$ are often referred to as types, and then one talks about multi-type branching processes (rather than branching Markov chains). They were first considered by Kolmogorov [Kol41]. The explicit general definition was given by Harris [Har63, Section III.6]. There is an ample literature on multi-type branching processes, in discrete as well as continuous time. The reader is referred to the survey by Ney [Ney91] for a historical account and for general information on this field. More relevant literature will be outlined further below.
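The reason for our choice of terminology is that we take a geometrical point of view and have in mind a spatial structure of $X$. Our branching Markov chain is a sequence $\mathbf{M} = (M_n)_{n \ge 0}$ of random populations, and we are interested in its evolution in that space. In particular, we are interested in the behaviour of the sequence of empirical distributions

$$\overset{\frown}{M}_n = \frac{1}{\|M_n\|}\, M_n \in \operatorname{Prob}(X), \tag{0.2}$$

where $\|m\| = \sum_{x} m(x)$ denotes the size (total mass) of a population $m$. The offspring distribution $\pi_x$ at $x$ is the distribution of the size of a population sampled from $\Pi_x$:

$$\pi_x(k) = \Pi_x \{ m \in \mathcal{M} : \|m\| = k \}, \qquad k \in \mathbb{Z}_+ . \tag{0.3}$$

The branching ratio $\overline{\pi}_x$ at $x$ is the first moment of $\pi_x$ (the expected offspring number at $x$), and $\pi_{x,y}$ denotes the expected offspring number which is placed at $y \in X$ under the distribution $\Pi_x$. A branching Markov chain gives rise to the underlying (also called base) "ordinary" Markov chain on the state space $X$ whose transition operator (matrix) $P$ is given by

$$p(x, y) = p_x(y) = \pi_{x,y} / \overline{\pi}_x . \tag{0.4}$$

We write $p^{(n)}(x, y)$ for its $n$-step transition probabilities and $G(x, y) = \sum_{n=0}^{\infty} p^{(n)}(x, y)$ for the associated Green function.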
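To make the passage from a branching distribution to the base chain concrete, here is a small exact computation. It is a sketch under assumptions: a finitely supported $\Pi_x$ is encoded as a dictionary from populations (tuples of `(site, multiplicity)` pairs) to probabilities, and the particular distribution `Pi_0` is hypothetical. The code computes the expected offspring numbers $\pi_{x,y}$, the branching ratio, and the transition probabilities of (0.4).

```python
from fractions import Fraction

def barycentre(Pi):
    """Expected offspring numbers pi_{x,y} = sum over m of Pi(m) * m(y)."""
    bary = {}
    for population, prob in Pi.items():
        for y, count in population:
            bary[y] = bary.get(y, Fraction(0)) + prob * count
    return bary

def base_transition(Pi):
    """Return (branching ratio, transition probabilities p_x) as in (0.4)."""
    bary = barycentre(Pi)
    ratio = sum(bary.values())              # expected total offspring number
    return ratio, {y: v / ratio for y, v in bary.items()}

# Hypothetical branching distribution at x = 0 on X = Z.
Pi_0 = {
    ((1, 1),): Fraction(1, 2),              # one child at site 1
    ((-1, 1), (1, 2)): Fraction(1, 2),      # one child at -1 and two at 1
}
ratio, p_0 = base_transition(Pi_0)
```

In this example the branching ratio is 2, and the base chain moves from 0 to 1 with probability 3/4 and to -1 with probability 1/4.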
Our basic assumptions, beginning with Section 2, are the following.

(NE) The population cannot die out and has non-trivial branching, that is, $\pi_x(0) = 0$ and $\pi_x(1) < 1$ for all $x \in X$.

(BR) The branching ratio is constant and finite, i.e., there is $\rho < \infty$ such that $\overline{\pi}_x = \rho$ for all points $x \in X$.

(TC) The underlying Markov chain is transient, and all its states communicate, i.e., $0 < G(x, y) < \infty$ for all $x, y \in X$.

Note that then $\rho > 1$. The assumptions can be relaxed, but they simplify some technicalities without compromising the conceptual spirit.
In Section 1, we set up a rigorous framework and present general classes of examples, including a discussion and many references.
One of our main aims is to study the boundary behaviour of the sequence (0.2) of empirical distributions, or very similarly, of the sequence

$$\frac{1}{\rho^n}\, M_n \in \operatorname{Meas}(X) \tag{0.5}$$

under Assumption (BR). When the state space $X$ is endowed with a suitable compactification $\overline{X} = X \cup \partial X$, a body of interesting work has considered the limit set of the trace (set of visited points) of the branching Markov chain on the boundary $\partial X$. For more details and references, see §1.E. Our goal is to shift the focus and to look, instead of the limit sets, at the random limit boundary measures obtained as the weak* limits of the empirical distributions (0.2), resp., the measures (0.5).
Before we embark on this study, a comparison of (0.2) and (0.5) reveals that we need to understand the behaviour of the following sequence.
Definition 0.6. The population martingale of a branching Markov chain satisfying (BR) is the sequence of random variables (functions on the path space)

$$W_n = \frac{\|M_n\|}{\rho^n}, \qquad n \ge 0 .$$

We call its a.s. pointwise limit $W_\infty$ the limit population ratio.
We emphasise that even though the branching ratio is assumed to be constant, the offspring distributions $\pi_x$ themselves are allowed to be different. What we need here is an extension of the classical theorem of Kesten-Stigum [KS66], which says that for a single offspring distribution $\pi$ the population martingale is uniformly integrable if and only if $\pi$ satisfies the $L \log L$ moment condition.
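The following simulation is a rough illustration of a population martingale with space-dependent offspring laws that share the same mean. Everything specific in it is an assumption made for the example: the state space is $\mathbb{Z}$, the offspring number is 1 or 3 (each with probability 1/2) at even sites and exactly 2 at odd sites, so that the branching ratio is 2 everywhere, and each child moves right with probability 2/3 (a transient base chain). The quantity tracked is the population size divided by $\rho^n$.

```python
import random
from collections import Counter

RHO = 2  # common branching ratio of the hypothetical offspring laws below

def offspring_count(x, rng):
    """Even sites: 1 or 3 children with probability 1/2 each; odd sites: 2."""
    return rng.choice((1, 3)) if x % 2 == 0 else 2

def step(m, rng):
    """One generation: every particle branches, every child moves by +-1."""
    new = Counter()
    for x, multiplicity in m.items():
        for _ in range(multiplicity):
            for _ in range(offspring_count(x, rng)):
                # drift to the right makes the base walk on Z transient
                new[x + (1 if rng.random() < 2 / 3 else -1)] += 1
    return new

def population_ratios(n_steps, seed=0):
    """Sample path of the population size divided by RHO**n, n = 0..n_steps."""
    rng = random.Random(seed)
    m = Counter({0: 1})
    ratios = [1.0]
    for n in range(1, n_steps + 1):
        m = step(m, rng)
        ratios.append(sum(m.values()) / RHO ** n)
    return ratios

W = population_ratios(10)
```

Averaging `population_ratios` over many seeds should give values close to 1 at every time, which is the martingale property; a single path fluctuates around a random limit.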
This issue is dealt with in Section 2. We introduce the uniform $L \log L$ moment condition for the family $(\pi_x)_{x \in X}$. Under this condition, we prove that the population martingale is uniformly integrable (Theorem 2.3) and that $W_\infty$ is almost surely strictly positive for any initial population (Theorem 2.4).
Section 3 is the central one: in order to study boundary convergence of the sequence of empirical distributions (0.2), we first consider stationary spaces for the underlying Markov chain. These are measurable spaces equipped with a $P$-harmonic system of probability measures $\kappa_x$, $x \in X$; see Definition 3.1. We can then study the sequence of random measures $\kappa_{M_n} = \sum_{x \in M_n} \kappa_x$.
A particularly interesting case is the one where we have a compactification $\overline{X}$ of the state space with separable boundary $\partial X = \overline{X} \setminus X$. What we want is its compatibility with the underlying Markov chain $\mathbf{X} = (X_0, X_1, \dots)$ in the sense that $X_n$ converges almost surely to a $\partial X$-valued random variable $X_\infty$ for any starting point $x \in X$. Endowed with the associated limit distributions $\kappa_x$, the boundary is a stationary space. Our main Theorem 3.26 states almost sure weak* convergence of the normalised random measures $\frac{1}{\rho^n} \kappa_{M_n}$ to a positive Borel measure $\kappa_{\mathbf{M}}$ under the uniform $L \log L$ moment condition. Further, assume that the compactification is Dirichlet regular, which means that every continuous function on $\partial X$ has a continuous extension to $\overline{X}$ which is $P$-harmonic in $X$. In this situation, the random measures $\frac{1}{\rho^n} M_n$ themselves converge (weak*) to $\kappa_{\mathbf{M}}$ almost surely. As a consequence, we obtain in Theorem 3.32 that the random probability measures $\frac{1}{\|M_n\|} \kappa_{M_n}$ also converge almost surely. In particular, in the case of Dirichlet regularity, the sequence of empirical distributions (0.2) converges almost surely to a random probability measure on the boundary, which is our primary goal.
In the last parts of Section 3, we review geometric, respectively algebraic, adaptedness conditions of the transition probabilities of the underlying chain $(X_n)$ to a given graph or group structure of the state space, in which case we speak of a random walk. Then we recall a few typical compactifications and explain how Theorems 3.26 and 3.32 apply.
In Section 4, we shift our attention from topological to measure theoretic boundary theory. We start by explaining in some detail the Poisson boundary for a general (i.e., not necessarily group invariant) Markov chain on a countable state space and its relation with the tail boundary. We elucidate the relationship between compact stationary spaces and quotients of the Poisson boundary. Our goal in this section is to establish a link, in the measure-theoretic context, between the boundaries of a branching Markov chain and that of the underlying chain. Theorem 4.17 provides a natural transfer operator from the tail boundary of the latter to that of the former, which is Markov on the respective Banach spaces of essentially bounded functions. Theorem 4.21 clarifies how this operator descends to a compact stationary space for the base chain. The final Theorem 4.22 explains the importance of the above Markov transfer operator: in the topological context of §3, its range is precisely the set of random limits of the sequence of empirical distributions.

Basic notions
1.A. General framework. Here, we set up a general rigorous framework for our main objects. Given our countable state space $X$, we use the notations below for the following spaces.
When talking about integration we use the "pairing notation" $\langle \mu, f \rangle$ to denote the integral of a function $f$ with respect to a measure $\mu$ (a sum in the discrete case).
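One can also treat $\mathcal{M}$ as the subspace of $\operatorname{Meas}(X)$ that comprises all finite non-negative integer valued measures on $X$ (sometimes called occupation measures). Thus, $\|m\|$ is the total mass of $m \in \mathcal{M}$. If $X$ is a group, then $\mathcal{M} = \mathbb{Z}_+[X]$ is precisely the set of all non-negative elements of the group algebra $\mathbb{Z}[X]$ of $X$ over $\mathbb{Z}$ (which is the reason for our notation). In this situation, the map

$$\operatorname{amp} : \mathcal{M} \to \mathbb{Z}_+ , \qquad m \mapsto \|m\| , \tag{1.1}$$

that assigns to any population its amplitude (size) is nothing but a restriction of the corresponding augmentation homomorphism. We use this term for the additive augmentation map $\operatorname{amp}$ in our more general setup as well. Applied to a measure $\Pi \in \operatorname{Meas}(\mathcal{M})$, it gives rise to the image measure $\pi = \operatorname{amp}(\Pi)$ given as in (0.3) (without the $x$ in the index). The barycentre (the first moment) of $\pi$ is denoted by

$$\overline{\pi} = \sum_{k \in \mathbb{Z}_+} k\, \pi(k) .$$

If $\Pi$ (and therefore $\pi$ as well) is a probability measure, then $\pi$ is the size distribution of the populations sampled from $\Pi$, and $\overline{\pi}$ is their average size. We use the same notation for the barycentre map on the space $\operatorname{Meas}(\mathcal{M})$ obtained by linear extension of the mapping $\delta_m \mapsto m$:

$$\Pi \mapsto \overline{\Pi} = \sum_{m \in \mathcal{M}} \Pi(m)\, m \in \operatorname{Meas}(X) . \tag{1.3}$$

The horizontal arrows in the following commutative diagram represent the barycentre maps from $\operatorname{Meas}(\mathcal{M})$ and $\operatorname{Meas}(\mathbb{Z}_+)$ to $\operatorname{Meas}(X)$ and $\mathbb{R}$, respectively:

$$\begin{array}{ccc}
\operatorname{Meas}(\mathcal{M}) & \longrightarrow & \operatorname{Meas}(X) \\
\downarrow {\scriptstyle \operatorname{amp}} & & \downarrow {\scriptstyle \|\cdot\|} \\
\operatorname{Meas}(\mathbb{Z}_+) & \longrightarrow & \mathbb{R}
\end{array}$$

In particular, the total mass $\|\overline{\Pi}\| = \overline{\pi}$ of the barycentre measure $\overline{\Pi}$ is precisely the average size of the populations sampled from $\Pi$. If $\overline{\pi} = \|\overline{\Pi}\| < \infty$ (for instance, if $\Pi$ is finitely supported), then the normalisation of $\overline{\Pi}$ produces the displacement distribution

$$\overline{\Pi} / \overline{\pi} \in \operatorname{Prob}(X) . \tag{1.5}$$

Since the population space $\mathcal{M}$ is an additive semigroup contained in the commutative group $\mathbb{Z}[X]$, one can define in the usual way the convolution of two measures on $\mathcal{M}$:

$$\Pi * \Pi' (m) = \sum_{m_1 + m_2 = m} \Pi(m_1)\, \Pi'(m_2) . \tag{1.6}$$

If both arguments are probability measures, then $\Pi * \Pi'$ is the distribution of the sum $M + M'$, where the random summands are independently sampled from the respective distributions $\Pi$ and $\Pi'$. Clearly,

1.B. Implementation for branching Markov chains. For branching Markov chains, we use the notation of §1.A for the various objects associated with the branching distributions $\Pi_x$ by adding the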
subscript $x$. As anticipated in the Introduction, $\pi_x$ is the offspring distribution at $x$. The displacement distribution (1.5) associated with the probability measure $\Pi_x$ is the transition kernel $p_x$ of (0.4). As follows from Definition 0.1 and the definition of the convolution operation (1.6), the transition probabilities of the branching Markov chain $\mathbf{M} = (M_0, M_1, \dots)$ are the convolutions

$$\Pi_m = \mathop{*}_{x \in m} \Pi_x , \tag{1.8}$$

where we treat the populations $m$ as multisets, so that each point from the support of $m$ is taken with its multiplicity. That is, the probability of the move $m \rightsquigarrow m'$ is $\Pi_m(m')$. We denote by $\mathbf{P}_\Theta$ the probability measure on the space $\mathcal{M}^{\mathbb{Z}_+}$ of sample paths of $\mathbf{M}$ corresponding to the initial distribution $\Theta \in \operatorname{Prob}(\mathcal{M})$. We use the notation $\mathbf{P}_m = \mathbf{P}_{\delta_m}$ for the initial distribution $\Theta = \delta_m$ concentrated at a single population $m \in \mathcal{M}$, and $\mathbf{P}_x = \mathbf{P}_{\delta_x}$ if $m = \delta_x$ is the singleton at a point $x \in X$. The respective expectations are denoted by $\mathbf{E}_\Theta$, $\mathbf{E}_m$, $\mathbf{E}_x$. All these measures on the path space are absolutely continuous with respect to the common initial full support class of the measures $\mathbf{P}_\Theta$ corresponding to the initial distributions $\Theta$ with $\operatorname{supp} \Theta = \mathcal{M}$.

(FS) It is to the initial full support measure class that we refer when we use the expression "almost everywhere" without specifying a measure on the path space.

The transition operator of the branching random walk is

$$\mathcal{P} F(m) = \sum_{m' \in \mathcal{M}} \Pi_m(m')\, F(m') . \tag{1.9}$$

It is well-defined not only on the space $\operatorname{Fun}(\mathcal{M})$ of bounded functions on $\mathcal{M}$, but also for all non-negative functions (allowed to take the value $+\infty$). Following the standard probabilistic convention we use the postfix notation for the action of the dual operator on the space $\operatorname{Meas}(\mathcal{M})$ of positive measures on $\mathcal{M}$, so that $\Theta \mathcal{P}$ is the time 1 marginal distribution of the measure $\mathbf{P}_\Theta$. For the underlying Markov chain $\mathbf{X} = (X_0, X_1, \dots)$ with transition probabilities given by (0.4), resp. (1.5), we denote in the same way as above the measures on the space $X^{\mathbb{Z}_+}$ of sample paths by $\mathsf{P}_\theta$ (or $\mathsf{P}_x = \mathsf{P}_{\delta_x}$, if the initial distribution $\theta$ is concentrated at a single point $x \in X$), the respective expectations by $\mathsf{E}_\theta$, $\mathsf{E}_x$, and the transition operator by

$$P f(x) = \sum_{y \in X} p_x(y)\, f(y) . \tag{1.10}$$

1.C. The lifting operator. For a function $f$ on $X$, we denote by

$$\widetilde{f}(m) = \langle m, f \rangle = \sum_{x \in m} f(x) \tag{1.11}$$

its lift to the space of populations $\mathcal{M}$. In particular,

$$\widetilde{\mathbf{1}}(m) = \|m\| \tag{1.12}$$

for the function $\mathbf{1}(x) \equiv 1$ on $X$. The lifting operator is dual to the barycentre map (1.3), i.e.,

$$\langle \Theta, \widetilde{f}\, \rangle = \langle \overline{\Theta}, f \rangle . \tag{1.13}$$

Therefore, the barycentre map can be written in the postfix notation as

Proposition 1.14. The transition operators $\mathcal{P}$ (1.9) and $P$ (1.10) of the branching Markov chain and of the underlying chain, respectively, satisfy the commutation relation

$$\mathcal{P} \widetilde{f} = \widetilde{\overline{\pi}\, P f} ,$$

where $\overline{\pi}$ denotes the operator of multiplication by the branching ratio function $\overline{\pi} : x \mapsto \overline{\pi}_x$ (see subsection 1.A). In other words,

$$\mathcal{P} \widetilde{f}(m) = \sum_{x \in m} \overline{\pi}_x\, P f(x) \qquad \text{and} \qquad \overline{\Theta \mathcal{P}} = \overline{\Theta}\, \overline{\pi} P .$$

Proof. It is more convenient to prove the commutation relation for the dual operators acting on measures. By linearity it is enough to consider the situation when $\Theta = \delta_m$ is the delta measure at a population $m \in \mathcal{M}$:

$$\overline{\delta_m \mathcal{P}} = \overline{\Pi_m} = \sum_{x \in m} \overline{\Pi}_x = \sum_{x \in m} \overline{\pi}_x\, p_x = \overline{\delta_m}\, \overline{\pi} P .$$

We recall that a function $f$ is called harmonic with respect to a transition operator $P$ if $P f = f$, and it is called $\lambda$-harmonic for an eigenvalue $\lambda \in \mathbb{R}$ if $P f = \lambda f$.

Corollary 1.16. If the branching ratio $\overline{\pi}_x \equiv \rho$ is constant, then for any $\lambda$-harmonic function $f$ of the underlying chain its lift to the population space $\mathcal{M}$ is $\lambda\rho$-harmonic for the branching Markov chain. This property will play a key role in the rest of the paper.
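As a sanity check of the commutation relation of Proposition 1.14 at a singleton, one can compare both sides exactly for a finitely supported branching distribution. The sketch below reads the relation as "the expected lift of $f$ over the offspring of a particle at $x$ equals the branching ratio at $x$ times $Pf(x)$"; this reading, the encoding of populations as tuples of `(site, multiplicity)` pairs, the distribution `Pi_0` and the test function `f` are all assumptions of the example.

```python
from fractions import Fraction

def lift(f, population):
    """Lift of f to a population: sum of f over particles, with multiplicity."""
    return sum(count * f(y) for y, count in population)

def expected_lift(Pi_x, f):
    """Left-hand side: sum over m of Pi_x(m) * lift(f, m)."""
    return sum(prob * lift(f, pop) for pop, prob in Pi_x.items())

def ratio_times_Pf(Pi_x, f):
    """Right-hand side: branching ratio at x times (P f)(x)."""
    bary = {}
    for pop, prob in Pi_x.items():
        for y, count in pop:
            bary[y] = bary.get(y, Fraction(0)) + prob * count
    ratio = sum(bary.values())
    Pf = sum((v / ratio) * f(y) for y, v in bary.items())  # p_x(y) = pi_{x,y} / ratio
    return ratio * Pf

# Hypothetical branching distribution at x = 0 and an arbitrary test function.
Pi_0 = {((1, 1),): Fraction(1, 2), ((-1, 1), (1, 2)): Fraction(1, 2)}
f = lambda y: 3 * y + 5

lhs = expected_lift(Pi_0, f)
rhs = ratio_times_Pf(Pi_0, f)
```

Both sides come out equal (here to 13), as the linearity argument in the proof predicts; for a general population the identity follows by summing over its particles.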

1.D. Examples of branching Markov chains.
Example 1.17. If all branching distributions $\Pi_x$ are concentrated on one-point configurations (i.e., all offspring distributions $\pi_x$ are just $\delta_1$, and $\overline{\pi}_x \equiv 1$), then the barycentres $\overline{\Pi}_x$ are probability measures, so that in this situation the branching Markov chain consists in running independent sample paths of the underlying Markov chain issued from each particle of the initial population.
Example 1.18. If the state space $X$ is a singleton, then the size is the only parameter that describes populations on $X$, and a branching Markov chain over $X$ is determined just by a single offspring distribution $\Pi \cong \pi = \operatorname{amp}(\Pi)$ on $\mathcal{M} \cong \mathbb{Z}_+$. Therefore, it is nothing but the usual Galton-Watson branching process determined by $\pi$.
Example 1.19. For a probability measure $\mu$ on $X$, we denote by $\mu^k \in \operatorname{Prob}(\mathcal{M})$ the image of the product measure $\mu^{\otimes k}$ on $X^k$ under the map $(x_1, \dots, x_k) \mapsto \delta_{x_1} + \dots + \delta_{x_k}$. Given a distribution $\pi \in \operatorname{Prob}(\mathbb{Z}_+)$ and a Markov chain on $X$ with the transition probabilities $p_x \in \operatorname{Prob}(X)$, the family of branching distributions where $\Pi^k_x$ are probability measures on size $k$ populations. For instance, if we let $m^k_y = k \cdot \delta_y$ be the population with $k$ particles at $y$ and none elsewhere, then we can consider the branching distributions (1.22). The corresponding transition distributions are again $p_x$. In the branching Markov chains determined by both (1.20) and (1.22), one first samples a Galton-Watson tree with the offspring distribution $\pi$ and then equips this tree with the transitions sampled from the appropriate transition distributions $p_x$. However, in Example 1.19 the independently sampled transitions are parameterised by the edges of the tree, whereas for the chain determined by (1.22) they are parameterised by the vertices of the tree (so that the transition is the same for all edges issued from the same vertex in the direction away from the root). It might be interesting to look at the branching Markov chains determined by convex combinations of the measures (1.20) and (1.22).
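The difference between transitions parameterised by edges and by vertices can be sketched as follows. The two samplers below are illustrative assumptions: both draw a number $k$ of children from a hypothetical offspring law, but the first displaces every child independently, while the second draws one displacement and places all $k$ children at the same site (the population $k \cdot \delta_y$).

```python
import random
from collections import Counter

def sample_edge_version(x, offspring, kernel, rng):
    """k children, each displaced independently (one draw per tree edge)."""
    k = offspring(rng)
    return Counter(kernel(x, rng) for _ in range(k))

def sample_vertex_version(x, offspring, kernel, rng):
    """k children sharing one displacement (one draw per tree vertex)."""
    k = offspring(rng)
    return Counter({kernel(x, rng): k})

# Hypothetical ingredients: offspring number uniform on {1, 2, 3},
# base chain = simple random walk on Z.
offspring = lambda rng: rng.choice((1, 2, 3))
kernel = lambda x, rng: x + rng.choice((-1, 1))

rng = random.Random(1)
edge_samples = [sample_edge_version(0, offspring, kernel, rng) for _ in range(100)]
vertex_samples = [sample_vertex_version(0, offspring, kernel, rng) for _ in range(100)]
```

In both versions the expected number of children placed at each neighbour is the same, so the base chain is identical; only the joint law of the children's positions differs.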
Example 1.23. One can also consider a more general situation than in Example 1.19 with the offspring distributions $\pi_x$ being space dependent, although the displacement is still governed by the transition probabilities $p_x$ of an underlying Markov chain on $X$ (e.g., see Menshikov-Volkov [MV97] and Gantert-Müller [GM06]). In this case the resulting branching Markov chain is determined by the branching distributions In the context of this example, the dependence of $\pi_x$ on $x$ is often referred to as an environment; if it is random, then one talks about branching Markov chains in random environment, see Comets-Menshikov-Popov [CMP98]. The term "environment" is also used to describe the generalisation of the Galton-Watson process that allows the offspring distribution to depend on the generation number, see e.g. Athreya-Ney [AN72, Section VI.5]. By passing to the space-time process (see Section 4.B) the latter model becomes a particular case of the former one.
Example 1.24. If $X$ is a group, then it makes sense to consider the assignments $x \mapsto \Pi_x$ equivariant with respect to the natural action of $X$ on the population space $\mathcal{M} = \mathbb{Z}_+[X]$ by translations, i.e., such that all branching distributions $\Pi_x$ are the translates of a single probability measure $\Pi \in \operatorname{Prob}(\mathcal{M})$ (the branching distribution at the group identity). By analogy with ordinary random walks on groups, we then talk about branching random walks. In particular, in this case the offspring distributions $\pi_x$ all coincide with the size distribution $\pi$ of the measure $\Pi$, the branching ratios (offspring averages) $\overline{\pi}_x$ all coincide with $\overline{\pi}$, and the transition probabilities $p_x$ are the translates of the displacement distribution $\mu = \overline{\Pi} / \overline{\pi}$, the law of the random walk on the group: $p_x(y) = \mu(x^{-1} y)$. In the same vein one can also consider the situation when $X$ is endowed with a group action (transitive, quasi-transitive, or a more general one), and the map $x \mapsto \Pi_x$ is equivariant with respect to this action (cf. Kaimanovich-Woess [KW02] and Subsection 3.D).
1.E. Limit sets vs. limit measures. Before plunging in medias res, we outline the earlier approach to the boundary behaviour of branching Markov chains which served as our motivation. For the branching Markov chain $\mathbf{M} = (M_n)$, its trace $\operatorname{supp} \mathbf{M}$ is the random set of all points of the state space which are charged (or visited) by at least one of the populations $M_n$. Assuming that the state space $X$ is endowed with a compactification $\overline{X} = X \cup \partial X$ (see Section 3.E below for definitions and examples), one can then define, in the usual way, the limit set of a sample path as the boundary of its trace with respect to this compactification:

$$\Lambda(\mathbf{M}) = \overline{\operatorname{supp} \mathbf{M}} \cap \partial X .$$

Notions of recurrence and transience for branching Markov chains with independent branching and displacement have been studied by Benjamini-Peres [BP94], Müller [Mü08], Bertacchi-Zucca [BZ08] (in continuous time) and Benjamini-Müller [BM12]; see also Woess [Woe09, §5.C] for a simplified approach. If the branching Markov chain is recurrent in the sense that $\operatorname{supp} \mathbf{M} = X$ for almost all sample paths, then obviously the limit set $\Lambda(\mathbf{M})$ coincides almost surely with the whole boundary $\partial X$. Otherwise, proper traces $\operatorname{supp} \mathbf{M} \neq X$ may lead to proper limit sets $\Lambda(\mathbf{M}) \neq \partial X$, and it makes sense to look at their properties.
Regarding non-trivial limit sets, see the references given at the beginning of the Introduction. Note that in those papers only branching random walks with independent branching and displacement (as described in Example 1.19) were considered.
Free groups have served as the "true touchstone" in the non-commutative random walk theory for the last 60 years, so let us describe the situation with them in more detail (for instance, see Ledrappier [Led01] and the references therein for more background). Let $A$ be a finite alphabet of cardinality $d \ge 2$, and let $F$ be the free group of rank $d$ generated by $A$. We fix a symmetric probability measure $\mu$ with support $A \cup A^{-1}$; the simplest case is when $\mu$ is equidistributed on $A \cup A^{-1}$, so that the random walk $(F, \mu)$ is just the simple random walk on the homogeneous Cayley tree of the free group. Further, let $\pi$ be the geometric distribution on $\mathbb{N}$ with parameter $p \in (0, 1)$ and mean $\rho = 1/p$. We can now consider the branching random walk with independent branching and displacement determined by the underlying random walk $(F, \mu)$ and the offspring distribution $\pi$.
If $\rho r > 1$, where $r = r(F, \mu)$ is the spectral radius (3.39) of the random walk $(F, \mu)$, then almost surely $\operatorname{supp} \mathbf{M} = F$ and $\Lambda(\mathbf{M}) = \partial F$. In the case $\rho r \le 1$, Hueter and Lalley, extending the above cited result by Liggett related to simple random walk, proved that the Hausdorff dimension $\operatorname{HD} \Lambda(\mathbf{M})$ of the limit set with respect to a natural metric on $\partial F$ is almost surely constant and obtained an explicit formula for it [HL00, Theorem 1]. In particular, it satisfies the inequality

$$\operatorname{HD} \Lambda(\mathbf{M}) \le \tfrac{1}{2} \operatorname{HD} \partial F ,$$

and $\operatorname{HD} \Lambda(\mathbf{M}) \to 0$ as $\rho \to 1$ from above. This result was extended to branching random walks on free products of finitely generated groups under less restrictive conditions by Candellero-Gilch-Müller [CGM12, Theorems 3.5 and 3.10], and very recently to branching random walks on hyperbolic groups by Sidoravicius-Wang-Xiang [SWX20].
As outlined in the Introduction, our goal here is different; we are interested in random limit boundary measures arising from the sequences (0.2), resp. (0.5). Unlike with the limit sets, the very existence of the limit measures is a non-trivial problem. In many cases there is a phase (regarding the branching ratio $\rho$) where the branching Markov chain is strongly recurrent in the sense that with probability 1 each state $x \in X$ is visited by the population infinitely often; see the references of the present subsection. Nevertheless, the empirical distributions always move their mass to infinity, as the following lemma shows, providing a simple motivation for our goals.

Proof. It suffices to prove this for the situation when the branching chain starts with one particle at a generic $x \in X$. In view of (0.4), $\mathbf{E}_x \bigl( M_n(y) \bigr) = \rho^n p^{(n)}(x, y)$, so that by transience $\sum_n \mathbf{E}_x \bigl( M_n(y) / \rho^n \bigr) = G(x, y) < \infty$. Therefore $M_n(y) / \rho^n \to 0$ and thus also $\overset{\frown}{M}_n(y) \to 0$ almost surely under $\mathbf{P}_x$.

Uniform integrability and positivity of the population martingale
2.A. The population martingale. Recall that we assume (BR): the offspring averages satisfy $\overline{\pi}_x = \rho < \infty$ for all $x \in X$. In terms of the augmentation map (1.1), the barycentre map (1.3), and the transition operator $\mathcal{P}$ (1.9), this condition means that

$$\overline{\operatorname{amp}(\Theta \mathcal{P})} = \rho \cdot \overline{\operatorname{amp}(\Theta)} \qquad \text{for all } \Theta \in \operatorname{Prob}(\mathcal{M}) .$$

In other words, after one step of the branching Markov chain the average size of populations is always multiplied by the same constant $\rho$. This is the case, for instance, for the branching Markov chains from Examples 1.19, 1.21, and 1.24; in the setup of Example 1.23, condition (BR) was used by Gantert-Müller [GM06, Section 3.1]. Recall Definition 0.6 of the population martingale. The sequence $(W_n)$ is indeed a martingale with respect to the increasing coordinate filtration on the path space, because by Corollary 1.16 condition (BR) implies that the lift $\widetilde{\mathbf{1}}(m) = \|m\|$ (1.12) of the constant function $\mathbf{1}$ from $X$ to $\mathcal{M}$ is $\rho$-harmonic; see Section 4.D below for a more general discussion.
A priori the expectation of the limit population ratio may be strictly smaller than the expectations of the population martingale with respect to the measure $\mathbf{P}_\Theta$ on the path space corresponding to an initial distribution $\Theta \in \operatorname{Prob}(\mathcal{M})$. Their equality means that the population martingale is uniformly integrable on the path space $(\mathcal{M}^{\mathbb{Z}_+}, \mathbf{P}_\Theta)$ (e.g., see Meyer [Mey66, Chapter V] for the basics of martingale theory). When talking about uniform integrability without specifying a measure on the path space, we mean that it holds for any initial distribution $\Theta \in \operatorname{Prob}(\mathcal{M})$, i.e., with respect to the full initial support measure class (FS).
In order to guarantee this property it is enough to take for $\Theta$ just the delta measures concentrated at singletons $\delta_x$, $x \in X$, i.e., to require that $\mathbf{E}_x W_\infty = 1$ for all $x \in X$. For the ordinary Galton-Watson processes (Example 1.18) the equivalence of the uniform integrability of the population martingale to the $L \log L$ moment condition on the offspring distribution $\pi$ is the classical theorem of Kesten-Stigum [KS66] (see also Lyons-Pemantle-Peres [LPP95] and the references therein). Although this criterion is directly applicable to the situation when the offspring distributions $\pi_x$ are the same for all $x \in X$, in particular, to branching random walks on groups (see Example 1.24), this is not the case for general branching Markov chains.
In order to formulate an analogous result in the general setup we need tightness of the offspring distributions, as follows.
Definition 2.2. Given two probability distributions $\pi$ and $\pi'$ on $\mathbb{Z}_+$, we say that $\pi$ dominates $\pi'$ (notation: $\pi \succcurlyeq \pi'$) if $\pi[k, \infty) \ge \pi'[k, \infty)$ for all $k \in \mathbb{Z}_+$. A family of probability measures on $\mathbb{Z}_+$ satisfies the uniform first moment condition (resp., the uniform $L \log L$ moment condition) if it is dominated by a probability measure with a finite first moment (resp., by a measure that satisfies the $L \log L$ moment condition).
Theorem 2.3.If the offspring distributions of a branching Markov chain satisfy the uniform L log L moment condition, then the population martingale is uniformly integrable.
It is known since Levinson [Lev59, Section 4] that for the ordinary Galton-Watson processes the $L \log L$ condition implies that the limit population ratio is almost surely strictly positive on non-extinction. A consequence of the Kesten-Stigum theorem is the equivalence (on non-extinction) of the following two conditions: (i) the population martingale $(W_n)$ is uniformly integrable; (ii) the limit population ratio $W_\infty$ is almost surely strictly positive.
However, for branching processes in varying environment it may well happen that the limit population ratio vanishes with positive probability in spite of the uniform integrability of the population martingale (see the example constructed in MacPhee-Schuh [MS83] and the discussion in D'Souza-Biggins [DB92, p. 41]). We do not know whether in our setup the uniform integrability of the population martingale would always imply that the limit population ratio is almost surely positive. Still, we can show that this is the case under the same uniform $L \log L$ condition as in Theorem 2.3.
Theorem 2.4.If the offspring distributions of a branching Markov chain satisfy the uniform L log L moment condition, then the limit population ratio is almost surely strictly positive for any initial population.
Our proofs of Theorem 2.3 and Theorem 2.4 below are self-contained and follow the approach of D'Souza-Biggins [DB92] to the Galton-Watson processes in varying environment. Theorem 2.3 can also be deduced from the general criterion of uniform integrability of the martingales of multi-type branching processes (that is, branching Markov chains in our terminology) associated with "mean-harmonic functions" due to Biggins-Kyprianou [BK04, Theorem 1.1 and the discussion on p. 547]; cf. Remark 4.16 below.

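2.C. Laplace transforms and their remainders. We denote by

$$L_\theta(z) = \sum_{k \in \mathbb{Z}_+} \theta(k)\, z^k$$

the Laplace transform of a probability measure $\theta \in \operatorname{Prob}(\mathbb{Z}_+)$ (which can be thought of as the distribution of a $\mathbb{Z}_+$-valued random variable). The linear part of the power series expansion of $L_\theta(e^{-s})$ is equal to $1 - \overline{\theta} s$, where $\overline{\theta}$ is the expectation of $\theta$ (assumed to be finite), and we denote the arising remainder by

$$R_\theta(s) = L_\theta(e^{-s}) - 1 + \overline{\theta} s = \sum_{k \in \mathbb{Z}_+} \theta(k)\, \psi(ks) , \tag{2.5}$$

where

$$\psi(t) = e^{-t} - 1 + t . \tag{2.6}$$

We also use the above notation with the subscript $\Theta$ in the situation when $\theta = \operatorname{amp}(\Theta)$ is the image of a measure $\Theta \in \operatorname{Prob}(\mathcal{M})$ under the augmentation map $\operatorname{amp}$ (1.1), so that $L_\Theta = L_{\operatorname{amp}(\Theta)}$ and $R_\Theta = R_{\operatorname{amp}(\Theta)}$.

Lemma 2.7. For any measure $\theta \in \operatorname{Prob}(\mathbb{Z}_+)$ with a finite first moment
(i) the function $R_\theta$ is non-decreasing on the positive ray $\mathbb{R}_+$;
(ii) the ratio $R_\theta(s)/s$ is non-decreasing on $\mathbb{R}_+$;
(iii) the integral $\int_0^C \frac{R_\theta(s)}{s^2}\, ds$ is convergent for any $C > 0$ if and only if the measure $\theta$ satisfies the $L \log L$ moment condition (2.1).
Further, if a measure $\theta \in \operatorname{Prob}(\mathbb{Z}_+)$ with a finite first moment dominates another measure $\theta'$, then
(iv) $R_{\theta'}(s) \le R_\theta(s)$ for all $s \ge 0$.

Proof. (i) and (ii) immediately follow from the same properties of the functions $\psi$ (2.6) and $s \mapsto \psi(s)/s$, respectively, whereas (iv) is a consequence of (i). Property (iii) is well-known, e.g., see Athreya-Ney [AN72, Lemma I.10.1]. Since our setup is somewhat different, for the sake of completeness we include its elementary proof. The function $R_\theta$ being non-negative, by exchanging the order of summation and integration one arrives at

$$\int_0^C \frac{R_\theta(s)}{s^2}\, ds = \sum_{k \in \mathbb{Z}_+} \theta(k)\, I_k , \qquad \text{where} \quad I_k = \int_0^C \frac{\psi(ks)}{s^2}\, ds = k \int_0^{kC} \frac{\psi(s)}{s^2}\, ds .$$

Since $\psi(s)/s \to 1$ as $s \to \infty$, the latter integral asymptotically behaves as $k \log(kC)$, whence the claim.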
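The monotonicity statements (i) and (ii) of Lemma 2.7 are easy to probe numerically. The sketch below assumes the remainder is the difference between $L_\theta(e^{-s})$ and its linear part $1 - \overline{\theta} s$, written as a sum of $\theta(k)\,\psi(ks)$ with $\psi(t) = e^{-t} - 1 + t$; the offspring law `theta` is a hypothetical example, and the check is on a finite grid only, so it illustrates rather than proves the lemma.

```python
import math

def psi(t):
    """psi(t) = exp(-t) - 1 + t, evaluated stably for small t via expm1."""
    return math.expm1(-t) + t

def remainder(theta, s):
    """R_theta(s) = sum over k of theta(k) * psi(k * s), theta = {k: prob}."""
    return sum(p * psi(k * s) for k, p in theta.items())

theta = {1: 0.5, 3: 0.5}                      # hypothetical offspring law
mean = sum(k * p for k, p in theta.items())   # its expectation (here 2.0)

grid = [0.01 * i for i in range(1, 201)]      # s ranging over (0, 2]
R = [remainder(theta, s) for s in grid]
ratios = [r / s for r, s in zip(R, grid)]

# Cross-check against the definition via the Laplace transform at s = 1.
laplace_at_1 = sum(p * math.exp(-k) for k, p in theta.items())
direct = laplace_at_1 - 1 + mean * 1.0
```

On this grid both `R` and `ratios` are increasing, and `direct` agrees with `remainder(theta, 1.0)` up to rounding.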
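Lemma 2.9. If the offspring distributions $\pi_x$ of a branching Markov chain satisfy the uniform first moment condition, then there exists $s_0 > 0$ such that for any measure $\Theta \in \operatorname{Prob}(\mathcal{M})$ with a finite first moment and any $0 \le s \le s_0$

$$L_{\Theta \mathcal{P}}(e^{-s}) \le L_\Theta(e^{-\rho s}) + \overline{\theta}\, R(s) , \tag{2.10}$$

where $\rho$ is the common branching ratio from condition (BR), $\overline{\theta}$ is the expectation of the measure $\theta = \operatorname{amp}(\Theta)$, and $R = R_\pi$ is the remainder function (2.5) associated with the measure $\pi \in \operatorname{Prob}(\mathbb{Z}_+)$ that dominates the distributions $\pi_x$.
Proof. To begin with, let $\Theta$ be the delta measure at the singleton $\delta_x \in \mathcal{M}$, $x \in X$. Then $\Theta \mathcal{P} = \Pi_x$, see Definition 0.1, whence $\operatorname{amp}(\Theta \mathcal{P}) = \pi_x$. We recall that $\overline{\pi}_x = \rho$ for all $x \in X$ by our standing assumption (BR). Therefore, by Lemma 2.7(iv) for any $s \ge 0$

$$L_{\pi_x}(e^{-s}) = 1 - \rho s + R_{\pi_x}(s) \le 1 - \rho s + R(s) . \tag{2.11}$$

Now, let $\Theta = \delta_m$ for $m \in \mathcal{M}$, so that $\Theta \mathcal{P} = \Pi_m$. Then by (1.8) and (2.11),

$$L_{\Theta \mathcal{P}}(e^{-s}) = \prod_{x \in m} L_{\pi_x}(e^{-s}) \le \bigl( 1 - \rho s + R(s) \bigr)^{\|m\|}$$

(counting as always multiplicities in the product). By Lemma 2.7(ii) we can choose $s_0 > 0$ in such a way that $0 \le 1 - \rho s \le 1 - \rho s + R(s) \le 1$ for all $0 \le s \le s_0$. Since the derivative of the function $t \mapsto t^{\|m\|}$ on the interval $[0, 1]$ does not exceed $\|m\|$, we then have

$$\bigl( 1 - \rho s + R(s) \bigr)^{\|m\|} \le (1 - \rho s)^{\|m\|} + \|m\|\, R(s) \le e^{-\rho s \|m\|} + \|m\|\, R(s) ,$$

and therefore (2.10) is satisfied, because $\theta = \operatorname{amp}(\Theta) = \delta_{\|m\|}$, so that $L_\Theta(z) = z^{\|m\|}$ and $\overline{\theta} = \|m\|$. Finally, the general case follows from the linearity of both sides of (2.10) with respect to $\Theta$.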

2.D. Proof of Theorem 2.3. We denote by
the one-dimensional distributions of the associated measure P P P x on the space of sample paths of the branching Markov chain M = (M 0 , M 1 , . . . Condition (BR) implies that amp(Θ n ) = ρ n , whence by Lemma 2.9 for s ≤ s 0 and by telescoping (2.12) Thus, (2.13) denote the probability that the limit population ratio of the branching random walk issued from an initial population m ∈ M vanishes.Somewhat abusing notation, we also put ω(x) = ω(δ x ) if the initial population is the singleton δ x at a point x ∈ X.
The function $\omega$ on $M$ is $P$-harmonic, its values are sandwiched between 0 and 1, and
$$\omega(m + m') = \omega(m)\,\omega(m') \qquad \text{for all } m, m' \in M.$$
This implies that the function $\omega$ is determined by its values on singletons as
$$\omega(m) = \prod_{x \in m} \omega(x),$$
where, as always, each point from the support of $m$ is taken with its multiplicity. By inequality (2.12) from the proof of Theorem 2.3, $\omega(x)$ admits an upper bound in terms of the integral of $R(\sigma)/\sigma^2\, d\sigma$, with the right-hand side of this inequality being strictly less than 1 for all sufficiently small $s$ in view of Lemma 2.7(iii). Therefore, the function $\omega$ (2.13) is bounded away from 1, i.e., there exists $c < 1$ such that
$$\omega(x) \le c \qquad \text{for all } x \in X.$$
The fact that the offspring distributions $\pi_x$ satisfy the uniform first moment condition, whereas their expectations are equal to $\rho > 1$, implies that the probabilities $\pi_x[2, \infty)$ are bounded away from 0. Therefore, at each step of the branching Markov chain the size of the population increases with a probability bounded away from 0, so that $\|M_n\| \to \infty$ almost surely. Thus, $\omega(M_n) \le c^{\|M_n\|} \to 0$. We have already mentioned that the function $\omega$ is $P$-harmonic, whence $\omega \equiv 0$.
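The behaviour of the population martingale can be explored by simulation. The Python sketch below ignores the spatial component altogether and follows only the population sizes of a single-type Galton–Watson process, assuming that the population martingale is $W_n = \|M_n\|/\rho^n$; the function name `population_martingale` and the offspring law are hypothetical and serve only as an illustration.

```python
import random

def population_martingale(offspring, generations, rng):
    """Return [W_0, ..., W_n] with W_n = Z_n / rho**n for a single-type
    Galton-Watson process started from one particle."""
    sizes, weights = zip(*sorted(offspring.items()))
    rho = sum(k * w for k, w in offspring.items())  # mean offspring number
    z, path = 1, [1.0]
    for n in range(1, generations + 1):
        # every particle of the current generation reproduces independently
        z = sum(rng.choices(sizes, weights=weights, k=z)) if z > 0 else 0
        path.append(z / rho ** n)
    return path

rng = random.Random(0)
# hypothetical offspring law with mean rho = 1.5 and bounded support
print(population_martingale({0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}, 12, rng))
```

In this example the offspring law allows extinction, so some runs end at 0; an offspring law concentrated on $\{1, 2, \dots\}$ gives paths which stay positive, in line with the argument above.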
3. Topological convergence of populations

3.A. Harmonic systems of measures and stationary spaces.
In this and the next subsection, we set up the needed background on boundary behaviour for transient Markov chains, to be applied to the base Markov chain of our branching chain and subsequently to the branching Markov chain itself.
The action of the transition operator of a countable state space Markov chain (see Section 1.C) naturally extends from the "ordinary" real valued functions to the ones taking values in an arbitrary affine space (provided infinite convex combinations are well-defined; this is needed if not all transition probabilities are finitely supported), in particular, to measure valued functions. By $\mathrm{Prob}(K)$ we denote the space of probability measures on a measurable space $K$, and in the same way as for real functions we can formulate the following.

Definition 3.1. A map
$$\kappa : X \to \mathrm{Prob}(K), \qquad x \mapsto \kappa_x, \qquad (3.2)$$
in other words, a system $(\kappa_x)$ of probability measures on $K$ indexed by a countable space $X$, is harmonic with respect to a Markov operator $P$ on $X$ if $P\kappa = \kappa$, i.e., if $\kappa$ satisfies the mean value property
$$\kappa_x = \sum_{y \in X} p_x(y)\, \kappa_y \qquad \text{for all } x \in X, \qquad (3.3)$$
where $p_x$ are the transition probabilities of the operator $P$. One also uses the term stationary (or, $P$-stationary), cf. Remark 3.4 and Example 3.12. We shall refer to the couple $(K, \kappa)$ as a measurable $P$-stationary space. In the situation when $K$ is a topological space endowed with the Borel sigma-algebra, we call it a topological $P$-stationary space.
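As an illustration of the mean value property, consider a toy example which is not taken from the text: a nearest neighbour chain on $\mathbb{Z}$ that steps away from the origin with probability $p > 1/2$ and towards it with probability $q = 1 - p$ (and is symmetric at 0). It is transient and escapes either to $-\infty$ or to $+\infty$, so one may take the two-point space $K = \{-\infty, +\infty\}$. The Python sketch below checks, in exact rational arithmetic, that the candidate system `kappa` (an assumption of this example) satisfies the mean value property on a finite window.

```python
from fractions import Fraction

p, q = Fraction(2, 3), Fraction(1, 3)   # hypothetical outward / inward step probabilities
r = q / p

def transitions(x):
    # chain on Z: symmetric at 0, drift away from the origin elsewhere
    if x == 0:
        return {1: Fraction(1, 2), -1: Fraction(1, 2)}
    sign = 1 if x > 0 else -1
    return {x + sign: p, x - sign: q}

def kappa(x):
    # candidate measure kappa_x, stored as (mass at -infinity, mass at +infinity)
    plus = 1 - r ** x / 2 if x >= 0 else r ** (-x) / 2
    return (1 - plus, plus)

def averaged(x):
    # right-hand side of the mean value property: sum_y p_x(y) * kappa_y
    return tuple(sum(w * kappa(y)[i] for y, w in transitions(x).items())
                 for i in range(2))

harmonic = all(kappa(x) == averaged(x) for x in range(-30, 31))
print(harmonic)
```

Exact fractions are used so that the identity can be tested with `==` rather than up to a floating point tolerance.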
As follows from our irreducibility assumption (TC ), all measures κ x in a harmonic system κ = (κ x ) are pairwise equivalent.Therefore one can talk about their common measure class and we use the notation L ∞ (K, κ) = L ∞ (K) for the corresponding Banach space of essentially bounded measurable functions.
Remark 3.4. Given a map $\kappa$ (3.2), we use the same notation for its extension to measures $\theta \in \mathrm{Meas}(X)$,
$$\theta \mapsto \kappa_\theta = \sum_{x \in X} \theta(x)\, \kappa_x. \qquad (3.5)$$
Then the $P$-harmonicity of a map $\kappa$ is equivalent to its invariance with respect to the action of the operator $P$ on $\mathrm{Meas}(X)$, that is, $\kappa_\theta = \kappa_{\theta P}$ for all $\theta \in \mathrm{Meas}(X)$. The dual statement is that (3.3) holds if and only if for any test function $\varphi$ from $L^\infty(K)$ in the measurable case, or from the Banach space $C(K)$ of real valued continuous functions when $K$ is a topological space, the function
$$f_\varphi(x) = \langle \kappa_x, \varphi \rangle \qquad (3.6)$$
is $P$-harmonic in the usual sense. In particular, a non-constant harmonic system exists only if there are non-constant bounded $P$-harmonic functions on $X$.
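Proposition 3.7. If $(K, \kappa)$ is a compact separable $P$-stationary space, then with probability 1, the Markov chain $X = (X_n)$ has a random weak* limit
$$\kappa_X = \text{w*-}\lim_{n \to \infty} \kappa_{X_n},$$
and the barycentre of the arising family of measures $\{\kappa_X\}$ on $K$ with respect to any distribution $\mathbf{P}_x$, $x \in X$, on the path space is the measure $\kappa_x$:
$$\int \kappa_X \, d\mathbf{P}_x(X) = \kappa_x. \qquad (3.8)$$

Proof. As we have already mentioned, for any $\varphi \in C(K)$ the function $f_\varphi$ (3.6) on $X$ is $P$-harmonic and obviously bounded. Therefore, the sequence of its values $f_\varphi(X_n)$ along the sample paths of the chain is a bounded martingale with respect to the coordinate filtration of the path space (see Section 4.B below for more details), whence the limit
$$\lim_{n \to \infty} f_\varphi(X_n) \qquad (3.9)$$
exists for almost every sample path, and
$$\mathbf{E}_x \lim_{n \to \infty} f_\varphi(X_n) = f_\varphi(x). \qquad (3.10)$$
If $\Phi \subset C(K)$ is a countable dense subset, then by discarding the exceptional sets for each function $\varphi \in \Phi$ one obtains a co-negligible subset $\Omega$ of the path space such that the limit (3.9) exists on $\Omega$ for all functions $\varphi \in \Phi$, hence, by the density assumption, for all $\varphi \in C(K)$. Hence, this limit determines a non-negative normalised linear functional on $C(K)$, i.e., a Borel probability measure $\kappa_X$ on $K$ such that
$$\langle \kappa_X, \varphi \rangle = \lim_{n \to \infty} f_\varphi(X_n) = \lim_{n \to \infty} \langle \kappa_{X_n}, \varphi \rangle \qquad \text{for all } \varphi \in C(K), \qquad (3.11)$$
and convergence in (3.9) is precisely the weak* convergence of the measures $\kappa_{X_n}$ to $\kappa_X$ on $\Omega$. Now, in terms of the limit measures $\kappa_X$ (3.11) formula (3.10) takes the form
$$\int \langle \kappa_X, \varphi \rangle \, d\mathbf{P}_x(X) = \langle \kappa_x, \varphi \rangle,$$
which proves the statement on the barycentre.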
Example 3.12. Let $X$ be a countable group continuously acting on a compact space $K$. Given a probability measure $\mu$ on $X$, a measure $\kappa$ on $K$ is called $\mu$-stationary if it is preserved by the convolution with $\mu$:
$$\mu * \kappa = \kappa, \qquad \text{where} \qquad \mu * \kappa = \sum_{g \in X} \mu(g)\, g\kappa. \qquad (3.13)$$

Extending the notion of a $\mu$-boundary for random walks on groups introduced by Furstenberg [Fur73, Section 8], at this point we formulate the following.

Definition 3.14. A compact separable $P$-stationary space is a topological $P$-boundary if with probability 1, the random limit measure $\kappa_X$ is a delta measure at a random point.
As we shall see in Proposition 4.6 below, topological P -boundaries considered as measure spaces can be characterised as quotients of the Poisson boundary of the chain (X, P ).

3.B. Compactifications and the Dirichlet problem.
A priori the stationary space $K$ in Definition 3.1 and Proposition 3.7 does not have to be "attached" to the state space $X$ in any way. Let us now look at the situation when $K$ is the boundary $\partial X$ of a compactification $\overline{X} = X \cup \partial X$ of the state space $X$. We only consider compactifications for which
$\partial X$ is separable. (SC)
(Since $X$ is countable, the compactification space $\overline{X}$ is always separable. Still, the boundary $\partial X$ need not be separable in general, as is the case, e.g., for the Stone–Čech compactification.)

Definition 3.15. A compactification of the state space $X$ of a Markov chain is stochastically resolutive with respect to this chain if $X_n$ converges almost surely to the compactification boundary, i.e., with probability 1 there exists the limit
$$X_\infty = \lim_{n \to \infty} X_n \in \partial X.$$
The resulting images $\kappa_x$ of the measures $\mathbf{P}_x$ under the limit map are called the hitting distributions of the Markov chain.
This definition alludes to the notion of resolutivity from classical potential theory (e.g., see Lukeš -Netuka -Veselý [LNV02, Section 4]), cf. the remark at the beginning of Section 8 in Woess [Woe96].By the Markov property the system of hitting measures κ x of a stochastically resolutive compactification is P -harmonic in the sense of Definition 3.1.
Proposition 3.16. The boundary $\partial X$ of a stochastically resolutive compactification endowed with the family of the hitting measures $(\kappa_x)$ is a $P$-boundary, and $\kappa_X = \delta_{X_\infty}$.

This is a consequence of a general result on the identification of $P$-boundaries with the quotients of the Poisson boundary (Proposition 4.6) which we relegate to Section 4.B.

Definition 3.17. A compactification of the state space $X$ of a Markov chain with the transition operator $P$ is Dirichlet regular with respect to this chain if for any function $\varphi \in C(\partial X)$ there is a unique $P$-harmonic function $f_\varphi$ on $X$ (the solution of the Dirichlet problem with the boundary data $\varphi$) that provides a continuous extension of $\varphi$ to all of $\overline{X}$. In this situation for any $x \in X$ the map $\varphi \mapsto f_\varphi(x)$ is a norm 1 positive linear functional on $C(\partial X)$ represented by a Borel probability measure $\kappa_x$ on $\partial X$ (≡ the harmonic measure with pole at $x$) as
$$f_\varphi(x) = \langle \kappa_x, \varphi \rangle. \qquad (3.18)$$
The system of harmonic measures from Definition 3.17 is $P$-harmonic in the sense of Definition 3.1 (cf. Remark 3.4).

Proposition 3.19 (Woess [Woe96, Theorem 2.2], [Woe00, Theorem 20.3]). A compactification $\overline{X} = X \cup \partial X$ of the state space $X$ of a transient Markov chain which satisfies (SC) is Dirichlet regular if and only if the following two conditions hold:
(i) the compactification is stochastically resolutive;
(ii) the system of the hitting measures $\kappa_x$ has the property that
$$\text{w*-}\lim_{x \to \xi} \kappa_x = \delta_\xi \qquad \text{for all } \xi \in \partial X.$$
In this situation the measures arising from the solvability of the Dirichlet problem coincide with the hitting measures $\kappa_x$.

Remark 3.20. In terms of the boundary convergence the difference between stochastic resolutivity and Dirichlet regularity is that in the latter case the harmonic measures $\kappa_{x_n}$ converge to the delta measure $\delta_{x_\infty}$ at the limit point $x_\infty = \lim_n x_n \in \partial X$ of any sequence $(x_n)$ in $X$ that converges to the boundary.

Remark 3.21. If a $P$-stationary space $(K, \kappa)$ is compact, then the map $\kappa : x \mapsto \kappa_x$ (3.2) provides an embedding of the discrete space $X$ into the compact space $\mathrm{Prob}(K)$ of Borel probability measures on $K$ endowed with the weak* topology, and therefore it gives rise to a compactification of $X$ whose boundary is the collection of all weak* limit points of the system $(\kappa_x)$. This idea goes back to Furstenberg [Fur63b, Chapter II] who used it to define a compactification of Riemannian symmetric spaces. Proposition 3.7 then implies that this compactification is stochastically resolutive.
Our various preliminary considerations lead to the following, which is going to be a basic tool for proving a.s. convergence of the empirical distributions to a random distribution on the boundary.

Proof. Let $\varphi \in C(\partial X)$ be a continuous test function on $\partial X$, and let $f_\varphi \in C(\overline{X})$ be its harmonic extension to the whole of $\overline{X}$. Then by the definitions of the harmonic measures and of the weak* convergence whence the claim.

Corollary 3.23. Under the conditions of Proposition 3.22, let $\theta_n \in \mathrm{Prob}(X)$ be a sequence of measures escaping to infinity on $X$ (i.e., such that $\theta_n(x) \to 0$ for any $x \in X$). Then the sequence $\theta_n$ converges if and only if the sequence of the harmonic measures $\kappa_{\theta_n}$ converges, and the limits of these two sequences coincide.

Proof. The claim follows from the compactness of the space $\mathrm{Prob}(\overline{X})$ in the weak* topology. Indeed, if $\kappa_{\theta_n}$ is convergent, whereas $\theta_n$ is not, then by the compactness the sequence $\theta_n$ has at least two different limit measures $\theta^1_\infty, \theta^2_\infty$ which by the escape assumption are supported by $\partial X$. By taking sub-sequences of $\theta_n$ converging to $\theta^1_\infty$ and to $\theta^2_\infty$, respectively, one then arrives at a contradiction with Proposition 3.22.

3.C. Population convergence.
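We finally come to the application, resp. extension, of the results from subsection 3.A and subsection 3.B to the setup of branching Markov chains (see subsection 1.B). We recall that, given a map $\kappa : X \to \mathrm{Prob}(K)$, we denote by
$$m \mapsto \kappa_m = \sum_{x \in X} m(x)\, \kappa_x$$
its extension (3.5) to the population space $M = \mathbb{Z}_+(X)$ over $X$, so that, in particular,
$$\|\kappa_m\| = \|m\|. \qquad (3.24)$$
The normalisation
$$\overset{\frown}{\kappa}_m = \frac{\kappa_m}{\|m\|} = \frac{1}{\|m\|} \sum_{x \in X} m(x)\, \kappa_x \qquad (3.25)$$
is then the average of the measures $\kappa_x$ over a population $m$ (where $m$, as always, is treated as a multiset).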
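The extension of $\kappa$ to populations and its normalisation amount to a weighted sum over the multiset $m$. A minimal Python sketch, assuming a finite target space $K$ and representing measures as dictionaries (the names `population_measure` and `normalised`, as well as the data, are hypothetical):

```python
from collections import Counter
from fractions import Fraction

def population_measure(m, kappa):
    # kappa_m = sum over the particles of m of kappa_x, counted with multiplicity
    total = Counter()
    for x, mult in m.items():
        for point, mass in kappa[x].items():
            total[point] += mult * mass
    return dict(total)

def normalised(m, kappa):
    # average of the measures kappa_x over the population m
    size = sum(m.values())
    return {pt: mass / size for pt, mass in population_measure(m, kappa).items()}

half = Fraction(1, 2)
# hypothetical map kappa from a two-point state space to measures on K = {"a", "b"}
kappa = {"x": {"a": half, "b": half},
         "y": {"a": Fraction(1, 4), "b": Fraction(3, 4)}}
m = Counter({"x": 1, "y": 3})   # one particle at x, three particles at y
print(population_measure(m, kappa))
print(normalised(m, kappa))
```

The total mass of `population_measure(m, kappa)` equals the population size, and dividing by it yields a probability measure.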
Theorem 3.26. If
(1) a branching Markov chain on the state space $X$ has constant branching ratio $\rho > 1$, and its offspring distributions satisfy the uniform $L \log L$ moment condition,
(2) $(K, \kappa)$ is a separable compact stationary space for the underlying Markov chain on $X$,
then
(I) for almost every sample path $M = (M_n)$ of the branching Markov chain there exists the limit
$$\kappa_M = \text{w*-}\lim_{n \to \infty} \frac{1}{\rho^n}\, \kappa_{M_n},$$
which is a positive finite Borel measure on $K$;
(II) the barycentre of the measures $\{\kappa_M\}$ with respect to any distribution $\mathbf{P}_x$, $x \in X$, on the path space of the branching Markov chain is $\kappa_x$:
$$\int \kappa_M \, d\mathbf{P}_x(M) = \kappa_x.$$
In particular, if
(3) $\overline{X} = X \cup \partial X$ is a compactification of the state space $X$ which satisfies (SC) and is stochastically resolutive with respect to the underlying Markov chain,
then (I) and (II) hold for the associated family of hitting distributions on the boundary $\partial X$. Furthermore, if in addition
(4) the compactification is Dirichlet regular for the underlying Markov chain,
then
(III) for almost every sample path of the branching Markov chain,
$$\text{w*-}\lim_{n \to \infty} \frac{1}{\rho^n}\, M_n = \kappa_M.$$

Proof. The argument for the proof of (I) and (II) is essentially the same as in the proof of Proposition 3.7 (which could potentially be generalised to allow the measures from a harmonic family to be not necessarily normalised and to depend on time, cf. Section 4.D). The only difference is that the arising martingales of the branching Markov chain are not bounded. Still, they are dominated by the uniformly integrable population martingale.
Let us first take a test function $\varphi \in C(K)$, let
$$f(x) = f_\varphi(x) = \langle \kappa_x, \varphi \rangle \qquad (3.27)$$
be the corresponding harmonic function of the underlying Markov chain on $X$, and let
$$\tilde f(m) = \langle \kappa_m, \varphi \rangle = \sum_{x \in X} m(x)\, f(x)$$
be its lift to $M$. Then by Corollary 1.16 and Remark 3.4 the function $\tilde f$ is $\rho$-harmonic for the branching Markov chain, whence the sequence of random variables
$$W^f_n = \rho^{-n} \langle \kappa_{M_n}, \varphi \rangle$$
on the path space of the branching Markov chain is a martingale with
$$|W^f_n| \le \|\varphi\|\, W_n, \qquad (3.28)$$
where $W_n = W_n(M)$ is the population martingale of Definition 0.6. Then Theorem 2.3 on the uniform integrability of $(W_n)$ implies the uniform integrability of the martingale $(W^f_n)$ as well, so that the limit
$$W^f_\infty = \lim_{n \to \infty} W^f_n \qquad (3.29)$$
exists for almost all sample paths and has the property that
$$\mathbf{E}_x W^f_\infty = f(x). \qquad (3.30)$$
If $\Phi \subset C(K)$ is a countable dense subset, then there is a common co-negligible subset $\Omega$ of the path space such that the limit (3.29) exists and satisfies (3.30) for all $M \in \Omega$ and any $\varphi \in \Phi$. By (3.28)
$$|W^{f_1}_n - W^{f_2}_n| \le \|\varphi_1 - \varphi_2\|\, W_n,$$
where $f_i = f_{\varphi_i}$, $i = 1, 2$, are the harmonic functions (3.27) associated with the functions $\varphi_i$. Therefore, the limit (3.29) exists on $\Omega$ for all $\varphi \in C(K)$ and satisfies condition (3.30). For any fixed realisation of $M$ on $\Omega$, it defines a positive linear functional on $C(K)$, i.e., a non-negative Borel measure $\kappa_M$ with total mass
$$\|\kappa_M\| = W_\infty(M), \qquad (3.31)$$
which is strictly positive by Theorem 2.4. The identity (II) is then precisely the fact that (3.30) is satisfied for all $\varphi \in C(K)$. Finally, the existence of a stochastically resolutive compactification obviously implies the transience of the underlying chain, and therefore (I) implies (III) in view of Corollary 3.23.
In the course of the proof of Theorem 3.26 we have seen, in formula (3.31), that the norm of the limit measure κ M is the limit W ∞ (M) of the population martingale, whence we get the following.
Theorem 3.32. Under conditions (1) and (2) of Theorem 3.26, for almost all sample paths $M = (M_n)$ of the branching Markov chain the averages
$$\overset{\frown}{\kappa}_{M_n} = \frac{1}{\|M_n\|}\, \kappa_{M_n}$$
converge in the weak* topology of $\mathrm{Prob}(K)$ to the probability measure $\overset{\frown}{\kappa}_M$ (the normalisation of the measure $\kappa_M$ from Theorem 3.26), and
$$\int W_\infty(M)\, \overset{\frown}{\kappa}_M \, d\mathbf{P}_x(M) = \kappa_x.$$
In particular, this is the case for the boundary $\partial X$ of any stochastically resolutive compactification $\overline{X} = X \cup \partial X$ of the state space of the underlying Markov chain which satisfies (SC), endowed with the family of the arising hitting measures. Moreover, if the compactification is Dirichlet regular, then the empirical distributions $\frac{1}{\|M_n\|} M_n$ converge almost surely to the limit measure $\overset{\frown}{\kappa}_M$ in the weak* topology of $\mathrm{Prob}(\overline{X})$.
Remark 3.33.Under conditions (1) and (2) of Theorem 3.26, the random limit probability measure ⌢ κ M is a random point mass if and only if there is a deterministic element z ∈ K such that κ x = δ z for all x ∈ X.In this case, also ⌢ κ M = δ z is deterministic.
Proof.The "if" as well as the last statement are obvious.For the interesting part, we need some refined notation.We write M x = (M x n ) n≥0 for the branching Markov chain starting at time 0 with one particle at position x ∈ X, and the other related objects will also be equipped with the superscript x.In particular, we denote by ⌢ κ x n the normalised measure associated with the population at time n according to (3.25), that is, Here and below, one must observe (without adding further involved notation) that the elements y ∈ M x t appear according to their multiplicity, and the respective norms M y n and measures ⌢ κ y n are independent (in particular, not identical).If we let n → ∞ and apply Theorem 3.32 then we get and the respective limits W ∞ (M y ) and limit measures ⌢ κ y M are independent among themselves (including multiple appearances), but not independent of M x .The sum in (3.34) is a convex combination with a.s.strictly positive coefficients by Theorem 2.4.
We now take $t$ to be the first moment when $\|M^x_t\| \ge 2$. By our assumptions, this is an a.s. finite stopping time.
But the latter measures (at least 2 of them) are independent, and it is a straightforward exercise that $\zeta$ must be deterministic.
We note that the last proposition is related to the issue of triviality of the Poisson boundary.The latter will be considered further below.

3.D. Adaptedness conditions.
Having in mind the above boundary convergence results, we are now going to list several compactifications of the state space $X$ of a discrete Markov chain (to be thought of as the underlying chain of a branching Markov chain) and comment upon the key properties of these compactifications required in Theorem 3.26: stochastic resolutivity and Dirichlet regularity. Suppose that $X$ carries a certain geometric, algebraic or combinatorial structure, and that the transition operator $P$ is adapted in some way (to be specified in more detail) to that structure. In this situation the Markov chain is usually called a random walk (so that the corresponding branching Markov chain becomes a branching random walk, cf. Example 1.24). How does its adaptedness affect the behaviour of the chain? For the next considerations, we assume that
$X$ carries the structure of an unoriented infinite graph which is connected and locally finite, i.e., for every vertex $x \in X$ the cardinality $\deg(x)$ of its set of neighbours $N(x)$ is finite. (LFC)
We denote by $E(X) \subset X \times X$ the edge set of $X$ and write $d(x, y)$ for the graph distance on $X$. We recall that the transition operator $P$ is always assumed to satisfy condition (TC), i.e., to be transient and to have pairwise communicating states. Here is a list of different basic geometric adaptedness conditions. The random walk $(X, P)$ with the transition probabilities $p(x, y) = p_x(y)$ is said to be
• simple, if for any $x \in X$ the transition measure $p_x$ is equidistributed on the set of neighbours $N(x)$, i.e., $p(x, y) = 1/\deg(x)$ for $[x, y] \in E(X)$, and $p(x, y) = 0$ otherwise;
• of bounded range, if there is $R < \infty$ such that $p(x, y) > 0$ only when $d(x, y) \le R$;
• uniformly irreducible, if there are $N < \infty$ and $\varepsilon > 0$ such that for any edge $[x, y] \in E(X)$ there is a time $n \le N$ with $p^{(n)}(x, y) \ge \varepsilon$.
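The first two conditions are easy to state in code. The following Python sketch builds the transition probabilities of the simple random walk from an adjacency list and tests the bounded range condition; it works on a finite window of the grid $\mathbb{Z}^2$ purely for illustration (a finite graph does not satisfy the standing assumptions of this section), and the helper names are hypothetical.

```python
from fractions import Fraction

def simple_random_walk(adjacency):
    # p(x, y) = 1/deg(x) for neighbours y of x, and 0 otherwise
    return {x: {y: Fraction(1, len(nbrs)) for y in nbrs}
            for x, nbrs in adjacency.items()}

def has_range_at_most(p, dist, R):
    # bounded range: p(x, y) > 0 only when d(x, y) <= R
    return all(dist(x, y) <= R
               for x, row in p.items() for y, w in row.items() if w > 0)

# finite window of the grid Z^2; vertices on the rim of the window simply
# keep those neighbours that lie inside it
n = 4
vertices = [(i, j) for i in range(n) for j in range(n)]
adjacency = {v: [w for w in vertices
                 if abs(v[0] - w[0]) + abs(v[1] - w[1]) == 1]
             for v in vertices}
p = simple_random_walk(adjacency)

def grid_distance(x, y):
    # graph distance inside the window
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

print(p[(0, 0)], has_range_at_most(p, grid_distance, 1))
```

Uniform irreducibility could be tested in the same spirit by computing the $n$-step probabilities $p^{(n)}(x, y)$ for $n \le N$ along all edges.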
One can also impose various tightness or moment conditions on the distributions of the distances d(x, y) with respect to the transition probabilities p(x, y), e.g., the uniform first moment condition from Definition 2.2 (see Kaimanovich -Woess [KW92, Section 3] for a detailed discussion).
We now consider algebraic adaptedness. A graph $X$ is called vertex transitive if its group of automorphisms $\mathrm{Aut}(X)$ acts transitively on the vertex set. This is the case for the Cayley graph of any finitely generated group with respect to a finite symmetric set of generators $S$ (i.e., $[x, y]$ is an edge if and only if $x^{-1} y \in S$). There are also vertex transitive graphs which are not Cayley graphs.
Given a transition operator $P$ on a state space $X$ (not necessarily endowed with a graph structure), one can define the automorphism group of the Markov chain $(X, P)$ as
$$\mathrm{Aut}(X, P) = \{ g \in \mathrm{Perm}(X) : p(gx, gy) = p(x, y) \ \text{for all } x, y \in X \},$$
where $\mathrm{Perm}(X)$ denotes the group of all permutations (not necessarily finitely supported!) of $X$. A natural algebraic adaptedness condition in this situation is to require that $\mathrm{Aut}(X, P)$ (or a subgroup) act transitively on $X$, or at least quasi-transitively, which consists in requiring that the action of the corresponding automorphism group have finitely many orbits. This condition is satisfied for so-called random walks with internal degrees of freedom (also known under numerous other names, in particular, as semi-Markov, covering, or coloured chains), e.g., see Kaimanovich–Woess [KW02] and the references therein.

3.E. A zoo of compactifications.
We now display several "geometric" compactifications of a graph X satisfying conditions (LFC ) and discuss if and how Theorem 3.26 applies.
Example 3.35 (end compactification). This definition goes back to Freudenthal [Fre45] and Halin [Hal64]. We denote by $C(F)$ the (finite!) collection of all infinite connected components of the graph $X \setminus F$ obtained from $X$ by removing a finite set of edges $F \subset E(X)$. The end compactification $\overline{X}_E = X \cup \partial_E X$ is the unique (up to homeomorphisms) minimal compactification of $X$ to which all the indicator functions $1_C$ of connected components $C \in C(F)$ extend continuously. The space of ends $\partial_E X$ is the projective limit of the discrete spaces $C(F)$ as $F \to X$. There is also a more explicit graph-theoretical description.
Example 3.36 (hyperbolic compactification).A graph X is called hyperbolic, if it is a Gromov-hyperbolic metric space with respect to the standard graph metric.We refrain from re-stating all features of hyperbolic spaces and groups.See the original paper by Gromov [Gro87] (and its numerous renditions), or, in the context of random walks on graphs and groups, Woess [Woe00, §22].A hyperbolic graph X has a hyperbolic compactification X H with the hyperbolic boundary ∂ H X.
Example 3.37 (Floyd compactification).Let f : Z + → (0, ∞) be a summable function such that there is 0 < c < 1 with We define the f-length ℓ f (π) of any finite path π in X as the sum of the f-lengths We denote by X f the completion of X with respect to this metric, with the resulting The space X f is compact and does not depend on the choice of the root o (the identity map on X extends to a homeomorphisms between the compactifications corresponding to different roots), see Floyd [Flo80] and Karlsson [Kar03a, Kar03b, Kar03c].
The end, the hyperbolic (provided the graph is hyperbolic), and the Floyd compactifications have the following common features (see the aforementioned references):
• The action of the group of automorphisms $\mathrm{Aut}(X)$ on $X$ extends to its action on the whole compactification by homeomorphisms;
• If the graph is vertex-transitive, then the boundary of the compactification consists of one, two, or uncountably many points;
• All these compactifications are contractive $\mathrm{Aut}(X)$-compactifications in the sense of Woess [Woe93].
The Floyd compactification is finer than the end one, i.e., there exists a (necessarily surjective) continuous map $\pi : \overline{X}_f \to \overline{X}_E$ (an extension of the identity map on $X$) such that the embedding $X \hookrightarrow \overline{X}_E$ is the composition of the embedding $X \hookrightarrow \overline{X}_f$ with $\pi$. If the graph $X$ is hyperbolic, then the hyperbolic compactification is intermediate between the other two, i.e., it is coarser than the Floyd one and finer than the end one (the latter fact was, to our knowledge, first explicitly stated by Pavone [Pav89]):
$$\overline{X}_f \to \overline{X}_H \to \overline{X}_E.$$
Note that in general even the vertex-transitive graphs with infinitely many ends may be quite far from being hyperbolic.

The following result from Woess [Woe93, Section 4] provides sufficient conditions for the applicability of Theorem 3.26.

Proposition 3.38. Let $X$ be a graph satisfying (LFC), and let $\overline{X}$ be one of its compactifications (the end, the hyperbolic if $X$ is hyperbolic, or the Floyd one) with infinite boundary $\partial X$. If the group $\mathrm{Aut}(X, P)$ acts quasi-transitively on $X$ and does not fix a boundary point, then the compactification is Dirichlet regular.
The case when Aut(X, P ) fixes a boundary point is a very special one; we omit the details here.
Next, we review the situation when no group invariance is assumed. We recall that the spectral radius of a transition operator $P$ is defined as
$$r(P) = \limsup_{n \to \infty} p^{(n)}(x, y)^{1/n};$$
under condition (TC) it does not depend on the choice of $x, y \in X$ (see, e.g., Woess [Woe00]).
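For a concrete feeling of the spectral radius, assume the standard definition $r(P) = \limsup_n p^{(n)}(x,y)^{1/n}$ and take the nearest neighbour random walk on $\mathbb{Z}$ with step probabilities $p$ to the right and $q = 1 - p$ to the left (an example chosen here for illustration, not taken from the text). Its return probabilities are $p^{(2n)}(0,0) = \binom{2n}{n} (pq)^n$, and the roots approach $2\sqrt{pq}$, which is strictly less than 1 as soon as $p \ne 1/2$. A minimal Python sketch:

```python
import math

def return_probability_root(p, n):
    """(p^(2n)(0,0))^(1/(2n)) for the nearest-neighbour walk on Z that steps
    +1 with probability p and -1 with probability 1 - p."""
    q = 1.0 - p
    # log of binom(2n, n) * (p*q)^n, computed via log-gamma to avoid overflow
    log_prob = (math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)
                + n * math.log(p * q))
    return math.exp(log_prob / (2 * n))

p = 0.7   # hypothetical drift parameter
for n in (10, 100, 1000, 10000):
    print(n, return_probability_root(p, n))
print("2*sqrt(p*q) =", 2 * math.sqrt(p * (1 - p)))
```

Only even times are used because the walk has period 2, so the upper limit in the definition is attained along even $n$.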
Proposition 3.40. Suppose that (TC) holds. The end compactification is stochastically resolutive if one of the following two conditions holds:
(i) $P$ has bounded range;
(ii) $P$ is uniformly irreducible, has a uniform first moment, and $r(P) < 1$.
Moreover, it is Dirichlet regular if, in addition to (i) or (ii), the following respective conditions hold:
(i′) under condition (i): the Green kernel vanishes at infinity;
(ii′) under condition (ii): there are $C > 0$ and $r < 1$ such that condition (3.41) holds.

Remark 3.42. With small modifications, Proposition 3.40 also holds for the other two compactifications.
(a) If the graph X is hyperbolic and r(P ) < 1, then Proposition 3.40 holds for the hyperbolic compactification of X as well.
(b) For the Floyd compactifications in the absence of a transitive group action, the case of simple random walk has been touched upon by Karlsson [Kar03c]. This has recently been generalised by Spanos [Spa21]; in particular, stochastic resolutivity, resp. Dirichlet regularity, hold under conditions (ii), resp. (ii)+(ii′).
For proofs and references regarding the end and hyperbolic cases, as well as the question of validity of condition (3.41), see Woess [Woe00], in particular Sections 21-22.
Boundary convergence of Markov chains is a vast and active area, and the purpose of the examples above is just to convey its flavour as a backdrop for Theorem 3.26 rather than to provide any comprehensive overview.Without going into further details, let us mention, for instance, the related work on the visual compactification of Cartan -Hadamard manifolds, the Busemann (or horospheric) compactification of metric spaces, various compactifications of Riemannian symmetric spaces, boundaries of planar graphs, the Thurston compactification of Teichmüller space, etc. etc.
Let us finally discuss a compactification of the state space $X$ intrinsically determined just by the transition operator $P$. The latter, as always, is assumed to satisfy condition (TC), and therefore a normalisation of the Green kernel $G$ produces the Martin kernel
$$K_o(x, y) = \frac{G(x, y)}{G(o, y)},$$
where $o \in X$ is a fixed reference point. The Martin compactification is the unique (up to homeomorphisms) minimal compactification $\overline{X}_M = X \cup \partial_M X$ of the state space $X$ to which each function $K_o(x, \cdot)$, $x \in X$, extends continuously in the second variable (e.g., see Woess [Woe09] for a detailed exposition). The Martin compactification of a bounded range Markov operator $P$ on a graph $X$ is known to be comparable with the aforementioned geometric compactifications of $X$ in the following situations: (i) it is finer than the end compactification; (ii) it coincides with the hyperbolic compactification if $X$ is hyperbolic and $r(P) < 1$; (iii) it is finer than the Floyd compactification if $(X, P)$ is a random walk on (the Cayley graph of) a finitely generated group. For (i) and (ii), see Woess [Woe00, Chapter IV] and the references therein; (iii) is very recent and due to Gekhtman, Gerasimov, Potyagailo and Yang [GGPY21].
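The Martin kernel can be computed by hand in simple cases. The Python sketch below assumes the usual normalisation $K_o(x, y) = G(x, y)/G(o, y)$ and uses the nearest neighbour walk on $\mathbb{Z}$ with step probabilities $p > q$ as a toy example (not taken from the text). For this walk the probability $F(x, y)$ of ever reaching $y$ from $x$ is 1 if $x \le y$ and $(q/p)^{x-y}$ otherwise, and $G(x, y) = F(x, y)\, G(y, y)$, so that the kernel reduces to a ratio of hitting probabilities.

```python
from fractions import Fraction

p, q = Fraction(2, 3), Fraction(1, 3)   # hypothetical step probabilities, p > q
r = q / p

def hit(x, y):
    # F(x, y): probability that the walk started at x ever visits y
    return Fraction(1) if x <= y else r ** (x - y)

def martin_kernel(x, y, o=0):
    # K_o(x, y) = G(x, y) / G(o, y) = F(x, y) / F(o, y), because
    # G(x, y) = F(x, y) * G(y, y) and the factor G(y, y) cancels
    return hit(x, y) / hit(o, y)

for x in (-2, 0, 3):
    # far to the right the kernel equals 1, far to the left it equals (q/p)^x
    print(x, martin_kernel(x, 10**3), martin_kernel(x, -10**3), r ** x)
```

In this example the kernel stabilises as $y \to \pm\infty$, with the two limit functions $1$ and $(q/p)^x$, both of which are harmonic for the walk.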
Every positive harmonic function $h$ has an integral representation
$$h(x) = \int_{\partial_M X} K_o(x, \xi)\, d\nu^h_o(\xi)$$
for a unique Borel measure $\nu^h_o$ on $\partial_M X$. The Martin compactification is stochastically resolutive, and the hitting distribution $\nu_o$ issued from the reference point $o$ is precisely the representing measure $\nu^1_o$ of the constant harmonic function $1(x) \equiv 1$. There are various classes of Markov chains for which the Martin compactification is Dirichlet regular, but there are also many classes for which it is not. In any case, at least case (3) of Theorem 3.26, and hence its conclusions (I) and (II), always applies to the Martin compactification.
After mentioning that the Martin boundary considered as a measure space endowed with the family of the representing measures ν 1 o is isomorphic to the Poisson boundary of the chain, we shall now switch to discussing the boundary behaviour of branching Markov chains in the measure-theoretic setting.

4. Boundary correspondence

4.A. Motivation: topological case.
In the topological setup, as we saw in the previous section (Theorem 3.26 and Theorem 3.32), under suitable conditions there is a natural map $M \mapsto \kappa_M$, which assigns to almost every sample path of the branching Markov chain $M = (M_n)$ a finite positive weak* limit measure $\kappa_M$ on a stationary space $K$. The total mass $\|\kappa_M\|$ is the limit of the population martingale (0.7), and its normalisation $\overset{\frown}{\kappa}_M = \kappa_M / \|\kappa_M\|$ is a random probability measure on $K$ which can be interpreted as a limit of the population averages. In particular, if $K = \partial X$ is the boundary of a Dirichlet regular compactification of the state space $X$, then $\overset{\frown}{\kappa}_M$ is the weak* limit of the empirical distributions on the populations $M_n$. The purpose of this section is to show that the limit measures associated with the sample paths of the branching Markov chain can also be defined by entirely measure-theoretical means not involving any topological convergence, as the transition probabilities of a certain Markov transfer operator acting between two measure spaces. This will allow us to define the limit distributions of the branching Markov chain on the measure-theoretical boundaries of the underlying chain.
Before proceeding further, let us return to Theorem 3.32.It provides a measurable family of probability measures ⌢ κ M on the stationary space K parameterised by the sample paths of the branching Markov chain M. Considered as a Markov kernel from the path space M Z + to K, this family gives rise to a positive norm 1 linear operator from C(K) to the space L ∞ (M Z + ) of bounded measurable functions on the path space of the branching chain with respect to the default measure class (FS ).
We will now go in the opposite direction by first defining an appropriate transfer operator and then using it to produce the associated family of boundary measures.

4.B. Tail and Poisson boundaries.
We begin by recalling the basic definitions and facts from the measurable boundary theory of Markov chains, see Kaimanovich [Kai92] and the references therein for more details. Given a transition operator $P$ on a countable state space $X$ (or, equivalently, the corresponding family of transition probabilities), this theory provides an integral representation of bounded harmonic functions, or, more generally, of bounded harmonic sequences (≡ space-time harmonic functions)
$$f = (f_n), \qquad f_n = P f_{n+1}.$$
By $A^\infty_n$ we denote the σ-algebra on the path space $X^{\mathbb{Z}_+}$ determined by the positions of the chain at times $\ge n$. The intersection
$$A^\infty = \bigcap_{n} A^\infty_n$$
is the tail σ-algebra of the Markov chain $(X, P)$, and it gives rise to the tail boundary $T_P X$ defined in the measure category by using Rokhlin's correspondence between (complete) sub-σ-algebras of Lebesgue measure spaces and their quotient spaces; see e.g. Coudène [Cou16, Chapter 15]. We denote the corresponding quotient map by
$$\mathrm{tail} = \mathrm{tail}_P : X^{\mathbb{Z}_+} \to T_P X. \qquad (4.2)$$
The tail boundary is endowed with the harmonic measure class, which is the tail image of the default measure class (FS) on the path space, and the notation $L^\infty(T_P X)$ refers to the harmonic measure class. Any initial position $(n, x)$ from the space-time $\mathbb{Z}_+ \times X$ determines the associated harmonic probability measure $\eta_{(n,x)}$ on $T_P X$ absolutely continuous with respect to the harmonic measure class, and
$$f_n(x) = \langle \eta_{(n,x)}, \hat f \rangle \qquad (4.3)$$
is a space-time $P$-harmonic function for any $\hat f \in L^\infty(T_P X)$. Equivalently, $f$ is a harmonic function of the space-time Markov chain on $\mathbb{Z}_+ \times X$ (for which the spatial transitions are accompanied by increasing the time coordinate by one). Conversely, a space-time function $f = (f_n)$ is $P$-harmonic if and only if the sequence of its values $f_n(X_n)$ along the sample paths of the Markov chain is a martingale with respect to the increasing coordinate filtration of the path space. In particular, if $f$ is bounded, then the limit
$$\hat f(X) = \lim_{n \to \infty} f_n(X_n) \qquad (4.4)$$
exists and is measurable with respect to the tail σ-algebra, i.e., it determines a function $\hat f$ in $L^\infty(T_P X)$. Formulas (4.3) and (4.4) establish an isometric isomorphism of $L^\infty(T_P X)$ and of the space of bounded space-time $P$-harmonic functions endowed with the sup norm.
In the same way one defines (also in the measure category) the Poisson boundary $\partial_P X$ responsible for an integral representation of bounded $P$-harmonic functions (whence the name alluding to the classical Poisson formula for harmonic functions on the unit disk). It is the quotient of the path space under the boundary map
$$\mathrm{bnd} = \mathrm{bnd}_P : X^{\mathbb{Z}_+} \to \partial_P X,$$
determined by the exit σ-algebra (the sub-algebra of the tail σ-algebra consisting of shift invariant sets; this is why this σ-algebra is also sometimes called invariant). The following commutative diagram illustrates the relationship between the path space, the tail and the Poisson boundaries:
$$X^{\mathbb{Z}_+} \xrightarrow{\ \mathrm{tail}\ } T_P X \xrightarrow{\ p\ } \partial_P X, \qquad \mathrm{bnd} = p \circ \mathrm{tail}. \qquad (4.5)$$
The Poisson boundary can be interpreted as the space of ergodic components of the transformation $T$ of the tail boundary induced by the time shift on the path space, and the resulting projection is the map $p : T_P X \to \partial_P X$ in the above diagram. Formulas (4.3) and (4.4) restricted to the space of bounded $P$-harmonic functions (i.e., of space-time harmonic functions constant in time) establish its isometric isomorphism with the subspace of $L^\infty(T_P X)$ that consists of $T$-invariant functions. In terms of the Poisson boundary this isomorphism takes the form of the Poisson formula
$$f(x) = \langle \nu_x, \hat f \rangle,$$
where $\hat f \in L^\infty(\partial_P X)$, and $\nu_x$ are the harmonic measures on the Poisson boundary, i.e., the images of the measures $\mathbf{P}_x$ on the path space under the boundary map bnd or, equivalently, the images of the measures $\eta_x = \eta_{(0,x)}$ on $T_P X$ under the quotient map $p : T_P X \to \partial_P X$.

We now provide a characterisation of the topological $P$-boundaries of a Markov chain in terms of quotients of the Poisson boundary mentioned after Definition 3.14.

Proposition 4.6. Let $P$ be a Markov operator on a countable state space $X$, and let $(K, \kappa)$ be a compact separable $P$-stationary space. It is a $P$-boundary if and only if, as a measure space, it is a quotient of the Poisson boundary $\partial_P X$, i.e., there exists a measurable map $q : \partial_P X \to K$ such that $q(\nu_x) = \kappa_x$ for all $x \in X$.
Proof. Let K be a P -boundary. Then for almost every sample path of the Markov chain X = (X n ) the weak* limit κ X = w*-lim t→∞ κ X t is a delta measure. This provides a map from the path space to K which is measurable with respect to the exit σ-algebra, i.e., the sought-for measurable map q : ∂ P X → K, and by formula (3.8) from Proposition 3.7 q(ν x ) = κ x for all x ∈ X.
Conversely, let K, as a measure space, be a quotient of the Poisson boundary. Then any test function ϕ ∈ C(K), considered as an element of L ∞ (K), can be lifted to the function f̂ = ϕ • q ∈ L ∞ (∂ P X). Let f = f ϕ be the associated bounded harmonic function:

f(x) = ⟨ν x , f̂⟩ = ⟨κ x , ϕ⟩ .

Then

⟨κ X n , ϕ⟩ = f(X n ) → f̂(bnd X) = ϕ(q • bnd X)

for almost every sample path of the Markov chain X = (X n ), i.e., κ X = δ q•bnd X .
4.C. Boundaries of branching Markov chains. Now we pass to the branching chain on M determined by the transition probabilities Π m (1.8), or, equivalently, by the transition operator P (1.9). We denote the tail boundary of the branching Markov chain by T P M, and the Poisson boundary by ∂ P M. By η (t,m) and ν m we denote the harmonic measures on the respective tail and Poisson boundaries corresponding to an initial population m ∈ M (replacing, as in §1.B, the subscript δ x with x if the initial population is a singleton δ x ).
We recall that in what concerns random walks on countable groups (cf. Example 1.24), the difference between the tail and the Poisson boundaries is not very significant. They do coincide in the aperiodic case (Derriennic [Der76]); otherwise the fibres of the projection p (4.5) of the tail boundary onto the Poisson boundary are parameterised by the periodicity classes of the random walk (Jaworski [Jaw95]); in particular, p is always a bijection with respect to a one-point initial distribution (the latter fact plays a key role in the entropy theory of random walks on groups, see Derriennic [Der80] and Kaimanovich-Vershik [KV83]). The situation is similar for random walks on graphs under the uniform ellipticity condition (the transition probabilities between any two neighbouring vertices are bounded away from 0): in this case the tail and the Poisson boundaries also coincide with respect to any one-point initial distribution, and the cardinality of the fibres of the projection p is at most 2, see Kaimanovich [Kai92, Corollary 2 on p. 162].
Branching Markov chains are manifestly space inhomogeneous, and the difference between their tail and Poisson boundaries is much more pronounced. It can be illustrated already by the simplest example, the Galton-Watson process (Example 4.7).

Remark 4.8. It seems plausible that the tail and the Poisson boundaries admit a similar description for branching Markov chains over any finite state space (≡ multi-type branching processes with a finite number of types). As far as we know, this question has not been addressed in the literature.
If one passes to branching Markov chains over an infinite state space, then the problem of identification of the tail and the Poisson boundaries appears to be horizonless. This is indicated by the abundance of various martingales already in the simplest case of branching random walks on Z (e.g., see Shi [Shi15, Chapter 3]). To the best of our knowledge, this problem has not been formulated even in the aforementioned Z case.

We are now going to provide links between the measure-theoretic boundaries of a branching Markov chain and those of the underlying chain.

4.D. Harmonic martingales. Given two functions f and g on the state space X, their respective lifts to M (see Section 1.C) have the property that the transition operator P (1.9) of the branching chain takes the lift of f to the lift of g if and only if π • P f = g on X, see Proposition 1.14 and recall that x → π x is the assignment of the branching ratio at x. We immediately get the following.
Proposition 4.9. The lifts of a sequence of functions f n on X form a space-time P-harmonic function on M if and only if

f n = π • P f n+1 for all n ≥ 0 . (4.10)

Definition 4.12. The harmonic martingale determined by a space-time P -harmonic function f = (f n ) on X is the sequence of random variables

W f n = ⟨M n , f n ⟩ / ρ^n (4.13)

on the path space of the branching Markov chain M = (M n ).
We denote the pointwise (almost everywhere) limit of the harmonic martingale (4.13) by W f ∞ = lim n W f n . The random variable W f ∞ is tail measurable, and therefore it can be presented as the composition

W f ∞ = w f • tail P (4.14)

of the quotient map tail P : M Z + → T P M (4.2) with the arising measurable function w f on the tail boundary T P M of the branching Markov chain.
In the particular case when f = 1 the associated harmonic martingale (W 1 n ) is precisely the population martingale (W n ) introduced in Definition 0.6 and studied in Section 2. We denote by w = w 1 the function on the tail boundary determined by the pointwise limit W ∞ of the population martingale (the limit population ratio, see Definition 0.6). If the limit population ratio is positive (by Theorem 2.4 this is almost surely the case under the uniform L log L moment condition), then the ratio w f /w (4.15) is nothing but the limit empirical average of the functions f n along the branching Markov chain.
Remark 4.16. In the context of branching Markov chains the term "harmonic martingale" was used by Biggins-Cohn-Nerman [BCN99] for the sequence (in our notation) of the values at M n of the lifts of ϕ n , where (ϕ n ) is a sequence of functions on X whose lifts to the space of populations form a space-time P-harmonic function. Actually, in the setup of [BCN99] the state space X is endowed with a space-time partition into pairwise disjoint levels X n , n ≥ 0, such that X 0 consists of a single state x 0 , the branching chain starts at time 0 from a single particle sitting at x 0 , and at each step the population moves to the next level, so that the time n random population M n is concentrated on X n . Therefore, the sequence (ϕ n ) can be considered as a single function ϕ on the state space X with the property that its lift is a P-harmonic function on the population space M. Such functions on X are called mean-harmonic by Biggins-Kyprianou [BK04]. Our setup is slightly more general as we deal with space-time harmonic functions which do not necessarily come from a space-time partition of the state space. We feel that it is more consistent to deal with space-time harmonic functions (instead of the time-constant ones) from the very beginning. The reason is that martingales, by their very nature, are linked with the tail σ-algebra (rather than with the exit one), cf. Section 4.B and Theorem 4.17.
4.E. Boundary transfer operator. Below we are going to use the standard facts from the measurable theory of Markov operators, e.g., see Foguel [Fog80]. We recall that, given two measure spaces (X, µ X ) and (Y, µ Y ), a linear operator B : L ∞ (Y, µ Y ) → L ∞ (X, µ X ) is called Markov if it preserves constants, is positive and order continuous, i.e.,
(i) B1 Y = 1 X ,
(ii) Bϕ ≥ 0 for any ϕ ≥ 0 ,
(iii) Bϕ k ↓ 0 for any sequence ϕ k ↓ 0 .
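As a purely illustrative finite-dimensional sketch (not part of the construction in this paper), a Markov operator between function spaces on two finite sets can be modelled by a row-stochastic matrix. The function names and the nested-list representation below are assumptions made only for this example; in the finite setting order continuity (iii) is automatic, so only (i) and (ii) are checked.

```python
def apply_operator(kernel, phi):
    """Return B phi, where (B phi)(x) = sum_y kernel[x][y] * phi[y].

    `kernel` is a nested list with one row per point x of the target space
    and one column per point y of the source space (hypothetical encoding).
    """
    return [sum(b * p for b, p in zip(row, phi)) for row in kernel]


def is_markov(kernel, tol=1e-12):
    """Check properties (i) and (ii) for a finite kernel."""
    # (ii) positivity: nonnegative entries send nonnegative functions
    # to nonnegative functions
    nonnegative = all(b >= 0 for row in kernel for b in row)
    # (i) preservation of constants: B applied to the constant 1 is 1
    ones = apply_operator(kernel, [1.0] * len(kernel[0]))
    return nonnegative and all(abs(v - 1.0) <= tol for v in ones)


example_kernel = [[0.5, 0.5, 0.0],
                  [0.2, 0.3, 0.5]]
print(is_markov(example_kernel))
```

Each row of such a matrix is a probability measure on the source set, which is the finite analogue of the transition probabilities of a Markov operator.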
As we have explained in Section 4.B, formulas (4.3) and (4.4) establish an isometric isomorphism f̂ → f = (f n ) between L ∞ (T P X) and the Banach space of bounded space-time P -harmonic functions on the state space X endowed with the sup norm.

Proof of Theorem 4.17. Property (i) from the definition of a Markov operator follows from Theorem 2.4, whereas (ii) is obvious. We just have to verify property (iii), i.e., the order continuity of B. It is here that we use the uniform integrability of the population martingale, which implies that the operator B preserves the integrals with respect to appropriately chosen measures on the tail boundaries T P X and T P M.
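The first observation is that the uniform integrability of the population martingale (Theorem 2.3) implies the uniform integrability of the harmonic martingale W f n for any bounded space-time harmonic function f = (f n ) on X. Thus, for any initial population m ∈ M

⟨P m , W f 0 ⟩ = ⟨P m , W f ∞ ⟩ ,

or, in view of (4.13) and (4.14),

⟨m, f 0 ⟩ = ⟨η m , w f ⟩ ,

where η m = tail(P m ) is the harmonic measure on the tail boundary T P M corresponding to the initial distribution δ m . In terms of the boundary function f̂ ∈ L ∞ (T P X) representing f we then have

⟨η m , f̂⟩ = ⟨w•η m , B f̂⟩ , (4.19)

where on the left-hand side η m = Σ x∈m η x is a measure on T P X, and w•η m is the measure on T P M with the density w with respect to the harmonic measure η m , so that ‖η m ‖ = ‖w•η m ‖ = ‖m‖.

In view of the monotone convergence theorem, identity (4.19) then implies that if f̂ (k) ∈ L ∞ (T P X) with f̂ (k) ↓ 0 almost everywhere, then B f̂ (k) ↓ 0 almost everywhere with respect to all measures w•η m , m ∈ M, which by Theorem 2.4 is the same as the almost everywhere convergence with respect to the harmonic measure class on T P M.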
If a measurable space K is a quotient of the tail boundary T P X, then the precomposition of the transfer operator B (4.18) constructed in Theorem 4.17 with the lift L ∞ (K) → L ∞ (T P X) provides a Markov operator

B K : L ∞ (K) → L ∞ (T P M) , (4.20)

which we are going to compare with the operator B K (4.1). Since the map M → κ M is tail measurable by the definition of the measures κ M in Theorem 3.26, the operator B K (4.1) produces tail measurable functions on the path space, and therefore its range can be identified with the space L ∞ (T P M).
Theorem 4.21. Under the conditions of Theorem 4.17, if K is a compact separable P -boundary of the underlying Markov chain (X, P ), then the restriction of the operator B K (4.20) to the space C(K) coincides with the operator B K (4.1).

Theorem 4.22. If the offspring distributions of a branching Markov chain satisfy the uniform L log L moment condition, then the tail boundary T P X of the underlying chain is endowed with a family of probability measures η ξ indexed by the points ξ ∈ T P M of the tail boundary of the branching chain with the following properties:
(i) The family {η ξ } is measurable in the sense that for any function ϕ ∈ L ∞ (T P X) the integrals ⟨η ξ , ϕ⟩ depend on ξ measurably.
(ii) If K is a compact separable P -boundary, then for almost every sample path of the branching chain M = (M n ), the limit measure ⌢ κ M on K from Theorem 3.32 is the image q(η ξ ) of the measure η ξ , ξ = tail M, under the quotient map q : T P X → K.
(iii) In particular, if ∂X is the boundary of a Dirichlet regular compactification of the state space X, then the empirical distributions ⌢ M n almost surely weak* converge to the image of the measure η ξ , ξ = tail M, under the quotient map from the tail boundary T P X to ∂X.

Proof. We will construct the measures η ξ as the transition probabilities of the Markov operator B from Theorem 4.17.
We denote by L 1 (T P X) the Banach space of finite measures absolutely continuous with respect to the harmonic measure class on the tail boundary T P X of the underlying chain and endowed with the total variation norm. We emphasise that this space, in the same way as its dual L ∞ (T P X), is defined "coordinate free", just in terms of the harmonic measure class. If one takes a reference measure λ from this class, then the elements of L 1 (T P X) can be identified with their densities with respect to λ, after which L 1 (T P X) becomes the "usual" space L 1 (T P X, λ). Likewise, we denote by L 1 (T P M) the analogous space associated with the tail boundary of the branching Markov chain.
Being Markov, the operator B : L ∞ (T P X) → L ∞ (T P M) from Theorem 4.17 is dual to an operator

λ → λB , L 1 (T P M) → L 1 (T P X) ,

and, as we saw in the course of the proof of Theorem 4.17, formula (4.19),

(w•η m )B = η m for any m ∈ M ,

where η m on the right-hand side is the measure Σ x∈m η x on T P X.

For any initial probability measure λ ∈ L 1 (T P M) the operator B gives rise to the associated joint distribution λ B on the product T P M × T P X whose marginal distributions are the measures λ and λB. Since all involved measure spaces are Lebesgue spaces, the conditional measures η ξ , ξ ∈ T P M, on the fibres {ξ} × T P X ∼ = T P X of the projection T P M × T P X → T P M are well-defined and their dependence on ξ is measurable. Their system does not depend (mod 0) on the choice of λ and provides the transition probabilities that determine the operator B, so that λB = ∫ η ξ dλ(ξ) for any λ ∈ L 1 (T P M).
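The passage from a Markov operator to its transition probabilities can be illustrated on finite sets, where no measure-theoretic subtleties arise. The following is only a toy sketch under that assumption: the kernel rows play the role of the measures η ξ , and all function names are hypothetical.

```python
def pushforward(lam, kernel):
    """Dual action lam -> lam B, i.e. the mixture sum_xi lam[xi] * eta_xi."""
    n_cols = len(kernel[0])
    return [sum(l * row[y] for l, row in zip(lam, kernel))
            for y in range(n_cols)]


def joint_distribution(lam, kernel):
    """Joint law on pairs (xi, y): first marginal lam, conditionals eta_xi."""
    return [[l * p for p in row] for l, row in zip(lam, kernel)]


def conditional_measures(joint):
    """Conditional measures on the fibres {xi} x Y (rows of positive mass)."""
    return [[p / sum(row) for p in row] for row in joint if sum(row) > 0]


kernel = [[0.5, 0.5, 0.0],   # eta_xi for the first point xi
          [0.2, 0.3, 0.5]]   # eta_xi for the second point xi
lam = [0.25, 0.75]
joint = joint_distribution(lam, kernel)
print(pushforward(lam, kernel))
print(conditional_measures(joint))
```

In this toy setting the conditional measures of the joint distribution recover the rows of the kernel for every λ charging all points, which mirrors the independence (mod 0) of the system {η ξ } from the choice of λ.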
Claim (ii) then follows from Theorem 4.21. Indeed, since the operator B K (4.20) is the result of the precomposition of the operator B (4.18) with the lift L ∞ (K) → L ∞ (T P X) determined by the quotient map q : T P X → K, its transition probabilities are the q-images of the transition probabilities η ξ of the operator B. On the other hand, by Theorem 4.21 the operator B K (4.20) has the same transition probabilities as the operator B K (4.1), whereas the latter ones are, by definition, the measures ⌢ κ M .
Finally, claim (iii) now follows from Theorem 3.32.
Remark 4.23. Although the measure λB on T P X is absolutely continuous with respect to the harmonic measure class, the measures η ξ need not be absolutely continuous. An extreme example is provided by the situation when there is no branching (ρ = 1), and the branching Markov chain is reduced to an ordinary one (Example 1.17).
the existence of µ-stationary probability measures is guaranteed by the Krylov-Bogolyubov theorem applied to the map κ → µ * κ, see Furstenberg [Fur63a, Definition 1.2 and Lemma 1.2]. In terms of the transition operator P = P µ of the random walk on X determined by µ (cf. Example 1.24), the µ-stationarity of a measure κ is equivalent to the P -harmonicity of the family of translates κ x . The proof of Proposition 3.7 above follows the group case argument in Furstenberg [Fur71, Lemma 3.1 and the ensuing Corollary] which essentially goes back to Furstenberg [Fur63a, Lemma 1.3]; also see Woess [Woe96, Theorem 2.2], [Woe00, Theorem 20.3] (cf. Proposition 3.19 below).

of its edges e = [x, y], where as usual |x| = d(x, o) denotes the graph distance between x and a fixed root vertex o. The Floyd distance on X is the resulting path metric d f (x, y) = inf{ℓ f (π) : π is a finite path from x to y} .

Example 4.7. Consider the Galton-Watson process (Example 1.18). Let ρ > 1 be the mean of a non-degenerate offspring distribution π ∈ Meas(N) that satisfies the L log L moment condition. Then, assuming the non-extinction condition (NE) (as always in this paper), for almost every sample path of the Galton-Watson process (Z n ) there exists the limit W ∞ = lim n Z n / ρ^n > 0 (cf. Section 2.A and Section 4.D), which is clearly tail measurable. It was proved by Lootgieter [Loo77, Corollaire 2.3.II, Corollaire 3.3.II] that the limit W ∞ completely describes the tail behaviour of the Galton-Watson process with respect to any one-point initial distribution, or, in the aperiodic case, with respect to any initial distribution. (According to Cohn [Coh79, pp. 420-421], this result was also independently obtained by B. M. Brown in an unpublished 1977 manuscript "The tail σ-field of a branching process".) Therefore, under the L log L moment condition the tail boundary of the Galton-Watson process coincides with the product of the positive ray R + by the finite set of periodicity classes, or just with R + in the aperiodic case. The arising limit measure on R + (the distribution of W ∞ ) is actually absolutely continuous, and its support is the whole ray R + , see Athreya-Ney [AN72, Theorem I.10.4]. The time shift on the path space amounts to the multiplication of the limit W ∞ by ρ. The corresponding action of the group Z on R + is dissipative, and therefore its space of ergodic components (≡ the Poisson boundary of the Galton-Watson process) can be identified with the fundamental interval [1, ρ). This identification of the Poisson boundary of Galton-Watson processes was first obtained (in somewhat different terms) by Dubuc [Dub71, Theorem 2] under the finite second moment condition.
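The objects appearing in this example can be explored numerically. The sketch below is only an illustration under simplifying assumptions: the offspring law is a finite dictionary chosen for the example, the limit W ∞ is approximated by W n = Z n / ρ^n at a finite time, and the function names are hypothetical. The second function maps a positive value w to the representative of its orbit {w ρ^k : k ∈ Z} in the fundamental interval [1, ρ).

```python
import math
import random


def simulate_w(offspring, n_steps, rng):
    """Return W_n = Z_n / rho**n for one path started from a single particle.

    `offspring` maps an offspring number k to its probability pi(k).
    """
    values = list(offspring)
    weights = [offspring[k] for k in values]
    rho = sum(k * p for k, p in offspring.items())  # mean offspring number
    z = 1
    for _ in range(n_steps):
        # every particle of the current generation reproduces independently
        z = sum(rng.choices(values, weights=weights, k=z)) if z > 0 else 0
    return z / rho ** n_steps


def poisson_coordinate(w, rho):
    """Representative in [1, rho) of the orbit {w * rho**k : k in Z}, w > 0."""
    r = w / rho ** math.floor(math.log(w, rho))
    # floating point rounding may leave r just outside [1, rho)
    if r >= rho:
        r /= rho
    elif r < 1:
        r *= rho
    return r


rng = random.Random(0)
# offspring numbers 1 or 3 with equal probability: rho = 2, no extinction
w = simulate_w({1: 0.5, 3: 0.5}, 10, rng)
print(w, poisson_coordinate(w, 2.0))
```

By construction `poisson_coordinate` takes the same value on w and ρw, which is the finite-precision counterpart of passing to the ergodic components of the multiplication by ρ.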

Corollary 4.11. If condition (BR) is satisfied, then for any space-time P -harmonic function (f n ) on the state space X the lifts of the functions f n , multiplied by ρ^{-n}, form a space-time P-harmonic function on the population space M.
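The corollary can be checked in one line. The following verification is a sketch under two assumptions that are not spelled out in this section: that condition (BR) means that the branching ratio is constant, π x = ρ for all x, and that the transition operator of the branching chain acts on lifts as in the equivalence quoted at the beginning of Section 4.D. The symbols $\mathcal{P}$ (the transition operator of the branching chain) and $\tilde f$ (the lift of f) are notation chosen for this sketch only.

```latex
% Assumptions: \mathcal{P}\tilde f = \widetilde{\pi \cdot P f}, \pi \equiv \rho,
% and f_n = P f_{n+1} (space-time P-harmonicity of (f_n)).
\begin{align*}
\mathcal{P}\bigl(\rho^{-(n+1)}\,\tilde f_{n+1}\bigr)
  &= \rho^{-(n+1)}\,\mathcal{P}\tilde f_{n+1}
   = \rho^{-(n+1)}\,\widetilde{\rho\,P f_{n+1}} \\
  &= \rho^{-n}\,\widetilde{P f_{n+1}}
   = \rho^{-n}\,\tilde f_n .
\end{align*}
```

The first equality uses the linearity of the operator, the second the assumed action on lifts, and the last the relation f n = P f n+1 .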

Theorem 4.17. If the offspring distributions of a branching Markov chain with property (NE) satisfy the uniform L log L moment condition, then the map

B : f̂ → w f / w , L ∞ (T P X) → L ∞ (T P M) , (4.18)

is a Markov operator. Here w f /w is the function (4.15) on the tail boundary T P M that represents the limit empirical averages of the space-time harmonic function f = (f n ) determined by f̂ .
(4.14), ⟨m, f 0 ⟩ = ⟨η m , w f ⟩, where η m = tail(P m ) is the harmonic measure on the tail boundary T P M corresponding to the initial distribution δ m . In terms of the boundary function f̂ ∈ L ∞ (T P X) representing f we then have ⟨η m , f̂⟩ = ⟨w•η m , B f̂⟩ , (4.19) where η m = Σ x∈m η x , and w•η m is the measure on T P M with the density w with respect to the harmonic measure η m , so that ‖η m ‖ = ‖w•η m ‖ = ‖m‖.
In case all offspring distributions π x of a branching Markov chain coincide with a common distribution π, this in no way implies that offspring and displacement are independent. It just means that all branching distributions Π x have the form