Random walks on regular trees can not be slowed down

A random walk on a regular tree (or any non-amenable graph) has positive speed. We ask whether such a walk can be slowed down by applying carefully chosen time-dependent permutations of the vertices. We prove that on trees the random walk cannot be slowed down.


Introduction
One of the classical results relating the geometry of a space to the behaviour of random walks on it is that on any non-amenable graph the random walk has positive speed, in that lim inf_{t→∞} t^{−1}|X_t| exists and is a.s. positive. On transitive graphs the limit is even an almost sure constant. In particular, on the d-regular tree, denoted T_d, the speed of the simple random walk is (d−2)/d, which is positive as long as d > 2. The motivation for this paper is the question: can we slow down the particle?
Suppose that after each step t of the random walk we are allowed to apply some permutation π_t to the vertices of the tree, so that if the particle is at v it is transported to π_t(v). If we observe the particle and can choose π_t accordingly, then we can constantly push it back to any vertex we wish, so that it never moves. Our main finding is that if the permutations do not depend on the location of the particle, then the particle cannot be slowed down.

Permuted random walks
We start by considering lazy random walks, for which the results are cleaner, mostly for technical reasons (see the discussion below). We begin with some notation. Fix d ≥ 2, and let T = T_d denote the rooted infinite d-regular tree. The vertex set is denoted by V = V(T_d). The root of the tree is denoted by v_0. The depth |v| of a vertex v ∈ V is its distance from the root. The neighborhood N(v) of v is the set of vertices u that are at distance at most one from v. Note that since we are considering lazy random walks, it is convenient to have v ∈ N(v). Thus the size of N(v) is d + 1.
Let (X_t)_{t=0}^∞ be a lazy random walk on T started at the root. The laziness parameter P(X_{t+1} = X_t) is chosen to be 1/(d+1). That is, X_0 = v_0 and X_{t+1} is a uniformly random element of N(X_t). The (empirical) speed of (X_t) is defined to be the process (t^{−1}|X_t|). The strong law of large numbers implies that the speed a.s. converges to (d−2)/(d+1). Note that this also holds in the case d = 2, where T_2 is the line and the speed is 0. For more on random walks on trees see e.g. [5, 9] and references therein.

The model we suggest for studying the slowing down of particles is as follows. Before the particle starts to move, we choose a sequence (π_t)_{t=1}^∞ of permutations of V. (These do not need to be finitary; any bijections of V will do.) The permutation π_t is applied to the random walk at time t. Thus the permuted random walk (Y_t) starts at the root, and its position at time t + 1 is defined by Y_{t+1} = π_{t+1}(Y'_{t+1}), where Y'_{t+1} is a uniformly random vertex in N(Y_t). The (empirical) speed of (Y_t) is the process (t^{−1}|Y_t|). In contrast with (X_t), the permuted random walk may not have a limiting speed. The lower speed of the permuted random walk is defined as lim inf_{t→∞} t^{−1}|Y_t|.
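The depth process (|X_t|) described above is a biased walk on the non-negative integers, which makes the limiting speed (d−2)/(d+1) easy to check empirically. A minimal simulation sketch (simulating only the chain on depths, not the tree itself; function and parameter names are ours):

```python
import random

def lazy_depth_walk(d, steps, rng):
    """Simulate the depth process |X_t| of the lazy walk on T_d.

    From depth k > 0 the walk moves to depth k+1 with probability
    (d-1)/(d+1) (one of the d-1 children), to depth k-1 with
    probability 1/(d+1) (the parent), and stays put with probability
    1/(d+1).  At the root, all d neighbours increase the depth.
    """
    depth = 0
    for _ in range(steps):
        u = rng.random()
        if depth == 0:
            if u < d / (d + 1):
                depth += 1
        else:
            if u < (d - 1) / (d + 1):
                depth += 1
            elif u < d / (d + 1):
                depth -= 1
    return depth

rng = random.Random(0)
d, steps = 4, 200_000
speed = lazy_depth_walk(d, steps, rng) / steps
# law of large numbers: the empirical speed concentrates near (d-2)/(d+1)
assert abs(speed - (d - 2) / (d + 1)) < 0.02
```

The tolerance is generous (many standard deviations at this number of steps), so the check is robust to the choice of seed.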
Permuted random walks have been studied before, both in their own right and as a tool towards other ends. Pymar and Sousi [8] established uniform bounds on hitting times for permuted random walks on finite regular graphs. Ganguly and Peres [3] studied walks on an interval with a fixed uniform random permutation. Recently, Chatterjee and Diaconis [1, 2] demonstrated that the mixing of certain Markov chains can be significantly sped up by adding a deterministic permutation after each move. In a different direction, Gouëzel [4] used permuted random walks to establish large deviation lower bounds on the speed of random walks on hyperbolic spaces without moment assumptions on the step distribution. One idea there is to condition on the long steps of the walk, and to view the process as a permuted version of a walk with bounded steps, to which other methods apply. The question at the heart of this paper arose following a presentation of that work.
Our main result is that no matter how we select the permutations (π_t), the permuted walk (Y_t) is not slower than (X_t).
Theorem 1.1. For every d ≥ 2, every sequence (π_t) of permutations of V(T_d), and every time t ≥ 0, the depth |Y_t| of the permuted random walk stochastically dominates the depth |X_t| of the lazy random walk. That is, for all t, n ≥ 0,

P(|Y_t| ≥ n) ≥ P(|X_t| ≥ n).

In particular, E|Y_t| ≥ E|X_t| for all t ≥ 0, and lim inf_{t→∞} t^{−1}|Y_t| ≥ (d−2)/(d+1) almost surely.
Remark 1.2. The processes (X_t) and (Y_t) in Theorem 1.1 correspond to a lazy random walk that stays put with probability 1/(d+1). Theorem 1.1 holds verbatim (with the obvious change to the constant (d−2)/(d+1)) as long as the probability to stay put is at least 1/(d+1). In particular, it holds when the chance to stay put is one half, which is a more common definition of the lazy random walk. For details, see the remark after the proof of Theorem 1.1.
Note that the theorem is informative even for d = 2, where the limit speed is zero. However, the laziness is required for Theorem 1.1 to hold. Indeed, for the non-lazy walk on T_d we can have E|Y_1| < E|X_1| (or similarly for any other t).
Theorem 1.1 is a special case of a more general phenomenon, which we describe in the next two theorems. For a distribution p on V, define p* : N → [0, 1] by letting p*(j) be the total mass of the j largest atoms of p, or equivalently,

p*(j) = max{p(J) : J ⊆ V, |J| = j}.

We say that a distribution p majorizes a distribution q if p*(j) ≥ q*(j) for all j ∈ N.
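The definitions of p* and of majorization are easy to state operationally. A small sketch, representing finitely supported distributions by lists of integer masses to avoid floating-point comparisons (the function names are ours):

```python
def top_mass(weights, j):
    # total mass of the j largest atoms  (p*(j) in the text)
    return sum(sorted(weights, reverse=True)[:j])

def majorizes(p, q):
    # p majorizes q  iff  p*(j) >= q*(j) for every j
    assert sum(p) == sum(q)
    n = max(len(p), len(q))
    return all(top_mass(p, j) >= top_mass(q, j) for j in range(1, n + 1))

# a more concentrated distribution majorizes a more spread-out one
# (masses in units of, say, 1/20)
assert majorizes([10, 6, 4], [5, 5, 5, 5])
assert not majorizes([5, 5, 5, 5], [10, 6, 4])
```

Note that top_mass, and hence majorizes, is invariant under rearranging the atoms, which is the second observation used after Theorem 1.4 below.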
Denote by p_t the distribution of X_t and by q_t the distribution of Y_t, the latter depending implicitly on the fixed permutations (π_t). The stochastic domination asserted in Theorem 1.1 is a consequence of the following more technical statement. The main reasons are that the distribution p_t is spherically symmetric and monotone in depth; for more details, see Section 3.
Theorem 1.3. For every t ≥ 0, the distribution p_t majorizes q_t.

The fact that p_t majorizes q_t can be interpreted as saying that the amount of disorder in q_t is at most that of p_t. Concretely, the theorem implies that the Shannon entropy of Y_t is at most the Shannon entropy of X_t. There is no way to increase the entropy of a lazy random walk on a regular tree by applying time-dependent permutations.
A second interpretation of the theorem is that for every t there is a distribution r_t on permutations of V such that if σ_t is sampled from r_t independently of X_t, then (X_t, σ_t(X_t)) has the same distribution as (X_t, Y_t). In other words, there is a distribution on a single permutation σ_t that can replace the iterative application of the t permutations π_1, ..., π_t.
An even more general statement than Theorem 1.3 holds. Let B_n = {v ∈ V : |v| ≤ n} denote the ball of radius n ≥ −1 in the tree, and ∂B_n = B_n \ B_{n−1} the sphere of radius n. Fix an order v_0, v_1, v_2, ... of V with the following property: for every i < j, it holds that |v_i| ≤ |v_j|, and if |v_i| = |v_j| then the d − 1 children of v_i appear in the order before the d − 1 children of v_j. Initial segments of the form {v_0, v_1, ..., v_i} are called quasi-balls. Note that every ball is a quasi-ball. A distribution p on V is called greedily arranged if p(v_i) ≥ p(v_{i+1}) for every i.
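One concrete choice of such an order is breadth-first order, with each level listed in the order of the parents. A sketch (encoding vertices as tuples, with the root () having d children and every other vertex d − 1 children, is our own convention):

```python
def bfs_order(d, depth):
    """The order v0, v1, ... on the rooted d-regular tree, up to a depth.

    Levels come in increasing depth, and within a level the blocks of
    children follow the order of their parents, as the definition of
    quasi-balls requires.
    """
    order, level = [()], [()]
    for _ in range(depth):
        nxt = []
        for v in level:
            k = d if v == () else d - 1
            nxt.extend(v + (c,) for c in range(k))
        order.extend(nxt)
        level = nxt
    return order

def is_greedily_arranged(p, order):
    # p maps vertices to masses; they must be non-increasing along the order
    masses = [p.get(v, 0) for v in order]
    return all(a >= b for a, b in zip(masses, masses[1:]))

order = bfs_order(3, 3)
assert order[0] == () and len(order) == 1 + 3 + 6 + 12
# a spherically symmetric distribution decaying in depth is greedily arranged
p = {v: 100 - 10 * len(v) for v in order}
assert is_greedily_arranged(p, order)
```

Every prefix of `order` is then a quasi-ball, and prefixes of length 1 + 3 + 6, etc., are the balls B_n.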
Theorem 1.4. Let p and q be distributions on V and let p' and q' be the corresponding distributions after a single step of a lazy random walk started at p and q, respectively. If p is greedily arranged and majorizes q, then p' is greedily arranged and majorizes q'.

Theorems 1.1 and 1.3 follow from the last theorem by a simple inductive argument using the following two observations. First, the initial distribution p_0 is greedily arranged and majorizes q_0. Second, if a distribution p majorizes q, then it also majorizes any rearrangement of q (i.e., a distribution of the form q ∘ π for a permutation π of V). Thus Theorem 1.4 implies that for every t ≥ 0 and every finite J ⊂ V,

q_t(J) ≤ p_t(B),    (1.1)

where B is the quasi-ball of size |B| = |J|.

Non-lazy random walks
The last result is particular to lazy random walks on regular trees. For non-lazy walks it is too strong to be true. The distribution of a simple non-lazy random walk on a regular tree is not greedily arranged because the tree is bipartite; in particular, (1.1) may fail already for t = 1.
On the other hand, versions of the above theorems do hold for non-lazy walks, as we describe next. The limit speed of a simple (non-lazy) random walk on T_d is (d−2)/d a.s. As noted, for such random walks the same stochastic domination as in Theorem 1.1 does not hold. Nonetheless, we prove that it almost holds (at least for d > 2, when the tree is not the line).
In this section, abusing the earlier notation, denote by N(v) the set of d neighbors of v, not including v itself. Let (S_t) be a simple random walk, so that S_0 = v_0 and S_{t+1} is uniform in N(S_t). Let (Z_t) be a permuted simple random walk, so that Z_0 = v_0 and Z_{t+1} = π_{t+1}(Z'_{t+1}), where Z'_{t+1} is uniform in N(Z_t).
Theorem 1.5. For every d > 2, every sequence (π_t) of permutations of V(T_d), and every time t ≥ 1, we have that

For d = 2, the bound (d−2)/d on the lower speed of (Z_t) trivially holds, but the stronger claim in the theorem is false. One way to see this is to take π_t to be the identity up to some large time 2T, and then map via π_{2T} all even integers in the range [−2T, 2T] to all integers in [−T, T], so that E|Z_{2T}| = (1/2)·E|S_{2T}|.

We shall deduce Theorem 1.5 from the following modification of Theorem 1.4, which takes into account the periodicity of the non-lazy walk. The vertex set can be partitioned according to parity into V_0 = {v ∈ V : |v| = 0 mod 2} and V_1 = V \ V_0. A distribution p is called half-greedily arranged if it is supported on one of V_0 or V_1, and p(v_i) ≥ p(v_j) for every i < j for which v_i and v_j have the same parity (using the same ordering of V as above).
Theorem 1.6. Let p and q be distributions on V, and let p' and q' be the corresponding distributions after a single step of a non-lazy random walk started at p and q, respectively. If p is half-greedily arranged and majorizes q, then p' is half-greedily arranged and majorizes q'.
Although the distribution of S_t is not greedily arranged, it is half-greedily arranged. The theorem thus implies that the distribution of S_t majorizes that of Z_t for every t ≥ 0 (although the distribution of |S_t| does not necessarily majorize that of |Z_t|).

The speed process
Theorems 1.1 and 1.5 establish stochastic domination of the distance of a permuted random walk over the distance of a standard (lazy/simple) random walk at any particular time. It is natural to wonder whether such stochastic domination holds for the corresponding processes, i.e., whether the two processes can be coupled so that the distance of the permuted walk is always at least the distance of the standard walk. Somewhat surprisingly, it turns out this is not always possible. We focus on lazy random walks for concreteness. As an example, consider a sequence of permutations π in which π_1 and π_2 are the identity permutation and π_3 is an automorphism of T which maps a neighbor of v_0 to v_0. A direct computation yields that no coupling of (X_t) and (Y_t) satisfies |Y_t| ≥ |X_t| for all t ≤ 3 almost surely. When d = 2, this effect can be repeated and magnified over time. The next result shows that for certain choices of permutations, even translations, there are infinitely many times at which the distance of the permuted random walk is much smaller (no matter how the two processes are coupled).
Theorem 1.7. Fix d = 2. There exists a sequence of permutations (π_t) of V(T_2) ≅ Z, all of which are translations, such that in any coupling of the lazy random walk process (X_t) and the permuted random walk process (Y_t), almost surely,

lim sup_{t→∞} (|X_t| − |Y_t|) / ((4/3) t log log t)^{1/2} = 1.

When d > 2, on the other hand, we show that the above cannot occur (not even nearly) when the permutations are required to be automorphisms of T_d. This is the content of the result below. We do not know how strong this effect can be for general permutations. For instance, we do not know whether it is always possible to couple the two processes so that, almost surely, |X_t| ≤ |Y_t| for all large enough t.
Theorem 1.8. For every d > 2 and every sequence of automorphisms (π_t) of T_d, there exists a coupling of the lazy random walk process (X_t) and the permuted random walk process (Y_t) such that, almost surely, |Y_t| ≥ |X_t| + t^{1/2−o(1)} as t → ∞.
The theorem is interesting even when each π_t is the identity. It states that there is a way to couple two lazy random walks so that one is significantly more distant than the other. The result is tight in the sense that the o(1) term cannot be dropped entirely. Our proof gives a quantitative estimate for this term and yields that t^{1/2−o(1)} can be replaced with √t/(log^C t) for some constant C > 0. See Lemma 5.2 and the second remark following it.

A spectral argument
One natural approach towards proving the results above is via spectral methods (see [7] and references within). Specifically, the transition kernel on ℓ^2(V) is a contraction with norm ρ < 1, and the application of a permutation is an isometry of ℓ^2(V). Thus ‖q_t‖_2 ≤ ρ^t decays exponentially. A positive lower bound on the lower speed of Y_t follows easily. Moreover, this argument works on any non-amenable graph. However, the resulting bound on the speed is not sharp.
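The non-sharpness is easy to see numerically. Assuming Kesten's value 2√(d−1)/d for the ℓ²-norm of the simple random walk kernel on T_d, the lazy kernel (stay probability 1/(d+1)) has norm at most (1 + 2√(d−1))/(d+1), and combining ‖q_t‖_2 ≤ ρ^t with the growth |B_n| ≍ (d−1)^n gives a crude speed bound; the derivation of that bound is our own sketch:

```python
import math

def lazy_norm(d):
    # (1 + d * rho_simple) / (d+1), with rho_simple = 2*sqrt(d-1)/d
    return (1 + 2 * math.sqrt(d - 1)) / (d + 1)

def spectral_speed_bound(d):
    """Crude speed bound from the l2 argument.

    By Cauchy-Schwarz, q_t(B_n) <= sqrt(|B_n|) * rho^t, and |B_n| grows
    like (d-1)^n, so q_t(B_{alpha*t}) -> 0 whenever
    alpha < 2*log(1/rho)/log(d-1).
    """
    rho = lazy_norm(d)
    return 2 * math.log(1 / rho) / math.log(d - 1)

for d in range(3, 11):
    bound, true_speed = spectral_speed_bound(d), (d - 2) / (d + 1)
    # the spectral bound is positive but strictly below the true speed
    assert 0 < bound < true_speed
```

For example, for d = 3 this gives roughly 0.13, against the true speed 1/4.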
The proof of a spectral gap uses an isoperimetric inequality for the tree. Not surprisingly, our proofs also use isoperimetric inequalities; see Propositions 2.2 and 2.3 below. Proposition 2.3 is a non-standard isoperimetric inequality, which takes into account the number of "isolated" points in the set of interest. Proposition 2.1 is a significant generalization of the isoperimetric inequality, phrased in the language of majorization.

Isoperimetry
As noted, our arguments rely on isoperimetric properties of the tree.However, to get the strongest possible comparison between the permuted and regular random walks we need sharp isoperimetric inequalities, which we now proceed to prove.
Recall that N(v) is the neighborhood of a vertex v, including v itself. For J ⊂ V, the neighborhood of J is defined by N(J) = ∪_{v∈J} N(v). To analyze the behavior of the random walk, we need to understand the boundary in more detail. For J ⊂ V and i ∈ [d+1], define

K_i(J) = {v ∈ V : |N(v) ∩ J| ≥ i}.

In particular, the set K_1(J) is the neighborhood N(J).
A partition is a sequence µ = (µ_1, µ_2, ..., µ_ℓ) with µ_1 ≥ µ_2 ≥ ... ≥ µ_ℓ ≥ 0. Note that usually trailing 0's are omitted, but for us it is convenient for the length of the partitions to be fixed, so we may include 0's. The size of the partition is defined by |µ| = Σ_i µ_i. The dominance order on partitions is defined as follows. For partitions µ, λ, we write λ ≺ µ if |µ| = |λ| and

Σ_{i=1}^r λ_i ≤ Σ_{i=1}^r µ_i for every r ≥ 1.    (2.1)

The following majorization statement is an extension of the standard isoperimetric inequality for the tree.

Proposition 2.1. Let J ⊂ V be finite and let B be the quasi-ball with |B| = |J|. Let k_i = |K_i(J)| and m_i = |K_i(B)| for i ∈ [d+1]. Then (m_i) ≺ (k_i).

To prove this result, we need a couple of lemmas on the isoperimetric behavior of the tree. Let κ_1(J) denote the number of connected components induced by J. Let κ_2(J) denote the number of connected components induced by J in the graph in which edges are added between all pairs of vertices that are at distance 2 from each other in the tree. The first lemma is a formula for |N(J)| for general J.

Proposition 2.2. For every finite J ⊂ V,

|N(J)| = (d − 1)|J| + κ_1(J) + κ_2(J).

Proof. We prove the claim by induction on |J|. The base case |J| = 0 is trivial. Let J be non-empty. Let v ∈ J be a vertex of maximum depth in J. Let N_1 = N(v) ∩ (J \ {v}) and N_2 = N(N(v)) ∩ (J \ {v}). The following two equalities hold: The induction hypothesis implies It remains to show that The left-hand side equals So we need to show that By the choice of v, there are at most two vertices in N(v) ∩ N(J \ {v}): the vertex v and its parent.
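A formula of the shape |N(J)| = (d−1)|J| + κ_1(J) + κ_2(J) agrees with all small cases of Proposition 2.2 (single vertices, edges, sibling pairs, stars); reading the statement this way is our assumption, and it can be checked by brute force on a truncated tree, keeping J away from the truncation depth:

```python
import random

def truncated_tree(d, depth):
    """Adjacency lists of the rooted d-regular tree down to a given depth.

    Vertices are tuples: the root () has d children, others d-1 children.
    """
    adj, level = {(): []}, [()]
    for _ in range(depth):
        nxt = []
        for v in level:
            for c in range(d if v == () else d - 1):
                w = v + (c,)
                adj[w] = [v]
                adj[v].append(w)
                nxt.append(w)
        level = nxt
    return adj

def components(vertices, neighbours):
    # number of connected components of `vertices` under `neighbours`
    seen, count = set(), 0
    for s in vertices:
        if s in seen:
            continue
        count += 1
        stack = [s]
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            stack.extend(u for u in neighbours(v) if u in vertices)
    return count

def check_formula(d, depth, trials, rng):
    adj = truncated_tree(d, depth)
    interior = [v for v in adj if len(v) < depth]  # stay clear of the cut
    for _ in range(trials):
        J = set(rng.sample(interior, rng.randrange(1, 8)))
        N = set(J)
        for v in J:
            N.update(adj[v])
        k1 = components(J, lambda v: adj[v])
        # kappa_2: edges also between vertices at distance exactly 2
        k2 = components(J, lambda v: adj[v]
                        + [u for w in adj[v] for u in adj[w]])
        assert len(N) == (d - 1) * len(J) + k1 + k2

check_formula(3, 5, 200, random.Random(0))
check_formula(4, 4, 200, random.Random(1))
```

Restricting J to the interior guarantees that every vertex involved has the same degree as in the infinite tree, so the finite check is faithful.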
For the next lemma, we also need the following definitions. The sets of isolated points and connected points of J are defined by iso(J) = {v ∈ J : N(v) ∩ J = {v}} and con(J) = J \ iso(J).
Proposition 2.3. For every non-empty J ⊂ V,

Proof of Proposition 2.1. The fact that (k_i) and (m_i) are decreasing is obvious. These are partitions of the same size s = (d + 1)|J|. If |J| = 1 then the statement trivially holds, so we may assume |J| > 1. The choice of the order on V implies that there is n ≥ 0 such that B_n ⊆ B ⊆ B_{n+1}, where B_n is the ball of radius n. We can write |B| = |B_n| + a(d − 1) + c, where a, c are non-negative integers with c < d − 1.
The tree is simple enough that we can compute all the m_i's in terms of these: The case r = 1 of (2.1) now holds by Proposition 2.2: The case r = 2 is proved as follows. If k_2 = 0 then k_i = 0 for all i ≥ 2 and the proof is complete. On the other hand, if k_2 ≥ 1 then by Proposition 2.3, and because con(J) ⊆ K_2(J), For r ∈ {3, 4, ..., d}, proceed by induction. Because k_r ≥ k_{r+1} ≥ ... ≥ k_{d+1}, we have By induction, It follows that m_r = b_r. All k_i's and m_i's are integers, so the desired inequality follows.

Lazy random walks
The following proposition presents the key link between the isoperimetric inequality and the behavior of random walks.
Proposition 3.1. Let J ⊂ V be finite and let B be the quasi-ball with |B| = |J|. Let k_i = |K_i(J)| and m_i = |K_i(B)|. Then, for every distribution q on V,

Σ_{i∈[d+1]} q*(k_i) ≤ Σ_{i∈[d+1]} q*(m_i).
This may seem surprising until one realizes that q* can be any function on N that increases from 0 to 1 and is concave. The proof of Proposition 3.1 is based on the following majorization inequality, known as the Hardy-Littlewood-Pólya inequality and as Karamata's inequality, a version of which was first proved by Schur; see e.g. [6, Theorem 3.C.1]. Note that the definition of the dominance order λ ≺ µ extends verbatim to partitions of a real number with real instead of integer parts, so the statement applies also to non-integer dominated sequences. In our setting, the k_i and m_i are integers.

Theorem 3.2. Let I ⊂ R be an interval and let f : I → R be concave. If µ, λ ∈ I^t are two partitions such that µ ≺ λ, then Σ_i f(µ_i) ≥ Σ_i f(λ_i).

Proof of Proposition 3.1. To apply Theorem 3.2 and Proposition 2.1 we need to extend q* to a concave function. By construction, the function q* : N → [0, 1] is increasing and can be written as q*(j) = Σ_{i∈[j]} D(i), where D : N → [0, 1] is a decreasing function. Thus the extension of q* to R_+ by piecewise linear interpolation is increasing and concave.
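The majorization inequality can be sanity-checked numerically: a chain of transfers from larger parts to smaller parts produces a dominated partition, and a concave function then has the larger sum on the dominated side. A sketch (the generation scheme and names are ours):

```python
import random

def prefix_dominates(mu, lam):
    # mu dominates lam: same total, prefix sums of mu >= those of lam
    assert sum(mu) == sum(lam)
    s = 0
    for a, b in zip(mu, lam):
        s += a - b
        if s < 0:
            return False
    return True

def robin_hood(mu, moves, rng):
    """Random chain of transfers from a larger part to a smaller one;
    each transfer (and hence the chain) yields a dominated partition."""
    lam = sorted(mu, reverse=True)
    for _ in range(moves):
        i, j = rng.sample(range(len(lam)), 2)
        if lam[i] >= lam[j] + 2:
            lam[i] -= 1
            lam[j] += 1
            lam.sort(reverse=True)
    return lam

rng = random.Random(0)
mu = sorted((rng.randrange(0, 20) for _ in range(8)), reverse=True)
lam = robin_hood(mu, 30, rng)
assert prefix_dominates(mu, lam)
# Karamata: for concave f, the dominated partition has the larger f-sum
f = lambda x: x ** 0.5
assert sum(f(x) for x in lam) >= sum(f(x) for x in mu)
```

Here f = √· plays the role of the concave extension of q*.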
The following observation helps to establish the property that a distribution is greedily arranged.We are now ready to complete the proof of our main results.
Proof of Theorem 1.4. Let J ⊂ V and let B be a quasi-ball of the same size. For k_i = |K_i(J)| and m_i = |K_i(B)|, we have

(d+1) q'(J) = Σ_{i∈[d+1]} q(K_i(J)) ≤ Σ_{i∈[d+1]} q*(k_i)    (3.1)
            ≤ Σ_{i∈[d+1]} q*(m_i)    (3.2)
            ≤ Σ_{i∈[d+1]} p*(m_i)    (3.3)
            = Σ_{i∈[d+1]} p(K_i(B)) = (d+1) p'(B).    (3.4)

Here, the first and last equalities follow from the definition of the lazy random walk; (3.1) follows from the definition of q*; (3.2) follows from Proposition 3.1; (3.3) holds because p majorizes q; finally, (3.4) follows from Observation 3.3 and the assumption that p is greedily arranged.
For the set J that achieves q'*(s), where s = |J|, the above implies that q'*(s) ≤ p'(B) = p'*(s). The fact that p' is greedily arranged follows from Observation 3.3.
Proof of Theorem 1.1. Theorem 1.4 implies (1.1), and in particular p_t(B_n) ≥ q_t(B_n) for all n, t ≥ 0. In other words, |Y_t| stochastically dominates |X_t| for every t ≥ 0. This implies that E|Y_t| ≥ E|X_t|. It remains to show that lim inf_{t→∞} t^{−1}|Y_t| ≥ (d−2)/(d+1) almost surely. For every ε > 0, standard concentration bounds show that for some constants c, C > 0 and every t ≥ 0,

P(|X_t| ≤ ((d−2)/(d+1) − ε) t) ≤ C e^{−ct},

and by stochastic domination the same holds with Y_t instead of X_t. The Borel-Cantelli lemma completes the proof.

Remark 3.4. Theorem 1.4, and thus also Theorems 1.1 and 1.3, extends to the lazy random walk in which the probability to stay put is any γ ≥ 1/(d+1). The idea is that if q^γ is the result of a lazy random walk step with laziness γ applied to a distribution q, then for any γ > δ,

q^γ = ((γ−δ)/(1−δ)) q + ((1−γ)/(1−δ)) q^δ.

We apply this with γ > δ = 1/(d+1) to obtain the corresponding majorization for laziness γ.

Proof of Theorem 1.6. Let J ⊂ V and let B be the half-quasi-ball of the same size as J and with parity opposite to that of p. Let k_i = |K_i(J)| and m_i = |K_i(B)|. We have

d · q'(J) = Σ_{i∈[d]} q(K_i(J)) ≤ Σ_{i∈[d]} q*(k_i)    (4.1)
          ≤ Σ_{i∈[d]} q*(m_i)    (4.2)
          ≤ Σ_{i∈[d]} p*(m_i)    (4.3)
          = Σ_{i∈[d]} p(K_i(B)) = d · p'(B),    (4.4)

where the first and last equalities follow from the definition of the non-lazy random walk; (4.1) follows from the definition of q*; (4.2) follows from Proposition 4.2; (4.3) holds because p majorizes q; and (4.4) follows from Observation 4.3 and the assumption that p is half-greedily arranged. The result follows in the same way as in the proof of Theorem 1.4.
Proof of Theorem 1.5. Denote by p_t the distribution of S_t, and by q_t the distribution of Z_t. Since d > 2, we have

Exceptional times
In this section we consider the possible slow-down of a random walk on Z ≅ T_2 and on T_d for d > 2. While the domination of Theorem 1.1 still applies, we ask whether (π_t) may be chosen so that there are exceptional times at which |Y_t| is much smaller than |X_t|. We prove Theorem 1.7 on the existence of exceptional times of slowing down on Z. In contrast, we prove Theorem 1.8 on the non-existence of such times on T_d when d > 2 and the permutations are restricted to automorphisms. This section is mostly independent of the previous parts of the paper.

Exceptional times for Z
Proof of Theorem 1.7. The permutations (π_t) are all translations of Z. Consequently, the permutations commute not just with each other but also with the steps of the random walk. We shall define an integer sequence (ℓ_t), and define the permutations by π_t(x) = x + ℓ_{t−1} − ℓ_t (with ℓ_0 = 0). Thus the process (Y_t + ℓ_t) has the same law as the random walk (X_t). However, the coupling between the processes need not be such that Y_t = X_t − ℓ_t, even though that is one possible coupling.
To define (ℓ_t), let φ(t) denote the integer part of ((4/3) t log log t)^{1/2}. Let f(t) be a positive integer-valued non-decreasing function growing to infinity more slowly than φ(t). Let (b_j)_{j=0}^∞ be defined by b_0 = 1 and b_{j+1} = b_j + f(b_j) for all j ≥ 0. Let (ℓ_t) be defined by: ℓ_{b_j+i} is the integer part of φ(b_j) · (4i/f(b_j) − 2), for all j ≥ 0 and 0 ≤ i < f(b_j). Intuitively, for each j, the numbers of the form ℓ_{b_j+i} are uniformly and densely placed in the interval between −2φ(b_j) and 2φ(b_j).
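The sequence (ℓ_t) is explicit enough to compute. The sketch below picks f(t) ≈ log t and a large block start b, both our own illustrative choices, and verifies the density claim for one block:

```python
import math

def phi(t):
    # integer part of ((4/3) * t * log log t)^(1/2); needs t > e
    return int(math.sqrt((4 / 3) * t * math.log(math.log(t))))

def f(t):
    # a positive integer-valued non-decreasing function growing to
    # infinity more slowly than phi; log is a convenient stand-in
    return max(1, int(math.log(t)))

def ell_block(b):
    """Translations used between times b and b + f(b): integer parts of
    phi(b) * (4*i/f(b) - 2), sweeping [-2*phi(b), 2*phi(b)]."""
    return [int(phi(b) * (4 * i / f(b) - 2)) for i in range(f(b))]

b = 10_000
block = ell_block(b)
# the values lie in [-2*phi(b), 2*phi(b)] ...
assert all(-2 * phi(b) <= x <= 2 * phi(b) for x in block)
# ... and consecutive values are at most about 4*phi(b)/f(b) apart
gaps = [y - x for x, y in zip(block, block[1:])]
assert all(0 < g <= 4 * phi(b) / f(b) + 1 for g in gaps)
```

The gap bound 4φ(b)/f(b) is o(φ(b)), which is the "dense placement" the proof needs.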
Fix ε > 0 and consider the set T_ε of times t at which X_t ≥ (1 − ε)φ(t). By the law of the iterated logarithm for the lazy random walk (X_t), the set T_ε is almost surely infinite. By the same law, almost surely, the set T of times t at which |Y_t + ℓ_t| ≤ 1.5φ(t) contains all but finitely many positive integers.
Fix t ∈ T_ε ∩ T sufficiently large. Let j ≥ 0 be such that b_j ≤ t < b_{j+1}. We conclude that almost surely,

lim sup_{t→∞} (|X_t| − |Y_t|)/φ(t) ≥ 1.

Since lim sup_{t→∞} |X_t|/φ(t) = 1 almost surely, we have equality above.

No exceptional times for d > 2
We split the proof of Theorem 1.8 into two parts for readability, and in order to emphasize the missing piece for lifting the automorphism restriction.
where c', C' > 0 are constants that depend on p but not on n. Let m be the integer part of √n / log² n and consider the two intervals whenever i ≤ p(n+1). Thus, f(i) is increasing for i ∈ I_n, and f(i) ≤ f(i + m) for i ∈ J_n. It follows that there is a coupling such that The central limit theorem implies that P(B_n ∈ J_n) converges as n → ∞ to some positive constant c = c(p). Since f is bounded from above by C/√n for some constant C = C(p), we have that P(B_n ∈ I_n \ J_n) ≤ Cm/√n ≤ C/log² n. This completes the construction of a coupling between B_n and B'_n with the claimed properties.
The above coupling between B_n and B'_n is relevant because X_t has the same law as 2B_t − t. Consider the times t_n = 2^n for n ≥ 1. We construct the coupling between (X_t) and (X'_t) so that it is Markovian at these times. Fix n ≥ 1 and suppose we have already coupled (X_t)_{t≤t_n} and (X'_t)_{t≤t_n} in some manner (the coupling for n = 1 can be done arbitrarily). We now describe the (conditional) coupling between the processes in the time range (t_n, t_{n+1}]. This coupling only depends on X_{t_n} and X'_{t_n}. The law of (X_t − X_{t_n})_{t_n≤t≤t_{n+1}} and of (X'_t − X'_{t_n})_{t_n≤t≤t_{n+1}} is entirely independent of the past (conditioned on time t_n). These are two random walks of length t_{n+1} − t_n = 2^n, which we denote by (S_i)_{i=0}^{2^n} and (S'_i)_{i=0}^{2^n}. To couple these walks, we first couple the endpoints S_{2^n} and S'_{2^n} using the above coupling between B_{2^n} and B'_{2^n} (pushed forward by the map x → 2x − 2^n). Given the endpoints, we couple the walks so that S'_i ≥ S_i for all i when S'_{2^n} ≥ S_{2^n}, and arbitrarily otherwise. The former can be done by first sampling (S_i), and then uniformly choosing (1/2)(S'_{2^n} − S_{2^n}) coordinates i among those where the increment S_i − S_{i−1} equals −1 and setting the corresponding increments S'_i − S'_{i−1} to +1 (with all other increments remaining the same for both walks). This completes the description of the coupling between (X_t) and (X'_t).
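The increment-flipping step is easy to implement and test. A sketch for a plain ±1 walk, with a hypothetical helper name; given a target endpoint at least the realized one and of the same parity, flipping uniformly chosen −1 increments yields a walk that dominates the original pointwise:

```python
import random

def couple_to_endpoint(steps, target, rng):
    """Given the +-1 increments of a walk and a target endpoint >= its
    realized endpoint (same parity), flip (target - end)/2 uniformly
    chosen -1 increments to +1.  The new walk ends at `target` and its
    partial sums dominate the original ones pointwise."""
    end = sum(steps)
    assert target >= end and (target - end) % 2 == 0
    flips = (target - end) // 2
    down = [i for i, s in enumerate(steps) if s == -1]
    chosen = set(rng.sample(down, flips))
    return [(+1 if i in chosen else s) for i, s in enumerate(steps)]

rng = random.Random(0)
n = 1000
steps = [rng.choice([-1, 1]) for _ in range(n)]
target = sum(steps) + 10  # any larger endpoint of the same parity
steps2 = couple_to_endpoint(steps, target, rng)

# pointwise domination of the partial sums and the prescribed endpoint
s = s2 = 0
for a, b in zip(steps, steps2):
    s, s2 = s + a, s2 + b
    assert s2 >= s
assert s2 == target
```

By symmetry, flipping a uniform subset of the down-steps also produces the correct conditional law of the walk given the new endpoint, which is what the construction requires.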
It remains to check that the constructed coupling has the claimed property. Let ∆_n = X_{t_{n+1}} − X_{t_n} and ∆'_n = X'_{t_{n+1}} − X'_{t_n}. Define events Since F_n has probability at most C'/n², only finitely many of the F_n occur almost surely.
Let N_1 be the smallest positive integer such that F_n does not occur for any n ≥ N_1.
Observe that X'_t − X_t is non-decreasing for t ≥ t_{N_1}. Since {E_n}_{n=1}^∞ are independent events, each of probability at least c', infinitely many of them occur almost surely. Moreover, almost surely, for any n large enough, at least one of E_{n−1}, E_{n−2}, ..., E_{n−C'' log n} occurs, where C'' > 0 is some large constant. Let N_2 be the smallest positive integer such that this holds for all n ≥ N_2. Observe that if n − C'' log n ≥ max{N_1, N_2} and t_n ≤ t ≤ t_{n+1}, then

We can write |B| = |B_n| + a(d − 1) + c, where a, c are non-negative integers with c < d − 1. Analyze the different K_i(B)'s as follows. The set K_1(B) = N(B) contains B_{n+1} and some of the smallest elements of ∂B_{n+2}. The set K_2(B) is equal to B. For i ∈ {3, ..., c + 2}, the set K_i(B) contains B_{n−1} and the a + 1 smallest elements of ∂B_n. For i ∈ {c + 3, ..., d + 1}, the set K_i(B) contains B_{n−1} and the a smallest elements of ∂B_n.

(4.7) holds because |B_n| ≤ |B_{n+1}|; and (4.8) holds because p_t is half-greedily arranged, and because B_{n+1} ∪ B_{n+2} ⊆ B_{n+2}. The rest of the proof proceeds in a similar manner to the proof of Theorem 1.1.