Functional CLT for random walk among bounded random conductances

We consider the nearest-neighbor simple random walk on $\mathbb{Z}^d$, $d \ge 2$, driven by a field of i.i.d. random nearest-neighbor conductances $\omega_{xy} \in [0,1]$. Apart from the requirement that the bonds with positive conductances percolate, we pose no restriction on the law of the $\omega$'s. We prove that, for a.e. realization of the environment, the path distribution of the walk converges weakly to that of non-degenerate, isotropic Brownian motion. The quenched functional CLT holds despite the fact that the local CLT may fail in $d \ge 5$ due to anomalously slow decay of the probability that the walk returns to the starting point at a given time.


Introduction
Let $\mathbb{B}_d$ denote the set of unordered nearest-neighbor pairs (i.e., edges) of $\mathbb{Z}^d$ and let $(\omega_b)_{b\in\mathbb{B}_d}$ be i.i.d. random variables with $\omega_b\in[0,1]$. We will refer to $\omega_b$ as the conductance of the edge $b$. Let $\mathbb{P}$ denote the law of the $\omega$'s and suppose that
$$\mathbb{P}(\omega_b>0)>p_c(d), \tag{1.1}$$
where $p_c(d)$ is the threshold for bond percolation on $\mathbb{Z}^d$; in $d=1$ we have $p_c(d)=1$, so there we suppose $\omega_b>0$ a.s. This condition ensures the existence of a unique infinite connected component $\mathscr{C}_\infty$ of edges with strictly positive conductances; we will typically restrict attention to $\omega$'s for which $\mathscr{C}_\infty$ contains a given site (e.g., the origin).
Each realization of $\mathscr{C}_\infty$ can be used to define a random walk $X=(X_n)$ which moves about $\mathscr{C}_\infty$ by picking, at each unit time, one of the $2d$ neighbors of its current position at random and passing to it with probability equal to the conductance of the corresponding edge. Technically, $X$ is a Markov chain with state space $\mathscr{C}_\infty$ and the transition probabilities defined by
$$P_{\omega,z}(X_{n+1}=y\,|\,X_n=x):=\frac{\omega_{xy}}{2d} \tag{1.2}$$
if $x,y\in\mathscr{C}_\infty$ and $|x-y|=1$, and
$$P_{\omega,z}(X_{n+1}=x\,|\,X_n=x):=1-\frac{1}{2d}\sum_{y\colon|y-x|=1}\omega_{xy}. \tag{1.3}$$
The second index on $P_{\omega,z}$ marks the initial position of the walk, i.e.,
$$P_{\omega,z}(X_0=z):=1.$$
The counting measure on $\mathscr{C}_\infty$ is invariant and reversible for this Markov chain.
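The transition rule (1.2)-(1.3) can be illustrated by a short sketch. The function names and the representation of the environment as a symmetric callable `conductance(x, y)` with values in [0, 1] are assumptions made here for illustration; they are not part of the paper.

```python
import random

def step_probabilities(x, conductance, d):
    """One-step law of the walk from site x (a tuple of d integers).

    Each of the 2d neighbors y gets probability conductance(x, y) / (2d),
    as in (1.2); the leftover mass is the probability of staying put,
    as in (1.3). `conductance` is a hypothetical symmetric function.
    """
    probs = {}
    total = 0.0
    for i in range(d):
        for sign in (1, -1):
            y = x[:i] + (x[i] + sign,) + x[i + 1:]
            w = conductance(x, y)
            probs[y] = w / (2 * d)
            total += w
    probs[x] = 1.0 - total / (2 * d)  # lazy step
    return probs

def step(x, conductance, d, rng=random):
    """Sample the next position of the walk."""
    probs = step_probabilities(x, conductance, d)
    sites = list(probs)
    return rng.choices(sites, weights=[probs[s] for s in sites])[0]
```

Note that an edge with zero conductance is never crossed, so a walk started on the infinite cluster stays on it.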
The d = 1 walk is a simple, but instructive, exercise in harmonic analysis of reversible random walks in random environments. Let us quickly sketch the proof of the fact that, for a.e. ω sampled from a translation-invariant, ergodic law on (0, 1] B d satisfying is harmonic for the Markov chain. Hence ϕ ω (X n ) is a martingale whose increments are, by (1.5) and a simple calculation, square integrable in the sense EE ω,0 ϕ ω (X 1 ) 2 < ∞. (1.7) Invoking the stationarity and ergodicity of the Markov chain on the space of environments "from the point of view of the particle" -we will discuss the specifics of this argument later -the martingale (ϕ ω (X n )) satisfies the conditions of the Lindeberg-Feller martingale functional CLT and so the law of t → ϕ ω (X ⌊nt⌋ )/ √ n tends weakly to that of a Brownian motion with diffusion constant given by (1.7). By the Pointwise Ergodic Theorem and (1.5) we have ϕ ω (x) − x = o(x) as |x| → ∞. Thus the path t → X ⌊nt⌋ / √ n scales, in the limit n → ∞, to the same function as the deformed path t → ϕ ω (X ⌊nt⌋ )/ √ n. As this holds for a.e. ω, we have proved a quenched functional CLT.
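The harmonicity underlying the $d=1$ argument can be checked numerically. The display (1.6) is not reproduced above, so the sketch below assumes the standard one-dimensional choice in which the increments of the harmonic coordinate are proportional to $1/\omega$; the empirical mean of $1/\omega$ stands in for the normalizing constant, and all names are illustrative.

```python
def harmonic_coordinate(omega):
    """Harmonic coordinate on {0, ..., N} for a finite stretch of Z.

    omega[i] > 0 is the conductance of the edge (i, i+1). The increment
    phi(i+1) - phi(i) is taken proportional to 1/omega[i] (an assumed
    form), normalized so that phi(N) = N.
    """
    inv = [1.0 / w for w in omega]
    c = sum(inv) / len(inv)  # empirical stand-in for the mean of 1/omega
    phi = [0.0]
    for r in inv:
        phi.append(phi[-1] + r / c)
    return phi

def generator(phi, omega, x):
    """(L phi)(x) = sum over neighbors y of (omega_xy / 2) (phi(y) - phi(x))."""
    right = omega[x] * (phi[x + 1] - phi[x])
    left = omega[x - 1] * (phi[x - 1] - phi[x])
    return 0.5 * (right + left)
```

The generator vanishes at every interior site because the "current" $\omega_{x,x+1}(\varphi(x+1)-\varphi(x))$ is the same across all edges, which is exactly what makes $\varphi_\omega(X_n)$ a martingale.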
While the main ideas of the above $d=1$ solution work in all dimensions, the situation in $d\ge2$ is, even for i.i.d. conductances, significantly more complicated. Progress has been made under additional conditions on the environment law. One such condition is strong ellipticity,
$$\exists\alpha>0\colon\quad\mathbb{P}(\alpha\le\omega_b\le1/\alpha)=1.$$
Here an annealed invariance principle was proved by Kipnis and Varadhan [20] and its quenched counterpart by Sidoravicius and Sznitman [28]. Another natural family of environments consists of those arising from supercritical bond percolation on $\mathbb{Z}^d$, for which $(\omega_b)$ are i.i.d. zero-one valued with $\mathbb{P}(\omega_b=1)>p_c(d)$. For these cases an annealed invariance principle was proved by De Masi, Ferrari, Goldstein and Wick [11; 12] and the quenched case was established in $d\ge4$ by Sidoravicius and Sznitman [28], and in all $d\ge2$ by Berger and Biskup [6] and Mathieu and Piatnitski [25].
A common feature of the latter proofs is that, in $d\ge3$, they require the use of heat-kernel upper bounds of the form
$$P_{\omega,x}(X_n=y)\le\frac{c_1}{n^{d/2}}\exp\Bigl\{-c_2\frac{|x-y|^2}{n}\Bigr\},\qquad x,y\in\mathscr{C}_\infty, \tag{1.9}$$
where $c_1,c_2$ are absolute constants and $n$ is assumed to exceed a random quantity depending on the environment in the vicinity of $x$ and $y$. These were deduced by Barlow [2] using sophisticated arguments that involve isoperimetry, regular volume growth and comparison of graph-theoretical and Euclidean distances for the percolation cluster. While the use of (1.9) is conceptually rather unsatisfactory (one seems to need a local-CLT level of control to establish a plain CLT), no arguments in $d\ge3$ that avoid heat-kernel estimates are known at present.
The reliance on heat-kernel bounds also suffers from another problem: (1.9) may actually fail once the conductance law has sufficiently heavy tails at zero. This was first noted to happen by Fontes and Mathieu [14] for the heat-kernel averaged over the environment; the quenched situation was analyzed recently by Berger, Biskup, Hoffman and Kozma [7]. The main conclusion of [7] is that the diagonal (i.e., $x=y$) bound in (1.9) holds in $d=2,3$ but the decay can be slower than any $o(n^{-2})$ sequence in $d\ge5$. (The threshold sequence in $d=4$ is presumably $o(n^{-2}\log n)$.) This is caused by the existence of traps that may capture the walk for a long time and thus, paradoxically, increase its chances of returning to the starting point.
The aforementioned facts lead to a natural question: In the absence of heat-kernel estimates, does the quenched CLT still hold? Our answer to this question is affirmative and constitutes the main result of this note. Another interesting question is what happens when the conductances are unbounded from above; this is currently being studied by Barlow and Deuschel [3].
Note: While this paper was being written, we received a preprint from Pierre Mathieu [24] in which he proves a continuous-time version of our main theorem.
The strategy of [24] differs from ours by the consideration of a time-changed process (which we use only marginally) and proving that the "new" and "old" time scales are commensurate. Our approach is focused on proving the (pointwise) sublinearity of the corrector and it streamlines considerably the proof of [6] in d ≥ 3 in that it limits the use of "heat-kernel technology" to a uniform bound on the heat-kernel decay (implied by isoperimetry) and a diffusive bound on the expected distance travelled by the walk (implied by regular volume growth).

Main results and outline
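Let $\Omega:=[0,1]^{\mathbb{B}_d}$ be the set of all admissible random environments and let $\mathbb{P}$ be an i.i.d. law on $\Omega$. Assuming (1.1), let $\mathscr{C}_\infty$ denote the a.s. unique infinite connected component of edges with positive conductances and introduce the conditional measure
$$\mathbb{P}_0(-):=\mathbb{P}(-\,|\,0\in\mathscr{C}_\infty).$$
For $T>0$, let $(C[0,T],\mathscr{W}_T)$ be the space of continuous functions $f\colon[0,T]\to\mathbb{R}^d$ equipped with the Borel σ-algebra defined relative to the supremum topology.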
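Here is our main result:
Theorem 2.1 Suppose $d\ge2$ and $\mathbb{P}(\omega_b>0)>p_c(d)$. For $\omega\in\{0\in\mathscr{C}_\infty\}$, let $(X_n)_{n\ge0}$ be the random walk with law $P_{\omega,0}$ and let
$$B_n(t):=\frac{1}{\sqrt{n}}\Bigl(X_{\lfloor tn\rfloor}+(tn-\lfloor tn\rfloor)\bigl(X_{\lfloor tn\rfloor+1}-X_{\lfloor tn\rfloor}\bigr)\Bigr),\qquad t\ge0. \tag{2.2}$$
Then for all $T>0$ and for $\mathbb{P}_0$-almost every $\omega$, the law of $(B_n(t)\colon0\le t\le T)$ on $(C[0,T],\mathscr{W}_T)$ converges, as $n\to\infty$, weakly to the law of an isotropic Brownian motion $(B_t\colon0\le t\le T)$ with a positive and finite diffusion constant (which is independent of $\omega$).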
Using a variant of [6,Lemma 6.4], from here we can extract a corresponding conclusion for the "agile" version of our random walk (cf [6, Theorem 1.2]) by which we mean the walk that jumps from x to its neighbor y with probability ω xy /π ω (x) where π ω (x) is the sum of ω xz over all of the neighbors z of x. Replacing discrete times by sums of i.i.d. exponential random variables, these invariance principles then extend also to the corresponding continuous-time processes. Theorem 2.1 of course implies also an annealed invariance principle, which is the above convergence for the walk sampled from the path measure integrated over the environment.
Remark 2.2 As we were reminded by Y. Peres, the above functional CLT automatically implies the "usual" lower bound on the heat-kernel. Indeed, the Markov property and reversibility of X yield Cauchy-Schwarz then gives with C(ω) > 0 a.s. on the set {0 ∈ C ∞ }. Note that, in d = 2, 3, this complements nicely the "universal" upper bounds derived in [7].
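The two displays referred to in Remark 2.2 are not reproduced above. A sketch of the standard computation they presumably express is as follows; the choice of the box $\Lambda_n$ of radius $\sqrt{n}$ is an assumption made here for concreteness.

```latex
% Reversibility w.r.t. counting measure gives
% P_{\omega,x}(X_n = 0) = P_{\omega,0}(X_n = x), so by the Markov property
\[
P_{\omega,0}(X_{2n}=0)
  =\sum_{x\in\mathscr{C}_\infty}P_{\omega,0}(X_n=x)\,P_{\omega,x}(X_n=0)
  =\sum_{x\in\mathscr{C}_\infty}P_{\omega,0}(X_n=x)^2
  \ge\sum_{x\in\Lambda_n}P_{\omega,0}(X_n=x)^2,
\]
% where \Lambda_n := \mathscr{C}_\infty \cap [-\sqrt{n},\sqrt{n}]^d (assumed choice).
% Cauchy-Schwarz applied to the sum over \Lambda_n then gives
\[
P_{\omega,0}(X_{2n}=0)
  \ge\frac{P_{\omega,0}(X_n\in\Lambda_n)^2}{|\Lambda_n|}
  \ge\frac{C(\omega)}{n^{d/2}},
\]
% because |\Lambda_n| \le (2\sqrt{n}+1)^d and, by the functional CLT,
% P_{\omega,0}(X_n \in \Lambda_n) converges to a positive limit.
```

The only inputs are reversibility with respect to the counting measure and the CLT at a single time, which is why the lower bound comes for free once Theorem 2.1 is known.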
The remainder of this paper is devoted to the proof of Theorem 2.1. The main line of attack is similar to the above $d=1$ solution: We define a harmonic coordinate $\varphi_\omega$, an analogue of (1.6), and then prove an a.s. invariance principle for $t\mapsto\varphi_\omega(X_{\lfloor nt\rfloor})/\sqrt{n}$ along the martingale argument sketched before. The difficulty comes with showing the sublinearity of the corrector,
$$\varphi_\omega(x)-x=o(|x|),\qquad|x|\to\infty.$$
As in Berger and Biskup [6], sublinearity can be proved directly along coordinate directions by soft ergodic-theory arguments. The crux is to extend this to a bound throughout $d$-dimensional boxes.
Following the $d\ge3$ proof of [6], the bound along coordinate axes readily implies sublinearity on average, meaning that the set of $x$ where $|\varphi_\omega(x)-x|$ exceeds $\epsilon|x|$ has zero density. The extension of sublinearity on average to pointwise sublinearity is the main novel part of the proof which, unfortunately, still makes non-trivial use of the "heat-kernel technology." A heat-kernel upper bound of the form (1.9) would do but, to minimize the extraneous input, we show that it suffices to have a diffusive bound for the expected displacement of the walk from its starting position. This step still requires detailed control of isoperimetry and volume growth as well as the comparison of the graph-theoretic and Euclidean distances, but it avoids many spurious calculations that are needed for the full-fledged heat-kernel estimates.
Of course, the required isoperimetric inequalities may not be true on C ∞ because of the presence of "weak" bonds. As in [7] we circumvent this by observing the random walk on the set of sites that have a connection to infinity by bonds with uniformly positive conductances. Specifically we pick α > 0 and let C ∞,α denote the set of sites in Z d that are connected to infinity by a path whose edges obey ω b ≥ α. Here we note: and then C ∞,α is nonempty and C ∞ \ C ∞,α has only finite components a.s. In fact, if F (x) is the set of sites (possibly empty) in the finite component of C ∞ \ C ∞,α containing x, then

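$$\mathbb{P}\bigl(\operatorname{diam}F(x)\ge n\bigr)\le Ce^{-\eta n},\qquad n\ge1, \tag{2.10}$$
for some $C<\infty$ and $\eta>0$. Here "diam" is the diameter in the $\ell^\infty$ distance on $\mathbb{Z}^d$.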
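To make the definition of the "strong" component concrete, here is a finite-volume sketch. Since "connected to infinity" cannot be tested on a computer, the code uses the crude stand-in "connected to the boundary of a box by edges with conductance at least α"; this proxy, the function names, and the callable `omega(x, y)` are all assumptions for illustration.

```python
from collections import deque
from itertools import product

def neighbors(x, n):
    """Nearest neighbors of x inside the box {0, ..., n-1}^d."""
    for i in range(len(x)):
        for s in (1, -1):
            c = x[i] + s
            if 0 <= c < n:
                yield x[:i] + (c,) + x[i + 1:]

def strong_sites(n, omega, alpha, d=2):
    """Sites of the box joined to its boundary by edges with omega >= alpha.

    Breadth-first search started from all boundary sites at once; only
    edges whose conductance is at least alpha are traversed.
    """
    start = [x for x in product(range(n), repeat=d)
             if any(c in (0, n - 1) for c in x)]
    seen = set(start)
    queue = deque(start)
    while queue:
        x = queue.popleft()
        for y in neighbors(x, n):
            if y not in seen and omega(x, y) >= alpha:
                seen.add(y)
                queue.append(y)
    return seen
```

Sites of the box missing from the returned set form the finite "holes" whose diameters the exponential bound above is meant to control.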
The restriction of ϕ ω to C ∞,α is still harmonic, but with respect to a walk that can "jump the holes" of C ∞,α . A discrete-time version of this walk was utilized heavily in [7]; for the purposes of this paper it will be more convenient to work with its continuous-time counterpart Y = (Y t ) t≥0 . Explicitly, sample a path of the random walk X = (X n ) from P ω,0 and denote by T 1 , T 2 , . . . the time intervals between successive visits of X to C ∞,α . These are defined recursively by with T 0 = 0. For each x, y ∈ C ∞,α , let and define the operator The continuous-time random walk Y is a Markov process with this generator; alternatively take the standard Poisson process (N t ) t≥0 with jump-rate one and set (2.14) Note that, while Y may jump "over the holes" of C ∞,α , Proposition 2.3 ensures that all of its jumps are finite. The counting measure on C ∞,α is still invariant for this random walk, L (α) ω is self-adjoint on the corresponding space of square integrable functions and L (α) The skeleton of the proof is condensed into the following statement whose proof, and adaptation to the present situation, is the main novel part of this note: Theorem 2.4 Fix α as in (2.8-2.9) and suppose ψ ω : C ∞,α → R d is a function and θ > 0 is a number such that the following holds for a.e. ω: (2) (Sublinearity on average) For every ǫ > 0, Let Y = (Y t ) be the continuous-time random walk on C ∞,α with generator L (α) ω and suppose in addition: (4) (Diffusive upper bounds) For a deterministic sequence b n = o(n 2 ) and a.e. ω, Then for almost every ω, This result -with ψ ω playing the role of the corrector -shows that This readily extends to sublinearity on C ∞ by the maximum principle applied to ϕ ω on the finite components of C ∞ \ C ∞,α and using that the component sizes obey a polylogarithmic upper bound. 
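The passage from $X$ to the walk observed only on the strong component, described at the start of the preceding paragraph, amounts to recording the visits of a sample path to a subset together with the waiting times between them. A minimal sketch, with a hypothetical function name and with the path and subset passed as plain Python containers:

```python
def trace_on(path, subset):
    """Restrict a discrete-time path to its visits to `subset`.

    Returns (visits, gaps): `visits` lists the successive positions of the
    path that lie in `subset`, and `gaps` lists the numbers of steps
    between consecutive visits (the T_1, T_2, ... of the text when the
    path starts inside `subset`). A lazy step that stays at a site of
    `subset` counts as a visit with gap 1.
    """
    visits, gaps, last = [], [], None
    for n, x in enumerate(path):
        if x in subset:
            if last is not None:
                gaps.append(n - last)
            visits.append(x)
            last = n
    return visits, gaps
```

Reading `visits` at an index given by an independent rate-one Poisson count would then presumably give the continuous-time walk $Y$ described in the text.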
The assumptions (1-3) are known to hold for the corrector of the supercritical bond-percolation cluster and the proof applies, with minor modifications, to the present case as well. The crux is to prove (2.17-2.18) which is where we need to borrow ideas from the "heat-kernel technology." For our purposes it will suffice to take b n = n in part (4).
We remark that the strategy of proof outlined above extends rather seamlessly to other (translation-invariant, ergodic) conductance distributions with conductances bounded from above. Of course, one has to assume a number of specific properties for the "strong" component $\mathscr{C}_{\infty,\alpha}$ that, in the i.i.d. case, we are able to check explicitly.
The plan of the rest of this paper is as follows: Sect. 3 is devoted to some basic percolation estimates needed in the rest of the paper. In Sect. 4 we define and prove some properties of the corrector $\chi$, which is a random function marking the difference between the harmonic coordinate $\varphi_\omega(x)$ and the geometric coordinate $x$. In Sect. 5 we establish the a.s. sublinearity of the corrector as stated in Theorem 2.4 subject to the diffusive bounds (2.17-2.18). Then we assemble all facts into the proof of Theorem 2.1. Finally, in Sect. 6 we adapt some arguments from Barlow [2] to prove (2.17-2.18); first in rather general Propositions 6.1 and 6.2 and then for the case at hand.

Percolation estimates
In this section we provide a proof of Proposition 2.3 and also of a lemma dealing with the maximal distance the random walk Y can travel in a given number of steps. We will need to work with the "static" renormalization (cf Grimmett [17,Section 7.4]) whose salient features we will now recall. The underlying ideas go back to the work of Kesten and Zhang [19], Grimmett and Marstrand [18] and Antal and Pisztora [1].
We say that an edge $b$ is occupied if $\omega_b>0$. Consider the lattice cubes
$$B_L(x):=x+[0,L]^d\cap\mathbb{Z}^d\quad\text{and}\quad\tilde B_{3L}(x):=x+[-L,2L]^d\cap\mathbb{Z}^d$$
and note that $\tilde B_{3L}(x)$ consists of $3^d$ copies of $B_L(x)$ that share only sites on their adjacent boundaries. Let $G_L(x)$ be the "good event", whose occurrence designates $B_L(Lx)$ to be a "good block", which is the set of configurations such that:
(1) For each neighbor $y$ of $x$, the side of the block $B_L(Ly)$ adjacent to $B_L(Lx)$ is connected to the opposite side of $B_L(Ly)$ by an occupied path.
(2) Any two occupied paths connecting $B_L(Lx)$ to the boundary of $\tilde B_{3L}(Lx)$ are connected by an occupied path using only edges with both endpoints in $\tilde B_{3L}(Lx)$.
The sheer existence of the infinite cluster implies that (1) occurs with high probability once $L$ is large (see Grimmett [17, Theorem 8.97]) while the situation in (2) occurs with large probability once there is percolation in half space (see Grimmett [17, Lemma 7.89]). It follows that $\mathbb{P}(G_L(x))\to1$ as $L\to\infty$. Moreover, if $G_L(x)$ and $G_L(y)$ occur for neighboring sites $x,y\in\mathbb{Z}^d$, then the largest connected components in $\tilde B_{3L}(Lx)$ and $\tilde B_{3L}(Ly)$, sometimes referred to as spanning clusters, are connected. Thus, if $G_L(x)$ occurs for all $x$ along an infinite path on $\mathbb{Z}^d$, the corresponding spanning clusters are subgraphs of $\mathscr{C}_\infty$.
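A minor complication is that the events $\{G_L(x)\colon x\in\mathbb{Z}^d\}$ are not independent. However, they are finitely dependent and, since their probability tends to one as $L\to\infty$, their indicators $\{1_{G_L(x)}\colon x\in\mathbb{Z}^d\}$, regarded as a random process on $\mathbb{Z}^d$, stochastically dominate i.i.d. Bernoulli random variables whose density (of ones) tends to one as $L\to\infty$.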
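To handle general dimensions we will have to invoke the above "static" renormalization. Let $G_L(x)$ be as above and consider the event $G_{L,\alpha}(x)$ where we in addition require that $\omega_b\notin(0,\alpha)$ for every edge with both endpoints in $\tilde B_{3L}(Lx)$. Clearly,
$$\lim_{\alpha\downarrow0}\mathbb{P}\bigl(G_{L,\alpha}(x)\bigr)=\mathbb{P}\bigl(G_L(x)\bigr).$$
Using the aforementioned domination by site percolation, and adjusting $L$ and $\alpha$, we can thus ensure that, with probability one, the set
$$\bigl\{x\in\mathbb{Z}^d\colon G_{L,\alpha}(x)\text{ occurs}\bigr\} \tag{3.5}$$
has a unique infinite component $\mathbb{C}_\infty$, whose complement has only finite components. Moreover, if $G(0)$ is the finite connected component of $\mathbb{Z}^d\setminus\mathbb{C}_\infty$ containing the origin, then a standard Peierls argument yields
$$\mathbb{P}\bigl(\operatorname{diam}G(0)\ge n\bigr)\le e^{-\zeta n} \tag{3.6}$$
for some $\zeta>0$. To prove (2.10), it suffices to show that
$$F(0)\subset\bigcup_{x\in G(0)}B_L(Lx) \tag{3.7}$$
once $\operatorname{diam}F(0)>3L$. Indeed, then $\operatorname{diam}F(0)\le L(\operatorname{diam}G(0)+1)$ and so (3.6) implies (2.10) with $\eta:=\zeta/L$ and $C:=e^{3L\eta}$.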
To prove (3.7), pick $z\in F(0)$ and let $x$ be such that $z\in B_L(Lx)$. It suffices to show that if $G_{L,\alpha}(x)$ occurs, then $x$ is not adjacent to an infinite component in (3.5). Assuming that $x$ is adjacent to such a component, the fact that the spanning clusters in adjacent "good blocks" are connected and thus contained in $\mathscr{C}_{\infty,\alpha}$ implies (This is where we need $\operatorname{diam}F(0)>3L$.) If these components are joined by an occupied path, i.e., a path of edges with $\omega_b>0$, within $\tilde B_{3L}(Lx)$, then $\tilde B_{3L}(Lx)$ contains a "weak" bond and so $G_{L,\alpha}(x)$ fails. In the absence of such a path the requirement (2) in the definition of $G_L(x)$ is not satisfied and so $G_{L,\alpha}(x)$ fails too.
Let d(x, y) be the "Markov distance" on V = C ∞,α , i.e., the minimal number of jumps the random walk Y = (Y t ) needs to make to get from x to y. Note that d(x, y) could be quite smaller than the graph-theoretic distance on C ∞,α and/or the Euclidean distance. (The latter distances are known to be comparable, see Antal and Pisztora [1].) To control the volume growth for the Markov graph of the random walk Y we will need to know that d(x, y) is nevertheless comparable with the Euclidean distance |x − y|: Lemma 3.1 There exists ̺ > 0 and for each γ > 0 there is α > 0 obeying (2.8-2.9) and C < ∞ such that Proof. Suppose α is as in the proof of Proposition 2.3. Let (η x ) be independent Bernoulli that dominate the indicators 1 G L,α from below and let C ∞ be the unique infinite component of the set {x ∈ Z d : η x = 1}. We may "wire" the "holes" of C ∞ by putting an edge between every pair of sites on the external boundary of each finite component of Z d \ C ∞ ; we use d ′ (0, x) to denote the distance between 0 and x on the induced graph. The processes η and (1 G L,α (x) ) can be coupled so that each connected component of . As is easy to check, this implies . It thus suffices to show the above bound for distance d ′ (0, x ′ ).
Let p = p L,α be the parameter of the Bernoulli distribution and recall that p can be made as close to one as desired by adjusting L and α. Let z 0 = 0, z 1 , . . . , z n = x be a nearest-neighbor We claim that for each λ > 0 we can adjust p so that E e λℓ(z 0 ,...,zn) ≤ e n (3.12) for all n ≥ 1 and all paths as above. To verify this we note that the components contributing to ℓ(z 0 , . . . , z n ) are distance at least one from one another. So conditioning on all but the last component, and the sites in the ultimate vicinity, we may use the Peierls argument to estimate the conditional expectation of e λ diam G(zn) by, say, e 1 . (We are using also that diam G(z n ) is smaller than the boundary of G(z n ).) Proceeding by induction n times, (3.12) follows.

Corrector
The purpose of this section is to define, and prove some properties of, the corrector χ(ω, x) := ϕ ω (x) − x. This object could be defined probabilistically by the limit unfortunately, at this moment we seem to have no direct (probabilistic) argument showing that the limit exists. The traditional definition of the corrector involves spectral calculus (Kipnis and Varadhan [20]); we will invoke a projection construction from Mathieu and Piatnitski [25] (see also Giacomin, Olla and Spohn [15]).
Let P be an i.i.d. law on (Ω, F ) where Ω := [0, 1] B d and F is the natural product σ-algebra.
Let τ x : Ω → Ω denote the shift by x, i.e., (τ z ω) xy := ω x+z,y+z (4.2) and note that P • τ −1 x = P for all x ∈ Z d . Recall that C ∞ is the infinite connected component of edges with ω b > 0 and, for α > 0, let C ∞,α denote the set of sites connected to infinity by edges with ω b ≥ α. If P(0 ∈ C ∞,α ) > 0, let and let E α be the corresponding expectation. Given ω ∈ Ω and sites x, y ∈ C ∞,α (ω), let d (α) ω (x, y) denote the graph distance between x and y as measured on C ∞,α . (Note this is distinct from the Markov distance d(x, y) discussed, e.g., in Lemma 3.1.) We will also use L ω to denote the generator of the continuous-time version of the walk X, i.e., The following theorem summarizes all relevant properties of the corrector: There exists a function χ : Ω × Z d → R d such that the following holds P 0 -a.s.: (1) (Gradient field) χ(0, ω) = 0 and, for all x, y ∈ C ∞ (ω), (3) (Square integrability) There is a constant C = C(α) < ∞ such that for all x, y ∈ Z d satisfying |x − y| = 1, Let α > 0 be such that P(0 ∈ C ∞,α ) > 0. Then we also have: (5) (Zero mean under random shifts) Let Z : Ω → Z d be a random variable such that (a) Z(ω) ∈ C ∞,α (ω), Then χ(·, Z(·)) ∈ L 1 (Ω, F , P α ) and E α χ(·, Z(·)) = 0. (4.8) As noted before, to construct the corrector we will invoke a projection argument. Abbreviate We may interpret u ∈ L 2 (Ω × B) as a flow by putting u(ω, −b) = −u(τ −b ω, b). Some, but not all, elements of L 2 (Ω × B) can be obtained as gradients of local functions, where the gradient ∇ is the map L 2 (Ω) → L 2 (Ω × B) defined by Let L 2 ∇ denote the closure of the set of gradients of all local functions -i.e., those depending only on the portion of ω in a finite subset of Z d -and note the following orthogonal decomposition L 2 (Ω × B) = L 2 ∇ ⊕ (L 2 ∇ ) ⊥ . The elements of (L 2 ∇ ) ⊥ can be characterized using the concept of divergence, which for u : (4.11) Using the interpretation of u as a flow, div u is simply the net flow out of the origin. 
The characterization of (L 2 ∇ ) ⊥ is now as follows: Proof. Let u ∈ L 2 (Ω × B) and let φ ∈ L 2 (Ω) be a local function. A direct calculation and the fact that If u ∈ (L 2 ∇ ) ⊥ , then div u integrates to zero against all local functions. Since these are dense in L 2 (Ω), we have div u = 0 a.s.
It is easy to check that every u ∈ L 2 ∇ is curl-free in the sense that for any oriented loop (x 0 , x 1 , . . . , x n ) on C ∞ (ω) with x n = x 0 we have On the other hand, every u : Ω × B → R d which is curl-free can be integrated into a unique function φ : holds for any path (x 0 , . . . , x n ) on C ∞ (ω) with x 0 = 0 and x n = x. This function will automatically satisfy the shift-covariance property We will denote the space of such functions H(Ω × Z d ). To denote the fact that φ is assembled from the shifts of u, we will write u = grad φ, (4. 16) i.e.," grad " is a map from H(Ω×Z d ) to functions Ω×B → R d that takes a function φ ∈ H(Ω×Z d ) and assigns to it the collection of values {φ(·, b) − φ(·, 0) : b ∈ B}.
Then φ is (discrete) harmonic for the random walk on C ∞ , i.e., for P 0 -a.e. ω and all x ∈ C ∞ (ω), Proof. Our definition of divergence is such that "div grad = 2d L ω " holds. Lemma 4.2 implies that u ∈ (L 2 ∇ ) ⊥ if and only if div u = 0, which is equivalent to (L ω φ)(ω, 0) = 0. By translation covariance, this extends to all sites in C ∞ .
Proof of Theorem 4.1 (1-3). Consider the function φ(ω, x) := x and let u := grad φ. Clearly, u ∈ L 2 (Ω×B). Let G ∈ L 2 ∇ be the orthogonal projection of −u onto L 2 ∇ and define χ ∈ H(Ω×Z d ) to be the unique function such that G = grad χ and χ(·, 0) = 0.  For the remaining parts of Theorem 4.1 we will need to work on C ∞,α . However, we do not yet need the full power of Proposition 2.3; it suffices to note that C ∞,α has the law of a supercritical percolation cluster.

Remark 4.5
It is worth pointing out that the proof of properties (1-3) extends nearly verbatim to the setting with arbitrary conductances and arbitrarily long jumps (i.e., the case when B is simply all of Z d ). One only needs that x is in L 2 (Ω × B), i.e., The proof of (4-5) seems to require additional (and somewhat unwieldy) conditions.

Convergence to Brownian motion
Here we will prove Theorem 2.1. We commence by establishing the conclusion of Theorem 2.4 whose proof draws on an idea, suggested to us by Yuval Peres, that sublinearity on average plus heat kernel upper bounds imply pointwise sublinearity. We have reduced the extraneous input from heat-kernel technology to the assumptions (2.17-2.18). These imply heat-kernel upper bounds but generally require less work to prove.
The main technical part of Theorem 2.1 is encapsulated into the following lemma: Lemma 5. Note that then c ′ − ǫ ≥ 3 θ δc ′ for all c ′ ≥ c. If R n ≥ cn -which happens for infinitely many n's -and n ≥ n 0 , then (5.2) implies and, inductively, R 3 k n ≥ 3 kθ cn. However, that contradicts (2.16) by which R 3 k n /3 kθ → 0 as k → ∞ (with n fixed).
The idea underlying Lemma 5.1 is simple: We run a continuous-time random walk $(Y_t)$ for time $t=o(n^2)$ starting from the maximizer of $R_n$ and apply the harmonicity of $x\mapsto x+\psi_\omega(x)$ to derive an estimate on the expectation of $\psi_\omega(Y_t)$. The right-hand side of (5.2) expresses two characteristic situations that may occur at time $t$: Either we have $|\psi_\omega(Y_t)|\le\epsilon n$, which, by "sublinearity on average," happens with overwhelming probability, or $Y$ will not yet have left the box $[-3n,3n]^d$ and so $\psi_\omega(Y_t)\le R_{3n}$. The point is to show that these are the dominating strategies.
Proof of Lemma 5.1. Fix ǫ, δ > 0 and let C 1 = C 1 (ω) and C 2 = C 2 (ω) denote the suprema in (2.17) and (2.18), respectively. Let z be the site where the maximum R n is achieved and denote Let Y = (Y t ) be a continuous-time random walk on C ∞,α with expectation for the walk started at z denoted by E ω,z . Define the stopping time S n := inf t ≥ 0 : |Y t − z| ≥ 2n (5.6) and note that, in light of Proposition 2.3, we have |Y t∧Sn − z| ≤ 3n for all t > 0 provided n ≥ n 1 (ω) where n 1 (ω) < ∞ a.s. The harmonicity of x → x + ψ ω (x) and the Optional Stopping Theorem yield Restricting to t satisfying t ≥ b 4n , (5.8) where b n = o(n 2 ) is the sequence in part (4) of Theorem 2.4, we will now estimate the expectation separately on {S n < t} and {S n ≥ t}.
On the event {S n < t}, the absolute value in the expectation can simply be bounded by R 3n +3n.
To estimate the probability of S n < t we decompose according to whether |Y 2t − z| ≥ 3 2 n or not. For the former, (5.8) and (2.17) imply For the latter we invoke the inclusion and note that 2t − S n ∈ [t, 2t], (5.8) and (2.17) give us similarly From the Strong Markov Property we thus conclude that this serves also as a bound for P ω,z (S n < t, |Y 2t − z| ≥ 3 2 n). Combining both parts and using 8 3 √ 2 ≤ 4 we thus have The S n < t part of the expectation (5.7) is bounded by R 3n + 3n times as much.
On the event {S n ≥ t}, the expectation in (5.7) is bounded by The second term on the right-hand side is then less than C 1 √ t provided t ≥ b n . The first term is estimated depending on whether Y t ∈ O 2n or not: For the probability of Y t ∈ O 2n we get which, in light of the Cauchy-Schwarz estimate 16) and the definition of C 2 , is further estimated by From the above calculations we conclude that Since |O 2n | = o(n d ) as n → ∞, by (2.15) we can choose t := ξn 2 with ξ > 0 so small that (5.8) applies and (5.2) holds for the given ǫ and δ once n is sufficiently large.
We now proceed to prove convergence of the random walk X = (X n ) to Brownian motion. Most of the ideas are drawn directly from Berger and Biskup [6] so we stay rather brief. We will frequently work on the truncated infinite component C ∞,α and the corresponding restriction of the random walk; cf (2.11-2.13). We assume throughout that α is such that (2.8-2.9) hold.

Lemma 5.2
Let χ be the corrector on C ∞ . Then ϕ ω (x) := x + χ(ω, x) is harmonic for the random walk observed only on C ∞,α , i.e., But X n is confined to a finite component of C ∞ \ C ∞,α for n ∈ [0, T 1 ], and so ϕ ω (X n ) is bounded. Since (ϕ ω (X n )) is a martingale and T 1 is an a.s. finite stopping time, the Optional Stopping Theorem tells us E ω,x ϕ ω (X T 1 ) = ϕ ω (x).
We will also need sublinearity of the corrector on average: Remark 5.5 The proof of [6, Theorem 5.4] makes a convenient use of separate ergodicity (i.e., that with respect to shifts only in one of the coordinate directions). This is sufficient for i.i.d. environments as considered in the present situation. However, it is not hard to come up with a modification of the proof that covers general ergodic environments as well (Biskup and Deuschel [8]).
Finally, we will assert the validity of the bounds on the return probability and expected displacement of the walk from Theorem 2.4: Lemma 5.6 Let (Y t ) denote the continuous-time random walk on C ∞,α . Then the diffusive bounds (2.17-2.18) hold for P α -a.e. ω.
We will prove this lemma at the very end of Sect. 6.
Proof of Theorem 2.1. Let α be such that (2.8-2.9) hold and let χ denote the corrector on C ∞ as constructed in Theorem 4.1. The crux of the proof is to show that χ grows sublinearly with x, i.e., χ(ω, x) = o(|x|) a.s.
Having proved the sublinearity of χ on C ∞ , we proceed as in the d = 2 proof of [6]. Let ϕ ω (x) := x + χ(ω, x) and abbreviate M n := ϕ ω (X n ). Fix v ∈ R d and define By Theorem 4.1(3), f K ∈ L 1 (Ω, F , P 0 ) for all K. Since the Markov chain on environments, n → τ Xn (ω), is ergodic (cf [6, Section 3]), we thus have for P 0 -a.e. ω and P ω,0 -a.e. path X = (X k ) of the random walk. Using this for K := 0 and K := ǫ √ n along with the monotonicity of K → f K verifies the conditions of the Lindeberg-Feller Martingale Functional CLT (e.g., Durrett [13, Theorem 7.7.3]). Thereby we conclude that the random continuous function converges weakly to Brownian motion with mean zero and covariance This can be written as v · Dv where D is the matrix with coefficients Invoking the Cramér-Wold device (e.g., Durrett [13, Theorem 2.9.5]) and the fact that continuity of a stochastic process in R d is implied by the continuity of its d one-dimensional projections, we get that the linear interpolation of t → M ⌊nt⌋ / √ n scales to d-dimensional Brownian motion with covariance matrix D. The sublinearity of the corrector then ensures, as in [6, (6.11-6.13)], that and so the same conclusion applies to t → B n (t) in (2.2).
The reflection symmetry of P 0 forces D to be diagonal; the rotation symmetry then ensures To see that the limiting process is not degenerate to zero we note that if we had σ = 0 then χ(·, x) = −x would hold a.s. for all x ∈ Z d . But that is impossible since, as we proved above, x → χ(·, x) is sublinear a.s.
Remark 5.7 Note that, unlike the proofs in [28; 6; 25], the above line of argument does not require a separate proof of tightness. In our approach, this comes rather automatically for the deformed random walk ϕ ω (X n ) -via the (soft) stationarity argument (5.28) and the Martingale Functional CLT. Sublinearity of the corrector then extends it readily to the original random walk.
Remark 5.8 We also wish to use the opportunity to correct an erroneous argument from [6]. There, at the end of the proof of Theorem 6.2, it is claimed that the expectation E 0 E ω,0 (X 1 · χ(X 1 , ω)) is zero. Unfortunately, this is false. In fact, we have where the strict inequality assumes that P is non-degenerate. This shows Thus, once P is non-degenerate, the diffusion constant of the limiting Brownian motion is strictly smaller than the variance of the first step.
A consequence of the above error for the proof of Theorem 6.2 in [6] is that it invalidates one of the three listed arguments to prove that the limiting Brownian motion is non-degenerate. Fortunately, the remaining two arguments are correct.

Heat kernel and expected distance
Here we will derive the bounds (2.17-2.18) and thus establish Lemma 5.6. Most of the derivation will be done for a general countable-state Markov chain; we will specialize to random walk among i.i.d. conductances at the very end of this section. The general ideas underlying these derivations are fairly standard and exist, in some form, in the literature. A novel aspect is the way we control the non-uniformity of volume-growth caused by local irregularities of the underlying graph; cf (6.4) and Lemma 6.3 (1). A well informed reader may nevertheless wish to read only the statements of Propositions 6.1 and 6.2 and then pass directly to the proof of Lemma 5.6.
Let $V$ be a countable set and let $(a_{xy})_{x,y\in V}$ denote the collection of positive numbers with the following properties: For all $x,y\in V$,
$$a_{xy}=a_{yx}\quad\text{and}\quad\pi(x):=\sum_{y\in V}a_{xy}<\infty.$$
We use $P_x$ to denote the law of the chain started from $x$, and $E_x$ to denote the corresponding expectation. Consider a graph $G = (V, E)$ where $E$ is the set of all pairs $(x, y)$ such that $a_{xy} > 0$. Let $d(x, y)$ denote the distance between $x$ and $y$ as measured on $G$. Let $V(\epsilon) \subset V$ denote the set of all $x \in V$ that are connected to infinity by a self-avoiding path (Note that this does not require $a_{xy}$ be bounded away from zero.)

The first observation is that the heat-kernel, defined by $q_t(x, y) := P_x(Y_t = y)/\pi(y)$, can be bounded in terms of the isoperimetry constant $C_{\mathrm{iso}}(x)$. Bounds of this form are well known and have been derived by, e.g., Coulhon, Grigor'yan and Pittet [10] for heat-kernel on manifolds, and by Lovász and Kannan [22], Morris and Peres [23] and Goel, Montenegro and Tetali [16] in the context of countable-state Markov chains. We will use the formulation for infinite graphs developed in Morris and Peres [23].
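The setup above can be checked numerically on a toy example. The following is a minimal sketch, assuming a hypothetical four-vertex weighted graph (the weights are illustrative and not taken from the paper). It builds $\pi$ and the transition matrix $P(x,y) = a_{xy}/\pi(x)$ and confirms that $\pi$ is reversible, so that the $n$-step density $P^n(x,y)/\pi(y)$ is symmetric in $x$ and $y$.

```python
import numpy as np

# Hypothetical symmetric weights a[x, y] = a[y, x] on a 4-cycle (illustration only).
a = np.array([
    [0.0, 1.0, 0.0, 0.2],
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.7],
    [0.2, 0.0, 0.7, 0.0],
])

pi = a.sum(axis=1)       # pi(x) = sum_y a_xy
P = a / pi[:, None]      # P(x, y) = a_xy / pi(x)


def kernel(n):
    """n-step transition density P^n(x, y) / pi(y) with respect to pi."""
    return np.linalg.matrix_power(P, n) / pi[None, :]


# Detailed balance: pi(x) P(x, y) = a_xy = pi(y) P(y, x).
flow = pi[:, None] * P
detailed_balance = np.allclose(flow, flow.T)

# Reversibility makes the density symmetric for every n.
symmetric = all(np.allclose(kernel(n), kernel(n).T) for n in range(1, 6))

# pi is an invariant measure: pi P = pi.
invariant = np.allclose(pi @ P, pi)
```

Dividing by $\pi(y)$ is what makes the kernel symmetric; the raw transition probabilities $P^n(x,y)$ are not symmetric unless $\pi$ is constant.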
Proposition 6.1 There exists a constant $c_1 \in (1, \infty)$ depending only on $d$ and $a_\star$ such that for

Proof. We will first derive the corresponding bound for the discrete-time version of $(Y_t)$. Let $P(x, y) := a_{xy}/\pi(x)$ and define $\hat P := \frac12(1 + P)$. Let
$$\hat q_n(x, y) := \frac{\hat P^n(x, y)}{\pi(y)}. \qquad (6.9)$$
We claim that, for some absolute constant $c_1$ and any $z \in B_n(x)$, To this end, let $\hat Q$ be the object $Q$ for the Markov chain $\hat P$ and let We claim that Theorem 2 of Morris and Peres [23] then implies that, for any $\epsilon$ that satisfies $0 < \epsilon < [\pi(x) \wedge \pi(y)]^{-1}$ and
$$n \ge 1 + \int_{4(\pi(z)\wedge\pi(y))}^{4/\epsilon} \frac{4\,dr}{r\,\phi(r)^2}, \qquad (6.12)$$
we have $\hat q_n(z, y) \le \epsilon$. To see this we have to check that the restriction $\Lambda \subset B_{2n}(x)$ in the definition of $\phi(r)$, which is absent from the corresponding object in [23], causes no harm. First note that the Markov chain started at $z \in B_n(x)$ will not leave $B_{2n}(x)$ by time $n$. Thus, we can modify the chain outside $B_{2n}(x)$ arbitrarily. It is easy to come up with a modification that will effectively reduce the infimum in (6.11) to sets inside $B_{2n}(x)$.
It is well known (and easy to check) that the infimum in (6.11) can be restricted to connected sets $\Lambda$. Then (6.5-6.6) give us where the extra half arises due to the consideration of the time-delayed chain $\hat P$. The two regimes cross over at $r_n := (C_{\mathrm{iso}}(x)/a_\star)^d\, n^{d\nu}$; the integral in (6.12) is thus bounded by (6.14). The first term splits into a harmless factor of order $n^{2\nu} \log n = o(n)$ and a term proportional to $n^{2\nu} \log C_{\mathrm{iso}}(x)$. This is $O(n)$ by $n \ge t(x)$ where the (implicit) constant can be made as small as desired by choosing $c_1$ sufficiently large. Setting
$$\epsilon := c\,[C_{\mathrm{iso}}(x)^2\, n]^{-d/2} \qquad (6.15)$$
we can thus adjust the constant $c$ in such a way that (6.14) is less than $n - 1$ for all $n \ge t(x)$. Thereby (6.10) follows.
To extend the bound (6.10) to continuous time, we note that $L = 2(\hat P - 1)$. Thus if $N_t$ is Poisson with parameter $2t$, then
$$q_t(z, y) = E\,\hat q_{N_t}(z, y). \qquad (6.16)$$
But $P(N_t \le \frac32 t \text{ or } N_t \ge 3t)$ is exponentially small in $t$ and, in particular, much smaller than (6.8) for $t \ge c_1 \log C_{\mathrm{iso}}(x)$ with $c_1$ sufficiently large. As $q_t \le (a_\star)^{-1}$, the $N_t \notin (\frac32 t, 3t)$ portion of the expectation in (6.16) is thus negligible. Once $N_t$ is constrained to the interval $(t, 3t)$ the uniform bound (6.10) implies (6.8).
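The passage from discrete to continuous time can be illustrated numerically. The sketch below assumes that the generator is $L = P - 1$ (so that $L = 2(\hat P - 1)$ for the lazy chain $\hat P = \frac12(1+P)$) and uses a hypothetical four-vertex weighted graph. It compares $e^{tL}$, computed spectrally, with the Poisson($2t$) mixture of powers of $\hat P$, and also evaluates the probability that $N_t$ falls outside $(\frac32 t, 3t)$.

```python
import numpy as np

# Hypothetical symmetric weights on a 4-cycle (illustration only).
a = np.array([
    [0.0, 1.0, 0.0, 0.2],
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.7],
    [0.2, 0.0, 0.7, 0.0],
])
pi = a.sum(axis=1)
P = a / pi[:, None]
I = np.eye(len(pi))
P_hat = 0.5 * (I + P)    # time-delayed (lazy) chain


def heat_semigroup(t):
    """exp(t (P - I)), via the symmetric matrix S = D^{-1/2} a D^{-1/2}."""
    s = a / np.sqrt(np.outer(pi, pi))
    lam, u = np.linalg.eigh(s)
    sym = (u * np.exp(t * (lam - 1.0))) @ u.T      # exp(t (S - I))
    # Conjugate back: P = D^{-1/2} S D^{1/2}.
    return sym * np.sqrt(pi)[None, :] / np.sqrt(pi)[:, None]


def poisson_mixture(t, n_max=200):
    """sum_n P(N_t = n) P_hat^n with N_t ~ Poisson(2 t), truncated at n_max."""
    weight = np.exp(-2.0 * t)                      # P(N_t = 0)
    power = I.copy()
    total = weight * power
    for n in range(1, n_max + 1):
        weight *= 2.0 * t / n                      # P(N_t = n)
        power = power @ P_hat
        total += weight * power
    return total


def poisson_outside(t):
    """P(N_t <= 1.5 t or N_t >= 3 t) for N_t ~ Poisson(2 t)."""
    weight = np.exp(-2.0 * t)
    inside = 0.0
    for n in range(1, int(np.ceil(3.0 * t))):
        weight *= 2.0 * t / n
        if 1.5 * t < n < 3.0 * t:
            inside += weight
    return 1.0 - inside


agree = np.allclose(heat_semigroup(3.0), poisson_mixture(3.0), atol=1e-10)
tails = [poisson_outside(t) for t in (2.0, 10.0, 40.0)]
```

The agreement is the series identity $e^{2t(\hat P - 1)} = \sum_n e^{-2t}\frac{(2t)^n}{n!}\hat P^n$; the values in `tails` shrink as $t$ grows, which is the sense in which the atypical values of $N_t$ contribute negligibly.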
Our next item of business is a diffusive bound on the expected (graph-theoretical) distance traveled by the walk $Y_t$ by time $t$. As was noted by Bass [4] and Nash [26], this can be derived from the above uniform bound on the heat-kernel assuming regularity of the volume growth. Our proof is an adaptation of an argument of Barlow [2].

Proposition 6.2 There exist constants $c_2 = c_2(d)$ and $c_3 = c_3(d)$ such that the following holds: Let $x \in V$ and suppose $A > 0$ and $t(x) > 1$ are numbers for which Then $t \ge T(x)$, (6.18) with $A'(x, t) := c_2 + c_3[\log A + C_{\mathrm{vol}}(x, t^{-1/2})]$.
Much of the proof boils down to the derivation of rather inconspicuous but deep relations (discovered by Nash [26]) between the following quantities:
$$M(x, t) := E_x\, d(x, Y_t) = \sum_y \pi(y)\, q_t(x, y)\, d(x, y) \qquad (6.19)$$
and
$$Q(x, t) := -E_x \log q_t(x, Y_t) = -\sum_y \pi(y)\, q_t(x, y) \log q_t(x, y). \qquad (6.20)$$
Note that $q_t(x, \cdot) \le (a_\star)^{-1}$ implies $Q(x, t) \ge \log a_\star$.

where we used integration by parts and the positivity of $L$ to derive the last inequality. Now put this together with $M(x, t_0) \le \sqrt{dt}$ and apply Lemma 6.3(1), noting that $C_{\mathrm{vol}}(z, M(x, t)^{-1}) \le C_{\mathrm{vol}}(z, t^{-1/2})$ by the assumption $M(x, t) \ge \sqrt{t}$. Dividing out an overall factor $\sqrt{t}$, we thus get This implies
$$L(t) \le \tilde c_2 + \tilde c_3\,[\log A + C_{\mathrm{vol}}(x, t^{-1/2})] \qquad (6.31)$$
for some constants $\tilde c_2$ and $\tilde c_3$ depending only on $d$. Plugging this in (6.29), we get the desired claim.
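For intuition, the two quantities $M(x,t)$ and $Q(x,t)$ can be evaluated directly on a small example. This is a minimal sketch, assuming a hypothetical four-vertex weighted graph and the generator $L = P - 1$ for the continuous-time chain (neither is taken from the paper). Since $q_t(x,y) \le 1/\pi(y)$ always holds, the sketch uses $\min_y \pi(y)$ as a stand-in for $a_\star$ in the lower bound on $Q$.

```python
from collections import deque

import numpy as np

# Hypothetical symmetric weights on a 4-cycle (illustration only).
a = np.array([
    [0.0, 1.0, 0.0, 0.2],
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.7],
    [0.2, 0.0, 0.7, 0.0],
])
n_sites = len(a)
pi = a.sum(axis=1)


def heat_kernel(t):
    """q_t(x, y) = P_x(Y_t = y) / pi(y), assuming the generator L = P - 1."""
    norm = np.sqrt(np.outer(pi, pi))
    lam, u = np.linalg.eigh(a / norm)
    sym = (u * np.exp(t * (lam - 1.0))) @ u.T
    return sym / norm


def graph_distance(src):
    """Breadth-first-search distance on the graph with edges {a_xy > 0}."""
    dist = np.full(n_sites, -1)
    dist[src] = 0
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in range(n_sites):
            if a[x, y] > 0 and dist[y] < 0:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def M(x, t):
    """Expected graph distance: sum_y pi(y) q_t(x, y) d(x, y)."""
    q = heat_kernel(t)[x]
    return float(np.sum(pi * q * graph_distance(x)))


def Q(x, t):
    """Entropy: -sum_y pi(y) q_t(x, y) log q_t(x, y)."""
    q = heat_kernel(t)[x]
    return float(-np.sum(pi * q * np.log(q)))


values = {t: (M(0, t), Q(0, t)) for t in (0.5, 2.0, 10.0)}
```

On a finite graph both quantities saturate as $t \to \infty$; the diffusive growth $M(x,t) \lesssim \sqrt{t}$ studied in the text is meaningful only on infinite graphs with suitable volume growth.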
We are now finally ready to complete the proof of our main theorem:

Proof of Lemma 5.6. We will apply the above estimates to obtain the proof of the bounds (2.17-2.18). We use the following specific choices
$$V := C_{\infty,\alpha}, \quad a_{xy} := \hat\omega_{xy}, \quad \pi(x) := 2d, \quad \text{and} \quad b_n := n. \qquad (6.32)$$
As $a_\star \ge \alpha$, all required assumptions are satisfied.
Once we have the uniform bound (6.34), as well as the uniform bound (6.17), Proposition 6.2 yields the a.s. inequality
$$\sup_{n\ge1}\ \max_{\substack{z\in C_{\infty,\alpha} \\ |z|\le n}}\ \sup_{t\ge n}\ \frac{E_{\omega,z}\, d(z, Y_t)}{\sqrt{t}} < \infty. \qquad (6.37)$$
To convert $d(z, Y_t)$ into $|z - Y_t|$ in the expectation, we invoke (6.35) one more time.