Recent progress on the Random Conductance Model

Recent progress on the understanding of the Random Conductance Model is reviewed and commented on. Particular emphasis is placed on the results concerning the scaling limit of the random walk among random conductances for almost every realization of the environment, observations on the behavior of the effective resistance, and the scaling limit of certain models of gradient fields with non-convex interactions. The text is an expanded version of the lecture notes for a course delivered at the 2011 Cornell Summer School on Probability.


Quenched Invariance
Prologue
Random walks in random environments have been at the center of probabilists' interest for several decades. A specific class of such random walks goes under the banner of the Random Conductance Model. What makes this class special is the fact that the corresponding Markov chains are reversible. This somewhat restrictive feature has the benefit of fruitful connections to other, seemingly unrelated fields: random resistor networks and gradient fields. At the technical level, many of the problems are thus naturally embedded into the larger area of harmonic analysis and homogenization theory.
This survey article is an expanded version of the set of lecture notes written for a course on the Random Conductance Model that the author delivered at the 2011 Cornell Summer School on Probability. A personal point of view promoted here is that the Random Conductance Model belongs to the collection of "paradigm" problems such as percolation, the Ising model, the exclusion process, etc., that are characterized by a simple definition and yet feature interesting and nontrivial phenomena (and, of course, pose interesting questions in mathematics). The text below attempts to summarize the important developments in the understanding of the Random Conductance Model. While paying most attention to recent results, much of what is discussed draws on by-now classical work.
The text retains the layout of lecture notes that have been spiced up with comments and references to related subjects. The general structure is as follows: The first section introduces the three rather different areas where the Random Conductance Model naturally appears. Sections 2-5 then deal predominantly with the first such area, namely, the various aspects of the limit behavior of random walks in reversible random environments. Section 6 then applies the introduced machinery to the remaining problems. A number of Problems are mentioned throughout the text; these refer to questions that are either solved directly in the text or remain a subject of research interest to the present day. Easier questions are phrased as Exercises; these are of varied difficulty but should all be generally accessible to graduate students.

Acknowledgments
This text would not exist without the generous invitation from Rick Durrett to speak at the 2011 Cornell Summer School on Probability. The author is equally grateful to Geoffrey Grimmett, who suggested rather persuasively that the preliminary and incomplete notes be made into a proper survey article rather than stay preliminary and incomplete forever. Much credit goes also to the coauthors N. Berger, O. Boukhardra, C. Hoffman, G. Kozma, O. Louidor, T. Prescott, A. Rozinov, H. Spohn and A. Vandenberg-Rodes of various joint projects whose results are reviewed in these notes, and to numerous other colleagues for discussions that helped improve the author's understanding of the subject. T. Kumagai was very kind to provide valuable comments on the section dealing with heat-kernel estimates, J. Dyre suggested interesting pointers to the physics literature concerning the random resistance problem and M. Salvi offered a lot of feedback and suggestions on the material in Section 3. Many thanks go also to an anonymous referee for a quick and efficient report. The research reported on in these notes has partially been supported by the NSF grants DMS-0949250 and DMS-1106850, the NSA grant NSA-AMS 091113 and the GAČR project P201-11-1558.

Random conductance model
We begin with the definition of the problem in the context of random walks in random environments. Consider a countable set V and suppose that we are given a collection of numbers (ω_xy)_{x,y∈V} with the following properties: ω_xy ≥ 0 with π_ω(x) := Σ_{y∈V} ω_xy ∈ (0, ∞), x ∈ V, (1.1) and the symmetry condition ω_xy = ω_yx, x, y ∈ V. (1.2) We will predominantly take V to be the hypercubic lattice Z^d naturally embedded in R^d. The quantity ω_xy is called the conductance of the pair (x, y); the use of the term will be clarified in the subsection dealing with resistor networks. When V has an unoriented-graph structure with edge set E, we often enforce ω_xy = 0 whenever (x, y) ∉ E; in that case we speak of the nearest-neighbor model. Such a model is then called uniformly elliptic if there is α ∈ (0, 1) for which α < ω_xy < 1/α, (x, y) ∈ E. (1.3) When V := Z^d, we use the phrase "nearest-neighbor model" for the situation when E is the set of pairs of vertices that are at Euclidean distance one from each other. The aforementioned "random walk" in environment ω is technically a discrete-time Markov chain with state space V and transition kernel P_ω(x, y) := ω_xy / π_ω(x), x, y ∈ V. (1.4)
In plain words, the "walk" at site x chooses its next position y proportionally to the value of the conductance ω_xy. The non-degeneracy condition (1.1) guarantees that this chain is well defined everywhere; when positivity of π_ω fails at some vertices (as, e.g., for the simple random walk on the supercritical percolation cluster, cf. Fig. 1.1) one simply restricts the chain to the subset of V where π_ω(x) > 0.
A key consequence of the symmetry condition (1.2) is: π ω is a stationary and reversible measure for the Markov chain.
Proof. Invoking the above definitions we get π ω (x)P ω (x, y) = ω xy = ω yx = π ω (y)P ω (y, x), (1.5) which is the condition of reversibility (a.k.a. the detailed balance condition). The fact that π ω is stationary follows by summing the extreme ends of this equality on x.
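The detailed balance computation above is easy to verify numerically. The following is a minimal sketch, not taken from the text: a ring of n sites stands in for Z^d, the conductance values are an arbitrary choice, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Vertices 0..n-1 on a ring (a finite stand-in for Z^d); symmetric
# random conductances on nearest-neighbor edges.
n = 8
w = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    c = rng.uniform(0.5, 2.0)   # conductance of the edge (i, j)
    w[i, j] = w[j, i] = c       # symmetry condition (1.2)

pi = w.sum(axis=1)              # pi_omega(x) = sum_y omega_xy
P = w / pi[:, None]             # transition kernel (1.4)

# Detailed balance (1.5): pi(x) P(x,y) = omega_xy = pi(y) P(y,x).
assert np.allclose(pi[:, None] * P, w)
# Stationarity: summing detailed balance over x gives pi P = pi.
assert np.allclose(pi @ P, pi)
```

The second assertion is exactly the last step of the proof: summing the extreme ends of (1.5) over x yields stationarity of π_ω.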
Note that for the nearest-neighbor model on Z d with conductances ω xy = 1 if |x − y| = 1 and ω xy = 0 otherwise, the above Markov chain reduces to the ordinary simple (symmetric) random walk. In this case the increments of the walk are i.i.d. which permits derivation of many deep conclusions -e.g., Donsker's Invariance Principle, Law of Iterated Logarithm, etc. However, when ω is nonconstant, the increments of the chain are no longer independent; worse yet, they are not even stationary. As we will see, this can be overcome but only at the cost of taking ω to be a sample from a shift-invariant distribution. This reasoning underpins the large area of random walks in random environment of which the above chain is only a rather specific example.
Let Ω be the space of all configurations (ω_xy) of the conductances. This space is naturally endowed with a product σ-algebra F. A shift by x is the map τ_x : Ω → Ω acting so that (τ_x ω)_yz := ω_{y+x,z+x}, x, y, z ∈ Z^d. (1.6) We will henceforth assume that P is a probability measure on (Ω, F) which is translation invariant in the sense that P ∘ τ_x^{-1} = P, x ∈ Z^d. (1.7) We recall that this measure is said to be ergodic if P(A) ∈ {0, 1} for any event A with the property τ_x^{-1}(A) = A for all x ∈ Z^d. A canonical example of an ergodic P is the nearest-neighbor model where the values of the conductances are chosen independently at random from the same distribution. We will use E to denote expectation with respect to P.
Let us now turn to the main questions one may wish to ask concerning the above setup. For this, let X = (X_n) denote a sample path of the above Markov chain and let P^x_ω denote the law of X subject to the initial condition P^x_ω(X_0 = x) := 1. (1.8) Let P^n_ω denote the n-th power of the transition kernel P_ω, i.e., P^n_ω(x, y) = P^x_ω(X_n = y). (1.9)
Fig. 1.2. An example of a random walk in a random environment where, at each vertex, one of the North or East arrows is chosen independently at random (with equal probabilities). The random walk is then forced to follow the arrows. The extremity of this example is seen from the fact that while the quenched law of the path is deterministic (and no invariance principle can hold for the fluctuations), the averaged law looks like that of an ordinary North & East random walk whose fluctuations are described by the Central Limit Theorem.
Problem 1.9. Is there a scaling limit of the random walk among random conductances restricted to the half-space, quarter space or a wedge (i.e., for the problem with conductances "leading" outside these regions set to zero)?

Digression on continuous time
Although the discrete-time Markov chain is very natural, one is often interested in a continuous-time version thereof. We will therefore introduce these objects right away and discuss some of the technical issues that come up in this context. There are two natural ways to make the time flow continuously. First, we may simply Poissonize the discrete time and consider the transition kernel Q^t_ω(x, y) := Σ_{n≥0} (t^n/n!) e^{-t} P^n_ω(x, y). (1.12) The corresponding (continuous-time) Markov process is then referred to as the constant-speed random walk among random conductances (CSRW), where the adjective highlights the fact that the jumps happen at the same rate regardless of the current position. Another natural way to make time flow continuously is to attach a clock to each pair (x, y) that rings after exponential waiting times with expectation 1/ω_xy. This can just as well be done by prescribing the generator (L_ω f)(x) := Σ_y ω_xy [f(y) − f(x)] (1.13) and demanding that the corresponding transition kernel R^t_ω be the (unique) stochastic solution of the backward Kolmogorov equation ∂_t R^t_ω(x, y) = (L_ω R^t_ω(·, y))(x) (1.14) with initial condition R^0_ω(x, y) = δ_x(y). (1.15) Here δ_x(z) equals one when x = z and zero otherwise. This leads to the variable-speed random walk among random conductances (VSRW), because the resulting Markov chain at x makes a new jump at rate π_ω(x). A specific problem with the VSRW is that the walk may escape to infinity in finite time; a blow-up occurs. (This will not happen for the discrete-time walk and thus also the CSRW.) A simple criterion to check is:
Exercise 1.10. Consider a configuration ω of conductances such that π_ω(x) ∈ (0, ∞) for each x. Let (X_k) be the path of the discrete-time random walk among conductances ω and let T_0, T_1, . . . be the times between the successive jumps of the corresponding VSRW. Show that, for a.e. realization of the discrete-time walk, Σ_{k≥0} T_k < ∞ if and only if Σ_{k≥0} π_ω(X_k)^{-1} < ∞. (1.16)
The upshot of this Exercise is that the question of blow-ups in VSRW can be resolved purely in the context of the discrete-time walk.
We refer to, e.g., Liggett [95,Chapter 2] for a thorough discussion of such situations. See also Exercise 2.8 in Sect. 2.2.
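The Poissonization formula (1.12) identifies the CSRW kernel with the matrix exponential of t(P_ω − 1), while the VSRW kernel is the exponential of tL_ω; on a finite state space both can be computed directly. A small sketch follows, with a finite ring used in place of Z^d; the expm helper is a homemade Taylor-plus-squaring routine introduced here for illustration, not a quotation from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Nearest-neighbor conductances on a ring of n sites.
n = 6
w = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    w[i, j] = w[j, i] = rng.uniform(0.5, 2.0)
pi = w.sum(axis=1)
P = w / pi[:, None]

def csrw_kernel(t, nmax=60):
    """Q^t via the Poissonization series (1.12), truncated at nmax terms."""
    Q, Pn, term = np.zeros_like(P), np.eye(n), np.exp(-t)
    for k in range(nmax):
        Q += term * Pn
        Pn = Pn @ P
        term *= t / (k + 1)
    return Q

# Generator of the VSRW: (L_omega f)(x) = sum_y omega_xy (f(y) - f(x)).
L = w - np.diag(pi)

def expm(A, order=12, squarings=20):
    """exp(A) by a Taylor step plus repeated squaring (no SciPy needed)."""
    B, E, T = A / 2.0**squarings, np.eye(len(A)), np.eye(len(A))
    for k in range(1, order + 1):
        T = T @ B / k
        E = E + T
    for _ in range(squarings):
        E = E @ E
    return E

t = 1.5
Q_t = csrw_kernel(t)      # CSRW: jumps at rate 1
R_t = expm(t * L)         # VSRW: jumps at rate pi_omega(x)
# Both are stochastic kernels, but they differ unless pi is constant:
assert np.allclose(Q_t.sum(axis=1), 1.0) and np.allclose(R_t.sum(axis=1), 1.0)
# Sanity check of (1.12): Q^t equals exp(t(P - 1)).
assert np.allclose(Q_t, expm(t * (P - np.eye(n))))
```

The last assertion is the operator identity behind Poissonization: the Poisson-weighted sum of powers of P is exactly the semigroup generated by P − 1.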
The above transition kernels are distinguished by their invariant measures and the natural function spaces they act on. Indeed, we can write R^t_ω as R^t_ω(x, y) := ⟨δ_y, e^{t L_ω} δ_x⟩_{ℓ²(Z^d)}, (1.17) where we think of ℓ²(Z^d) as endowed with the counting measure. On the other hand, the constant-speed Markov chain admits the representation Q^t_ω(x, y) = π_ω(x)^{-1} ⟨δ_y, e^{t(P_ω − 1)} δ_x⟩_{ℓ²(π_ω)}, (1.18) where ℓ²(π_ω) is the space of functions f : Z^d → R that are square integrable with respect to the measure π_ω on Z^d. In this case the generator of the Markov chain is simply P_ω − 1. The reason why one uses a different underlying measure on Z^d in the two cases is seen via:
Exercise 1.11. Show that L_ω is symmetric on ℓ²(Z^d) while P_ω − 1 is symmetric on ℓ²(π_ω). In particular, the VSRW is reversible with respect to the counting measure on Z^d while the CSRW is reversible with respect to π_ω.
It is clear that the constant-speed chain will follow the discrete-time chain very closely, but the variable-speed chain may deviate considerably because its time parametrization depends on the entire path. This discrepancy will be particularly obvious in the places where, in comparison with the neighbors, π ω (x) is either very small (VSRW gets stuck but CSRW departs easily) or very large (VSRW departs easily but CSRW gets stuck). This may or may not be a disadvantage depending on the context.

Harmonic analysis and resistor networks
The above (discrete-time) Markov chain is in a class of models for which we can apply a well-known connection between reversible Markov processes and harmonic analysis/electrostatic theory. This connection goes back to the work of Kirchhoff in the mid-1800s (Kirchhoff [87]) and it underlies many modern treatments of Markov processes. For our purposes the best general introductory text seems to be the monograph by Doyle and Snell [48].
We begin by introducing some relevant notions for the full lattice; the finite-volume counterparts will be dealt with later. For a configuration of the conductances (ω_xy) and a function f : Z^d → R, let us define the Dirichlet form E(f) := (1/2) Σ_{x,y} ω_xy [f(y) − f(x)]². (1.19) In physics vernacular, this is the electrostatic or Dirichlet energy corresponding to the electrostatic potential f. We then define the effective (point-to-point) resistance R(x, y) between x and y by the formula R(x, y)^{-1} := inf{ E(f) : f(x) = 1, f(y) = 0 }. (1.20) More generally, we define an effective point-to-set resistance R(x, A) by requiring f(y) = 0 for all y ∈ A in the formula above. Of course, both E(f) and R(x, y) depend on ω, but we leave that notationally implicit.
A key problem now is the computation, and the analysis of various scaling properties, of the effective resistance. As a warm-up, consider the homogeneous problem where the conductances are equal to one for nearest neighbors and zero otherwise. Leaving aside some technical issues, any minimizer of the Dirichlet energy in (1.20) will then obey Δf(z) = 0, z ∉ {x, y}, (1.21) where Δ is the discrete Laplacian, (Δf)(z) := Σ_{z′: |z′−z|=1} [f(z′) − f(z)]. (1.22)
Fig. 1.3. An example of an electrostatic problem connected to the Random Conductance Model.
Here part of the percolation cluster in a slab with vertical coordinates in the interval [−N, N ] is attached to metal plates with a given voltage difference. The edges present in the cluster have resistivity one, the edges that are absent are total insulators. A key question is to find the total current density -per unit area of the plates -running through the system. Another question is the value of the electrostatic potential at the origin.
In other words, f is discrete harmonic everywhere away from x and y. It is an interesting exercise in upper-division analysis to solve:
Exercise 1.12. For the homogeneous nearest-neighbor problem, use the Fourier transform to solve the equation Δf(z) = I [δ_y(z) − δ_x(z)], z ∈ Z^d, (1.23) and then adjust I so that f(x) − f(y) = 1. Use this to derive an integral formula for R(x, y).
We can thus check that while the following problem may appear hard, it is at least not ill posed: Exercise 1.13. For the homogeneous nearest-neighbor problem on Z², show without relying on the Fourier transform that R(x, y) = 1/2 whenever x and y are nearest neighbors.
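On a finite piece of Z² the effective resistance can be computed directly by solving the discrete Poisson equation with a unit current injected at x and extracted at y; the value R(x, y) = f(x) − f(y) should then be close to the infinite-lattice answer 1/2 of Exercise 1.13. A hedged numerical sketch follows; the grid size and the grounding of one vertex are choices made here, not part of the text.

```python
import numpy as np

# Homogeneous nearest-neighbor network on an n x n piece of Z^2;
# all conductances equal to one.
n = 31
idx = lambda i, j: i * n + j
N = n * n
L = np.zeros((N, N))                 # graph Laplacian
for i in range(n):
    for j in range(n):
        for di, dj in ((1, 0), (0, 1)):
            ii, jj = i + di, j + dj
            if ii < n and jj < n:
                a, b = idx(i, j), idx(ii, jj)
                L[a, a] += 1; L[b, b] += 1
                L[a, b] -= 1; L[b, a] -= 1

# Unit current in at x and out at y (two adjacent central vertices);
# a minimizer of the Dirichlet energy solves L f = delta_x - delta_y.
c = n // 2
x, y = idx(c, c), idx(c, c + 1)
rhs = np.zeros(N); rhs[x], rhs[y] = 1.0, -1.0
f = np.zeros(N)
f[1:] = np.linalg.solve(L[1:, 1:], rhs[1:])   # ground vertex 0

R = f[x] - f[y]                      # effective resistance R(x, y)
print(R)                             # close to the infinite-lattice value 1/2
```

The finite box only approximates Z², so the computed value exceeds 1/2 by a small boundary correction that vanishes as the box grows.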
Returning to the full-fledged Random Conductance Model, let us now discuss the (somewhat degenerate) example of the supercritical percolation cluster depicted in Fig. 1.3. Assuming the potential is fixed to ϕ ≡ −1 at a conducting plate at "height" −N and to ϕ ≡ +1 at the corresponding plate at "height" +N, the question is what is the electrostatic potential right at the center. As before, this potential is a minimizer of the Dirichlet energy E(f) in (1.19) subject to the conditions that f(x) := 1 when ê_2 · x ≥ N and f(x) := −1 when ê_2 · x ≤ −N. Here ê_i is the coordinate unit vector in the i-th lattice direction.
What makes this problem relevant for probabilists is the existence of a direct probabilistic "solution:" Let τ^{(N)}_± be the first hitting time of the upper, resp., lower metal plate, τ^{(N)}_± := inf{ n ≥ 0 : ±ê_2 · X_n ≥ N }. (1.24) Then the electric potential at vertex x turns out to be given by the formula ϕ(x) = P^x_ω(τ^{(N)}_+ < τ^{(N)}_−) − P^x_ω(τ^{(N)}_− < τ^{(N)}_+), (1.25) where P^x_ω is our notation for the law on paths (X_n) of the random walk on environment ω such that P^x_ω(X_0 = x) = 1. The key point is that the function ϕ defined by (1.25) is harmonic with respect to the generator of the continuous-time Markov chain (1.13) with the boundary values given as above. Here a function is said to be harmonic at x when L_ω f(x) = 0.
Exercise 1.14. Prove the formula (1.25) by showing that such a harmonic function is uniquely determined by its boundary data.
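The probabilistic solution can be checked by hand in one dimension, where hitting probabilities reduce to ratios of resistances. The following sketch is an illustrative setup, not from the text: it solves the harmonic equation on a strip {−N, …, N} with random conductances and compares the result with the probabilistic formula for the potential.

```python
import numpy as np

rng = np.random.default_rng(3)

# One-dimensional strip {-N, ..., N} with random nearest-neighbor
# conductances; potential fixed to -1 at -N and to +1 at N.
N = 10
w = rng.uniform(0.5, 2.0, size=2 * N)   # w[k]: conductance of edge (k-N, k-N+1)

# Solve L_omega phi = 0 at interior sites with the boundary values above.
M = 2 * N - 1                            # number of interior sites
A = np.zeros((M, M)); b = np.zeros(M)
for k in range(M):                       # interior site x = k - N + 1
    wl, wr = w[k], w[k + 1]              # edges to left/right neighbors
    A[k, k] = wl + wr
    if k > 0: A[k, k - 1] = -wl
    else:     b[k] += wl * (-1.0)        # left boundary value -1
    if k < M - 1: A[k, k + 1] = -wr
    else:         b[k] += wr * (+1.0)    # right boundary value +1
phi = np.linalg.solve(A, b)

# Probabilistic solution: phi(x) = P_x(hit +N first) - P_x(hit -N first).
# In one dimension the hitting probability is a ratio of resistances.
r = 1.0 / w                              # resistances of the edges
p_right = np.cumsum(r)[:-1] / r.sum()    # P^x(tau_+ < tau_-) at interior x
assert np.allclose(phi, 2 * p_right - 1)
```

Both expressions solve the same discrete boundary-value problem, and the uniqueness argument of Exercise 1.14 is what forces them to agree.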
Notice that, as soon as the conductances are non-constant, there is no reason why the potential ϕ at the symmetry point should be equal to zero -as it would be, thanks to symmetry considerations, for the case of homogeneous networks. Obviously, this is quite related to Problem 1.9.
The concept of effective resistance is closely related to the question of recurrence and transience of the corresponding Markov chain. Let τ_x := inf{n ≥ 1 : X_n = x} (1.26) and, for A ⊂ Z^d, τ_A := inf{n ≥ 1 : X_n ∈ A}. (1.27) The chain will then be recurrent if P^0_ω(τ_0 < τ_{Λ^c_N}) → 1 as N → ∞ and transient otherwise. The connection with effective resistance shows that the tendency toward recurrence decreases with increasing conductances. Explicitly, we have:
Exercise 1.15. Show that y ↦ P^y_ω(τ_x < τ_{Λ^c_N}) is the unique minimizer of the Dirichlet energy for the boundary conditions corresponding to the point-to-set resistance R(x, Λ^c_N) and use this to derive P^x_ω(τ_{Λ^c_N} < τ_x) = 1 / [π_ω(x) R(x, Λ^c_N)]. (1.28)
The upshot of this observation is that if ω_xy ≤ ω′_xy for all pairs x, y, then R′(x, Λ^c_N) ≤ R(x, Λ^c_N). In particular, if the random walk is recurrent in the environment ω′ then so it is in ω, and vice versa for the question of transience. For (say) nearest-neighbor Random Conductance Models subject to the ellipticity condition (1.3), recurrence is thus equivalent to the recurrence of the simple random walk. However, as soon as ellipticity is violated, interesting problems arise. Consider for illustration the random walk on the supercritical percolation cluster. There the conductances are bounded above but not below. This still permits us to conclude that the random walk is recurrent in spatial dimension d = 2, and if it is transient in dimension d = 3, then it is transient in all dimensions d ≥ 3. A key question to resolve is thus:
Problem 1.16. Is the random walk on almost every realization of the three-dimensional supercritical percolation cluster transient?
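The relation between escape probabilities and the point-to-set resistance can be tested on a finite network: compute the harmonic minimizer for R(x, Λ^c), read off its Dirichlet energy, and compare with the one-step escape probability. A sketch under the definitions above; the box size and the conductance law are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Random conductances on an n x n piece of Z^2; x is the center and
# Lambda^c is played by the boundary of the box.
n = 9
idx = lambda i, j: i * n + j
N = n * n
w = np.zeros((N, N))
for i in range(n):
    for j in range(n):
        if i + 1 < n:
            w[idx(i,j), idx(i+1,j)] = w[idx(i+1,j), idx(i,j)] = rng.uniform(0.5, 2.0)
        if j + 1 < n:
            w[idx(i,j), idx(i,j+1)] = w[idx(i,j+1), idx(i,j)] = rng.uniform(0.5, 2.0)
pi = w.sum(axis=1)
x = idx(n // 2, n // 2)
boundary = [idx(i, j) for i in range(n) for j in range(n)
            if i in (0, n - 1) or j in (0, n - 1)]

# Harmonic f with f(x) = 1 and f = 0 on the boundary: the minimizer of
# the Dirichlet energy for the point-to-set resistance R(x, Lambda^c).
free = [v for v in range(N) if v != x and v not in boundary]
Lap = np.diag(pi) - w
f = np.zeros(N); f[x] = 1.0
f[free] = np.linalg.solve(Lap[np.ix_(free, free)],
                          -Lap[np.ix_(free, [x])].ravel())
energy = 0.5 * np.sum(w * (f[:, None] - f[None, :]) ** 2)
R_eff = 1.0 / energy                 # R(x, Lambda^c)

# Escape probability: one step from x, then hit the boundary before x;
# u := 1 - f is harmonic off {x} and the boundary, with u(x)=0, u=1 outside.
escape = (w[x] @ (1.0 - f)) / pi[x]
assert np.isclose(escape, 1.0 / (pi[x] * R_eff))
```

The assertion is the discrete version of the classical identity: the Dirichlet energy of the minimizer equals π_ω(x) times the escape probability.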
The following question should ideally be solved before tackling Problem 1.16:
Exercise 1.17. Suppose that ω_b ∈ {0, 1} for every edge b and let C_∞(ω) denote the set of vertices in Z^d that lie in an infinite self-avoiding path using only edges with ω_b = 1. Let ω′ differ from ω in a finite number of coordinates. Show that X is transient on C_∞(ω) if and only if it is transient on C_∞(ω′) and conclude that {X is transient on C_∞} is a tail event. (In particular, for Bernoulli ω_b's, it is also a zero-one event.)
There are a good number of variations on the problem depicted in Fig. 1.3, but here is one that has been particularly perplexing for a number of years, in spite of an existing solution claimed in the book of Jikov, Kozlov and Oleinik [83]. The formulation goes back to Kesten's monograph on percolation (Kesten [85]). Consider the box Λ_N := [−N, N]^d ∩ Z^d and let G_N be the set of those edges both of whose endpoints lie in the infinite bond-percolation cluster and also in Λ_N. Let R_N denote the effective resistance corresponding to the boundary conditions −N on the "left" side of the box and +N on the "right" side; no boundary condition is prescribed at the remaining portions of the boundary. It is not hard to convince oneself that R^{-1}_N is at most of order N^d, but identifying a precise rate is far more challenging:
Problem 1.18. Prove that, for almost every realization of the supercritical percolation cluster, the limit lim_{N→∞} N^{-d} R^{-1}_N exists and is independent of the realization. Characterize its value.
Of course, once this has been settled, one may want to go beyond LLN-type information and study the fluctuations. Interestingly, as observed already a while ago by Wehr [134], the variance of R^{-1}_N is at most of order N^d (at least in the elliptic setting), which suggests the following question:
Problem 1.19. Show that R^{-1}_N, centered by its mean and scaled by N^{-d/2}, converges in law to a nondegenerate Gaussian random variable.
Recently, thanks to the work of Gloria and Otto [67], we even know that the variance is actually of order N^d (at least in d ≥ 3), so the time seems ripe for resolving this problem as well.

Gradient models
The third and somewhat unexpected context in which one naturally encounters the Random Conductance Model is that of gradient fields. In our formulation, a gradient field is a collection of R-valued random variables φ_x indexed by the vertices x ∈ Z^d. We impose the following law: μ^{φ̄}_Λ(dφ) := (1/Z^{φ̄}_Λ) exp{ −Σ_{(x,y)∈B(Λ)} V(φ_y − φ_x) } Π_{x∈Λ} dφ_x Π_{x∉Λ} δ_{φ̄_x}(dφ_x). (1.34) Here Λ ⊂ Z^d is a finite set and B(Λ) is the set of all edges with at least one endpoint in Λ. The function V : R → R is the potential, which we take to be a continuous, even function with sufficient (e.g., quadratic) growth at infinity. The measure depends on the values immediately outside Λ, which are set to the boundary condition φ̄ by the product of delta-masses. Gradient models are ubiquitous in the physical sciences, where they arise as effective-interface models, with φ_x giving the height of a surface above a reference plane, or in descriptions of the fluctuation fields in critical statistical-mechanical (spin) models. A higher-dimensional variant, particularly φ_x ∈ R^d, has the interpretation of a deformation field representing the displacements of atoms in a crystal from their ideal positions. Further applications can be found in field theory and materials physics. The reviews by Giacomin [64], Velenik [132], Funaki [55] and Sheffield [122] give more information and further connections.
We will actually consider the measure (1.34) to be a law on the sigma-field of gradient events, F^∇ := σ(φ_y − φ_x : x, y ∈ Z^d), (1.35) which is legitimate since the corresponding restriction of μ^{φ̄}_Λ does not depend on the values of φ̄ but only on their differences. This restriction is dictated by practical reasons (the actual "height" of an interface is usually of lesser importance than the "shape" of its configuration) but also by technical restrictions in low spatial dimensions. We say that a measure μ on (R^{Z^d}, F^∇) is a gradient Gibbs measure (GGM) if, for every A ∈ F^∇ and any finite Λ ⊂ Z^d, μ(A) = E_μ( μ^{φ̄}_Λ(A) ), (1.36) where the expectation is over the boundary condition φ̄ distributed according to μ. Put another way, this says that the conditional probability of A given the configuration φ̄ outside Λ is exactly the measure (1.34).
Before we start discussing the relevant problems arising in this subject area, it is interesting to note two special instances of the above formalism. The first one is the d = 1 case. Let us assume that Λ is connected and, in fact, Λ := (−N, N) ∩ Z. Then the law of the gradients, η_x := φ_x − φ_{x−1}, is that of independent random variables with density proportional to e^{−V(η_x)}, conditioned on the value of Σ_x η_x dictated by the boundary condition. (1.37) This situation can be analyzed with the help of standard methods of large-deviation theory (cf., e.g., Dembo and Zeitouni [45], den Hollander [79]); in fact, Cramér's theorem more or less suffices, and so one can prove:
Exercise 1.20. Suppose d = 1 and a linear boundary condition, i.e., φ̄_x := tx for some t ∈ R. Show that, for any continuous, even potential V growing superlinearly at infinity, the law of x ↦ N^{−1/2}(φ_{⌊xN⌋} − txN), linearly interpolated into a continuous function, scales to a Brownian bridge as N → ∞. Characterize the variance at t = 0.
Another instance of special interest is that when V is quadratic, V(η) := (κ/2) η², (1.39) for some stiffness κ > 0. In this case the above measure is Gaussian and so it is amenable to explicit calculations. In fact, for (say) zero boundary condition φ̄_x := 0, one can even pass to the limit Λ ↑ Z^d, provided one restricts to the sigma-algebra of gradient events (1.35). This restriction is necessary because in dimensions d = 1, 2, the law of φ_0 is not tight in this limit. To see this in more explicit terms, note that E_{μ^0_Λ}(φ_x φ_y) = (1/(2dκ)) G_Λ(x, y), (1.40) where G_Λ(x, y) is the Green's function associated with the discrete Laplacian with Dirichlet boundary condition on ∂Λ. In probabilist's terms, G_Λ(x, y) is the expected number of visits to y by the simple random walk started at x before it exits from Λ. The classical formula G_Λ(x, x) = 1 / P^x(τ_{Λ^c} < τ_x), (1.41) see, e.g., Spitzer [126] or Lawler [93], using the notation (1.26-1.27), provides an explicit connection to the issues discussed in the previous subsection.
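The identification of the Gaussian covariance with the Green's function can be verified on a small box, with G_Λ computed both as the resolvent (1 − P_Λ)^{-1} of the killed walk and as a truncated sum of powers of the killed kernel. The normalization used below, covariance = G_Λ/(2dκ) for the walk-normalized (expected-visits) Green's function, is one consistent convention and an assumption of this sketch rather than a quotation from the text.

```python
import numpy as np

# Lambda = an n x n box in Z^2 (d = 2); simple random walk killed upon exit.
n, d, kappa = 7, 2, 1.3
idx = lambda i, j: i * n + j
N = n * n
A = np.zeros((N, N))                       # adjacency within Lambda
for i in range(n):
    for j in range(n):
        if i + 1 < n: A[idx(i,j), idx(i+1,j)] = A[idx(i+1,j), idx(i,j)] = 1
        if j + 1 < n: A[idx(i,j), idx(i,j+1)] = A[idx(i,j+1), idx(i,j)] = 1
P = A / (2 * d)               # substochastic kernel: the walk dies at the boundary

# Green's function two ways: resolvent, and expected number of visits.
G = np.linalg.inv(np.eye(N) - P)
G_series = sum(np.linalg.matrix_power(P, k) for k in range(400))
assert np.allclose(G, G_series)

# Gaussian field with energy (kappa/2) sum over edges meeting Lambda of
# (phi_y - phi_x)^2 and phi = 0 outside: precision matrix kappa*(2d*I - A).
cov = np.linalg.inv(kappa * (2 * d * np.eye(N) - A))
assert np.allclose(cov, G / (2 * d * kappa))   # covariance = G/(2d*kappa)
```

The second assertion is exact linear algebra: the precision matrix κ(2d·1 − A) factors as 2dκ(1 − P), whose inverse is G/(2dκ).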
An analogue of Exercise 1.20 in d ≥ 2 will then be the following. Consider the Gaussian gradient model with potential (1.39) with κ := 1.
For a sample of the field (φ_x) from the infinite-volume limit μ := lim_{Λ↑Z^d} μ^0_Λ, and a smooth f : R^d → R with compact support and ∫ f(x) dx = 0, define φ_ε(f) := ε^{1+d/2} Σ_{x∈Z^d} f(εx) φ_x. (1.42) Problem. Show that, as ε ↓ 0, the random variable φ_ε(f) converges in law to a centered Gaussian whose variance is a constant multiple of ⟨f, (−Δ)^{-1} f⟩ := ∫ dx f(x) (−Δ^{-1} f)(x), the computation being based on G := lim_{Λ↑Z^d} G_Λ, the infinite-volume Green's function. (The expression on the right is well-defined because f ∈ Dom(Δ^{-1}).) The problem is meaningful in all d ≥ 1 but only in d = 1 do we have a hope of describing the limit as a (real-valued) process. This is because the limiting continuum object, the Gaussian Free Field (GFF), is very rough in d ≥ 2 and, in fact, can only be interpreted in the sense of distribution theory; hence our formulation using a linear functional φ_ε in (1.42). We refer to, e.g., Sheffield [123] for more information on the tightness issues and other aspects of the GFF.
Having dealt with these instructive examples, let us move on to general potentials V. A remarkable feature of gradient models is that much of what has already been said about the quadratic case applies to any gradient model for which V is uniformly strictly convex, i.e., when V is C² with V′′ positive and uniformly bounded away from zero and infinity. (We will expound on the specifics in the discussion of dynamical environments in Section 4.4.) Unfortunately, convex potentials are not what one typically finds in models coming from realistic systems and/or applications, and so the last decade has witnessed a major push to obtain a similar level of control also for non-convex interactions. This has so far succeeded only partially because most of the existing techniques fail as soon as V is non-convex anywhere, regardless of how unlikely (or energetically unfavorable) a configuration for which this happens may be.
Notwithstanding, there is a family of models with non-convex V that can be studied by way of a connection to the Random Conductance Model. These models are defined by requiring that V be given by e^{−V(η)} := ∫ ρ(dκ) e^{−(1/2)κη²}, (1.44) where ρ is a positive measure on the positive reals. Notice that when ρ is supported at a single point, then V is quadratic, but as soon as ρ has at least two points in its support, V can be non-convex. The φ-marginal of the measure (1.34) then coincides with that of the extended measure on pairs of configurations (φ, κ), which is given by μ̃^{φ̄}_Λ(dφ, dκ) := (1/Z̃^{φ̄}_Λ) exp{ −(1/2) Σ_{(x,y)∈B(Λ)} κ_xy (φ_y − φ_x)² } Π_{x∈Λ} dφ_x Π_{x∉Λ} δ_{φ̄_x}(dφ_x) Π_{(x,y)∈B(Λ)} ρ(dκ_xy). (1.45) To see why this holds, introduce a "private" variable κ_xy = κ_yx for each edge (x, y) ∈ B(Λ) and use the additive structure of the interaction to write the exponential weight in (1.34) as the exponential weight in (1.45) integrated over the product of the ρ's. A key point is that, by regarding the κ_xy's as genuine random variables and conditioning on their values, the law of the φ's is again Gaussian, albeit now with a spatially inhomogeneous covariance structure. The above constructions can be performed in infinite volume; see Biskup and Spohn [19] for details. We will only communicate the salient conclusions: First, one can represent every gradient measure μ for the potential V in (1.44) as the φ-marginal of an extended measure ν on pairs of configurations (φ, κ) such that the following holds: (1) Conditional on the φ's, the individual κ's are independent with κ_xy having the marginal law proportional to e^{−(1/2)κ_xy(φ_x−φ_y)²} ρ(dκ_xy). (2) Conditional on the κ's, the φ's are then Gaussian with covariance given by the inverse of (the negative of) the generator of the Random Conductance Model with nearest-neighbor conductances (κ_xy). (The mean can be characterized too, but we will discuss this in the proof of Theorem 6.7.) (3) The κ-marginal is generally strongly correlated but, if the initial gradient measure is ergodic with respect to translations, then the extended measure is ergodic as well.
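The representation (1.44) is easy to play with numerically: for a two-atom ρ the potential is an explicit log-mixture of Gaussians, its non-convexity can be read off from a discrete second derivative, and the conditional law of an edge conductance κ_xy given the gradient is a two-point law. A small sketch; the particular atoms and weights below are arbitrary illustrative choices.

```python
import numpy as np

# Two-atom measure rho = p1*delta_{k1} + p2*delta_{k2}: the potential
# V(eta) = -log( p1*exp(-k1*eta^2/2) + p2*exp(-k2*eta^2/2) )
# is of the form (1.44) and fails to be convex when k1, k2 are far apart.
p1, p2, k1, k2 = 0.5, 0.5, 0.1, 10.0
V = lambda eta: -np.log(p1 * np.exp(-0.5 * k1 * eta**2)
                        + p2 * np.exp(-0.5 * k2 * eta**2))

eta = np.linspace(-6.0, 6.0, 2001)
Va = V(eta)
h = eta[1] - eta[0]
Vpp = (Va[2:] - 2 * Va[1:-1] + Va[:-2]) / h**2   # discrete V''
assert Vpp.min() < 0                             # non-convexity

# Conditional law of kappa_xy given the gradient eta = phi_x - phi_y:
# a two-point law with weights proportional to p_i * exp(-k_i*eta^2/2).
# Small gradients favor the stiff spring k2, large gradients the soft one k1.
def prob_k2(eta):
    w1 = p1 * np.exp(-0.5 * k1 * eta**2)
    w2 = p2 * np.exp(-0.5 * k2 * eta**2)
    return w2 / (w1 + w2)

assert prob_k2(0.0) == 0.5 and prob_k2(4.0) < 0.01
```

This conditional two-point law is exactly item (1) of the list above, specialized to an atomic ρ; alternating it with the Gaussian conditional law of item (2) would give a Gibbs sampler for the extended measure.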
For those familiar with the Random Cluster Model (see, e.g., the monograph by Grimmett [73]) and the Fortuin-Kasteleyn representation of the Potts model (Fortuin and Kasteleyn [60]), the above should be quite reminiscent of the so-called Edwards-Sokal coupling of these two processes (Edwards and Sokal [53]). The structure described above offers the possibility to study the gradient model with non-convex interaction of the type (1.44) by conditioning on the κ's. The proof of scaling of the gradient field to the Gaussian Free Field at large scales then boils down to solving:
Problem. Let (φ_x) be a collection of Gaussian fields with mean zero and covariance given, for any g : Z^d → R with finite support and Σ_x g(x) = 0, by E( [Σ_x g(x) φ_x]² ) = Σ_{x,y} g(x) (−L_κ)^{-1}(x, y) g(y), where (−L_κ)^{-1}(x, y), the inverse of the operator −L_κ, can equivalently be described as the full-lattice Green's function of the random walk among nearest-neighbor random conductances κ. Show that, for any ergodic law P on the κ's, the random functional φ_ε(f) in (1.42) tends to a Gaussian random variable P-a.s. Characterize its variance.
As we will see, this will become even more interesting once we start discussing gradient fields with non-vanishing tilt. Naturally, once these basic convergence issues are settled, one can turn to more subtle questions concerning, for instance, the maximum, the thick points or the level sets of the field. These problems have recently been studied for the homogeneous lattice GFF, e.g., by Bolthausen, Deuschel and Zeitouni [20], Daviaud [37], Hu, Miller and Peres [81], Schramm and Sheffield [119], and also for uniformly convex interactions (Miller [104]).

Outlook
The upshot of the above overview is that all of these problems, although quite varied in nature, can be reduced to specific properties of the Random Conductance Model. In particular, many of the solutions boil down to similar technical questions. In the rest of these notes we will attempt to explain the main ideas underlying the existing solutions and point out the known obstacles for the problems that remain unresolved.

Limit laws for the RCM
The goal of this section is to exhibit the main techniques that will allow us to establish the validity of the SLLN (Problem 1.2) and the Functional CLT (Problem 1.3) for rather general Random Conductance Models. We will take a very pedagogical approach that starts off by addressing the simplest non-trivial cases of interest while isolating, as clearly as possible, various technical issues that come up along the way.

Point of view of the particle
A first basic problem that arises in analyzing the Markov chain (X_n) for a fixed realization of the environment ω is that the increments of this chain are not stationary. A way to mend this is to invoke the first fundamental idea encountered in the theory of random walks in random environment: the point of view of the particle. Namely, instead of making a random walk run through a fixed environment, we will shift the environment around so that the walk remains always at the origin. Technically, this amounts to representing the sequence (τ_{X_n} ω) as a trajectory of a Markov chain on the space of all environments. To this end, define the kernel P acting on bounded measurable f : Ω → R by (Pf)(ω) := Σ_x P_ω(0, x) f(τ_x ω) and, assuming E π_ω(0) < ∞, the measure Q(dω) := (π_ω(0)/E π_ω(0)) P(dω). Then (τ_{X_n} ω) is a Markov chain with transition kernel P, and Q is a stationary and reversible measure for this chain.
Proof. The fact that the kernel P generates the Markov chain (τ_{X_n} ω) is a trivial calculation. For the second part, we need to invoke a bit of L²-calculus. For any two bounded measurable functions f = f(ω) and g = g(ω), define ⟨f, g⟩ := E_Q(f g). This is a natural inner product in L²(Q). To show reversibility (and thus stationarity) of Q, it suffices to show that ⟨f, Pg⟩ = ⟨Pf, g⟩ for any such bounded non-negative f, g; in fact, indicators of measurable events would be enough. For that case we compute ⟨f, Pg⟩ = (1/E π_ω(0)) Σ_x E( f(ω) ω_{0,x} g(τ_x ω) ), where all sums are meaningful by positivity of all terms and the assumption that Σ_x ω_{0,x} is integrable. Now apply τ_{−x} under the expectation to write this as (1/E π_ω(0)) Σ_x E( f(τ_{−x} ω) ω_{−x,0} g(ω) ). A key property of the environment is its symmetry (1.2) whereby we get ω_{−x,0} = ω_{0,−x}. Relabeling −x for x, we thus conclude ⟨f, Pg⟩ = (1/E π_ω(0)) Σ_x E( f(τ_x ω) ω_{0,x} g(ω) ), which is, rolling back the first rewrite, exactly ⟨Pf, g⟩.
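On a periodic environment the space of shifts is finite, so the environment chain and the measure Q can be written down explicitly and the self-adjointness of the kernel on L²(Q) checked directly. A minimal sketch on a period-n ring, illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)

# Periodic environment on Z with period n: the environment chain
# (tau_{X_k} omega) then lives on the n shifts of omega, with kernel
# K(i, j) = P_omega(i, j) and stationary law Q(i) proportional to pi_omega(i).
n = 10
w = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    w[i, j] = w[j, i] = rng.uniform(0.5, 2.0)
pi = w.sum(axis=1)
K = w / pi[:, None]
Q = pi / pi.sum()

# <f, Kg>_Q = <Kf, g>_Q: the kernel is self-adjoint on L^2(Q), hence Q is
# reversible (and thus stationary) for the environment chain.
f, g = rng.normal(size=n), rng.normal(size=n)
inner = lambda u, v: np.sum(Q * u * v)
assert np.isclose(inner(f, K @ g), inner(K @ f, g))
assert np.allclose(Q @ K, Q)
```

The computation behind the first assertion is the finite-dimensional shadow of the shift-and-symmetry argument in the proof: Q(i)K(i, j) equals the symmetric quantity ω_{ij}/Σπ_ω.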
It is not hard to check that, for any bounded f, g, ⟨f, (1 − P)g⟩ = (1/(2 E π_ω(0))) Σ_x E( ω_{0,x} [f(τ_x ω) − f(ω)] [g(τ_x ω) − g(ω)] ). This will help us solve:
Exercise. Show that the operator L := P − 1 on L²(Q) is self-adjoint and negative semi-definite.
Notice that the stationary measure Q and the a priori law P are mutually absolutely continuous; we in fact even have a very explicit expression for Q. In the study of general (non-reversible) random walks in random environments it is (usually) not too hard to infer the existence of a stationary measure, but a key obstacle is the absolute continuity of P with respect to Q, which we often need to conclude that events that occur Q-a.s. also occur P-a.s. But even in such cases it is unusual to have any sort of explicit handle on Q.
These considerations move us to the question of under what conditions the Markov chain (τ_{X_n} ω) is ergodic. In order to explain this a bit better, recall that a stationary Markov chain (Z_n) on a general state space with stationary measure π can always be embedded into a Markov shift as follows: Sample the initial state Z_0 from π and then use the Markov kernel to sample a whole forward trajectory (Z_n)_{n≥1}. If need be, also use the reversed chain to sample the entire backward trajectory (Z_n)_{n<0}. This defines, through the Kolmogorov Extension Theorem, a law μ on trajectories of the Markov chain. The canonical shift θ (which simply uses Z_n as the value of the shifted sequence at time n − 1) then defines a measure-preserving transformation.
This construction and the Birkhoff-Khinchine Ergodic Theorem imply that, for π-almost every Z_0 and almost every path of the Markov chain -- in short, for µ-almost every trajectory -- the limit of the averages (1/n) ∑_{k=0}^{n−1} f(Z_k) exists and is finite for any f ∈ L¹(π). However, we often wonder whether this limit is in fact almost surely constant -- and this will only be true for a general f if the chain is ergodic. Explicitly, the above Markov chain is ergodic if any shift-invariant measurable set of trajectories A satisfies µ(A) ∈ {0, 1}.
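For a finite-state chain this convergence is easy to watch numerically. The following minimal sketch (the 3-state kernel and the observable f are arbitrary choices made for illustration, not objects from the text) compares the Birkhoff average along a stationary trajectory with the π-mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary irreducible (hence ergodic) 3-state transition kernel.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

# Stationary measure pi: the left eigenvector of P with eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()

f = np.array([1.0, -2.0, 3.0])        # an arbitrary observable f

# Sample a stationary trajectory (Z_0 ~ pi) and form the Birkhoff average.
n = 200_000
z = rng.choice(3, p=pi)
total = 0.0
for _ in range(n):
    total += f[z]
    z = rng.choice(3, p=P[z])

birkhoff_avg = total / n
pi_mean = float(pi @ f)               # the a.s. limit for an ergodic chain
```

For an ergodic chain the limit is the constant π-mean; for a reducible chain the limit would in general depend on the starting state.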
Proposition 2.3. Suppose that: (1) P(π_ω(0) > 0) = 1 and E π_ω(0) < ∞; (2) P is irreducible in the sense that, for every x ∈ Z^d, P(P^n_ω(0, x) > 0 for some n ≥ 0) = 1; (3) P is ergodic with respect to the shifts (τ_x)_{x∈Z^d}. Then the Markov chain (τ_{X_n}ω) with initial law Q is ergodic.
Proof. The proof of this proposition is quite standard -- the result has been used at various levels of explicit detail in the literature -- although the general setting makes the use of the ergodicity of P a bit subtle. Kozlov [89] proves this by way of a functional-analytic argument; we will follow a probabilistic argument from Berger and Biskup [11]. Let A be an event on the space of trajectories (ω_n)_{n∈Z} that is shift-invariant. Explicitly, if θ is the Markov shift, (θω)_n = ω_{n+1}, we have θ^{−1}(A) = A. Let µ denote the law of the trajectories induced by the Markov chain with stationary measure Q. Our goal is to show that µ(A) ∈ {0, 1}.
The first part of the proof is the classical approximation argument that drives the proof of more or less every known zero-one law. Define the function f(ω) := E_µ(1_A | ω_0 = ω). We claim that f² = f µ-a.s. To this end, approximate A by a sequence of events A_n ∈ σ(ω_{−n}, ..., ω_n) so that µ(A Δ A_n) → 0 as n → ∞. (2.13) The shift invariance of A implies that the same holds with A_n replaced by θ^n(A_n) and by θ^{−n}(A_n).
Invoking the general fact E_µ|E_µ(g|ω_0)| ≤ E_µ|g| and applying (2.13), we thus have E_µ|f − E_µ(1_{θ^n(A_n)}|ω_0)| → 0. Similarly, replacing 1_A by 1_A 1_A and approximating the first indicator by 1_{θ^n(A_n)} and the second by 1_{θ^{−n}(A_n)}, we obtain an analogous approximation of f². The events θ^n(A_n) ∈ σ(ω_0, ..., ω_{2n}) and θ^{−n}(A_n) ∈ σ(ω_{−2n}, ..., ω_0) have only one coordinate in common and so, conditional on ω_0, they are independent. This means that f² = f holds µ-a.s.

The second step is more subtle. Indeed, we claim that f(τ_x ω) = f(ω) for all x ∈ Z^d and Q-almost every ω. To this end let us note that, by the θ-invariance of A, if ω_0 is the initial configuration of a path in A, then also ω_1 is the initial step of a path in A -- namely, the shifted path! A moment's thought shows that this implies f(ω_0) = f(ω_1) µ-a.s. and thus f(τ_{X_n}ω) = f(ω) for all n ≥ 0, for Q-a.e. ω and P⁰_ω-a.e. trajectory (X_n) of the Markov chain. The conditions on P guarantee that for P-a.e. ω, with positive probability (X_n) visits any given x and so we must have f(τ_x ω) = f(ω). The event {f = 1} is thus shift-invariant and so P(f = 1) ∈ {0, 1}, by the ergodicity of P. Then µ(A) = E_Q(f) ∈ {0, 1}, where we used that the ω_0-marginal of µ is Q and that Q ∼ P.

Vanishing speed
The conclusion of Lemma 2.1 and Proposition 2.3 can be formalized in multiple ways. E.g., we thus know that, for any f ∈ L¹(Q), (1/n) ∑_{k=0}^{n−1} f(τ_{X_k}ω) → E_Q f (2.19) for P-a.e. ω and P⁰_ω-a.e. path (X_n). But since the convergence comes from the Markov shift, we are not limited to functions of only one argument. Thus, for instance, we also know that (1/n) ∑_{k=0}^{n−1} f(τ_{X_k}ω, τ_{X_{k+1}}ω) → E_µ f(ω_0, ω_1) (2.20) for any function f = f(ω_0, ω_1) such that E_µ|f(ω_0, ω_1)| < ∞, for P-a.e. ω and P⁰_ω-a.e. path (X_n). This permits us to prove:

Theorem 2.4. Suppose P obeys the conditions of Proposition 2.3 and, in addition, E_Q E⁰_ω|X_1| < ∞. (2.21) Then for P-a.e. ω and P⁰_ω-a.e. trajectory (X_n), X_n/n → 0. (2.22)

Our key problem is to represent X_n as an additive functional of the Markov chain (τ_{X_k}ω). This can be done easily under the assumption that the environment is aperiodic, i.e., P(τ_x ω = ω) = 0 for all x ≠ 0. (2.23) (Clearly, if the environment is periodic in some direction, there is no way for the walk to "notice" its motion through it when it makes a step in that direction.) We will thus prove the theorem only in this case, leaving the periodic cases -- which for ergodic P are a.s. events -- to a (simple) Exercise afterwards. We claim that, under (2.23), the function f(ω, ω') := ∑_{x∈Z^d} x 1_{{τ_x ω = ω'}} obeys f(τ_{X_n}ω, τ_{X_{n+1}}ω) = X_{n+1} − X_n. Indeed, for almost every environment and any path of the chain, at most one of the indicators in the definition of f will be non-zero, and it is precisely the one that relates ω' to the shifted configuration ω.
We will now apply the conclusion (2.20), but to get the conclusion of the theorem we need to show that E_µ|f(ω_0, ω_1)| < ∞ and E_µ f(ω_0, ω_1) = 0. This is a matter of a straightforward calculation. First, E_µ|f(ω_0, ω_1)| = E_Q E⁰_ω|X_1|, which is finite by (2.21). Second, the absolute summability we just showed permits us to write E_µ f(ω_0, ω_1) = E_Q E⁰_ω(X_1).

To see that the last expectation vanishes, recall (2.6). The minor trouble with periodic configurations disappears if we encode the sequence of environments along with the corresponding (next) step of the walk. This is an approach that was taken in Kozlov [89]; however, the above works just as well. Indeed, we pose:

Exercise 2.5. Find a function of the (joint) environment which encodes X_n as an additive functional of two consecutive environments. Use this to conclude that (2.22) still holds for almost every path of the Markov chain over ω, regardless of whether the aperiodicity condition (2.23) holds or not.
Notice that, for a shift-invariant configuration ω, the condition (2.21) reduces exactly to the first moment condition in the SLLN. So (2.22) should generally fail once (2.21) is violated, although exact conditions under which this is true do not seem to be available. The same should apply (under a different condition) when only convergence in measure is in question.
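The conclusion of Theorem 2.4 is easy to probe numerically. The sketch below runs a one-dimensional nearest-neighbor walk among i.i.d. conductances (uniform on [1, 2] -- an arbitrary illustrative choice satisfying the moment conditions) and checks that X_n/n is close to zero:

```python
import numpy as np

rng = np.random.default_rng(1)

# i.i.d. elliptic conductances omega_{x,x+1} ~ Uniform[1,2] on a long interval.
L = 200_000
omega = rng.uniform(1.0, 2.0, size=2 * L + 1)   # edge {x, x+1} stored at x + L

def step(x):
    """One step from x: go right with probability omega_{x,x+1}/pi_omega(x)."""
    w_right = omega[x + L]
    w_left = omega[x - 1 + L]
    return x + 1 if rng.random() < w_right / (w_right + w_left) else x - 1

n = 100_000
x = 0
for _ in range(n):
    x = step(x)

speed = x / n     # X_n / n, which Theorem 2.4 asserts tends to 0
```

Since this walk is in fact diffusive, X_n/n is of order n^{−1/2}, which is what one observes.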
The following lemma, which arose in the writing of a proof in Biskup, Louidor, Rozinov and Vandenberg-Rodes [16], can sometimes be useful in applications:

Lemma 2.6. Let f ∈ L log L(Q) and suppose that P obeys assumptions (1-3) in Proposition 2.3. Then for P-a.e. ω, the averages (1/n) ∑_{k=0}^{n−1} f(τ_{X_k}ω) converge to E_Q f in L¹(P⁰_ω). (2.29) In particular, the limit exists P-a.s.
Proof. Without loss of generality assume that f ≥ 0 and recall that L log L(Q) is the space of functions f such that f log|f| ∈ L¹(Q). By Wiener's Dominated Ergodic Theorem (e.g., Petersen [116, Theorem 1.16]), these functions are distinguished by the fact that their maximal function, sup_{n≥1} (1/n) ∑_{k=0}^{n−1} f ∘ τ_{X_k}, is integrable. Since the averages converge almost surely for P-a.e. ω, the result follows by the Dominated Convergence Theorem.
A subtlety of the above statement is that, although the averages (1/n) ∑_{k=0}^{n−1} f ∘ τ_{X_k} converge almost surely and in L¹(Q ⊗ P⁰_ω), this is not enough to guarantee convergence in L¹(P⁰_ω) for P-a.e. ω. A useful step towards understanding this is solving:

Exercise 2.7. Construct a sequence of random variables Z_n such that Z_n → Z almost surely and in L¹, but such that, for some σ-algebra A, the conditional expectations E(Z_n|A) do not converge almost surely.
The above arguments are useful even for the continuous-time versions of our random walk. Indeed, we can combine Exercise 1.10 with Theorem 2.4 to solve:

Exercise 2.8. Let P be ergodic with P(π_ω(0) > 0) = 1 and Eπ_ω(0) < ∞. Then for P-a.e. ω, the VSRW does not escape to infinity -- i.e., no blow-ups occur -- in finite time.

Martingale (Functional) CLT
Once a variant of the Law of Large Numbers has been established, the next natural question is that of fluctuations. In order to discuss all aspects of this question in a reasonably pedagogical fashion, for a while we will restrict attention to a class of toy models in which the environment has the following properties:

Assumptions 2.9 (Toy-model assumptions). For some α ∈ (0, 1) and P-almost every ω: (1) ω_{x,y} = 0 unless |x − y| = 1; (2) the conductance ω_{x,x+ê_i} does not depend on the i-th coordinate of x; (3) α ≤ ω_{x,x+ê_i} ≤ 1/α for all i and all x. In other words, the environments are nearest-neighbor, elliptic, and the conductances are constant along the edges on each line of sites in Z^d.
What makes these environments special is: Lemma 2.10. Let F n := σ(X 0 , . . . , X n ). For all environments above, {X n , F n } is a martingale.
Proof. Any environment satisfying conditions (1-2) above has the property that the local drift, V(ω) := E⁰_ω(X_1) = (1/π_ω(0)) ∑_x ω_{0,x} x, identically vanishes. To see how this implies the claim, we note that, by the Markov property, the law of X_{n+1} − X_n conditional on F_n is that of X_1 under P⁰_{τ_{X_n}ω}. Hence E(X_{n+1} − X_n | F_n) = V(τ_{X_n}ω) = 0 and so X_n is a martingale.
We remark that more general (particularly, non-reversible) cases of such balanced environments have been treated by Lawler [92], Guo and Zeitouni [77] and, quite recently, Berger and Deuschel [13]. The main issue dealt with in those papers is a construction, and proper control, of an ergodic, invariant law on environments.
Returning to the setting of the Toy Models, the fact that X_n is a martingale with bounded increments immediately implies, via Azuma's inequality, Gaussian bounds on its tails. Explicitly, for any unit vector ê ∈ R^d we will have P⁰_ω(|ê·X_n| ≥ t√n) ≤ 2e^{−t²/2} for all t > 0. However, to get the desired CLT we will have to invoke a more delicate tool, which is:

Theorem 2.11 (Martingale Functional CLT). Let (M_n)_{n≥0} be an R-valued, square-integrable martingale for a filtration (F_n)_{n≥0} with M_0 = 0 such that the following conditions hold:
(LF1) (1/n) ∑_{k=0}^{n−1} E((M_{k+1} − M_k)² | F_k) → σ² in probability, for some σ² ∈ (0, ∞);
(LF2) for each ε > 0, (1/n) ∑_{k=0}^{n−1} E((M_{k+1} − M_k)² 1_{{|M_{k+1}−M_k| ≥ ε√n}} | F_k) → 0 in probability.
Then for each T > 0, the law of t ↦ M_{⌊tn⌋}/√n on C([0, T], R) tends to the Wiener measure with EB_t = 0 and EB_t² = tσ².

This is what is sometimes referred to as the "Lindeberg-Feller Functional CLT," although this is only thanks to the formulation, which is borrowed from the context of sums of independent random variables (the Lindeberg-Feller CLT, see, e.g., Durrett [51]). The result for martingales is, in this formulation, first due to Brown [23]. Derriennic [46] gave a thoughtful survey of these results; unfortunately, the full version of his paper is somewhat hard to get hold of.
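The Azuma bound quoted above is straightforward to compare against simulation. A minimal sketch using the simplest bounded-increment martingale, the simple random walk (an illustrative choice; the toy-model walks obey the same increment bound):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simple random walk: a martingale whose increments are bounded by c = 1.
n, trials = 400, 20_000
steps = rng.choice([-1, 1], size=(trials, n))
M_n = steps.sum(axis=1)

t = 2.0                                   # look at the event M_n >= t*sqrt(n)
empirical = float(np.mean(M_n >= t * np.sqrt(n)))

# Azuma: P(M_n >= t*sqrt(n)) <= exp(-t^2 / 2) when |increments| <= 1.
azuma = float(np.exp(-t**2 / 2))
```

The bound holds but is not sharp: the CLT predicts the Gaussian tail P(Z ≥ 2) ≈ 0.023, while the Azuma bound gives e^{−2} ≈ 0.135.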
A simple way to understand the scaling of the martingale paths to Brownian motion is via Skorohod embedding. Explicitly, we have:

Theorem 2.12 (Skorohod [125], Strassen [127] and Dubins [49]). Suppose (M_n)_{n≥0} is a martingale with M_0 = 0 and square-integrable increments. Then, possibly on an enlarged probability space, there are a standard Brownian motion (B_t)_{t≥0} and stopping times 0 = T_0 ≤ T_1 ≤ T_2 ≤ ... such that Law of (M_n)_{n≥0} = Law of (B_{T_n})_{n≥0}. (2.37)

The history of this result is roughly as follows: Skorohod [125] noted its validity for sums of independent random variables, Strassen [127] observed that it holds even for martingales, and Dubins [49] finessed an important technical detail whereby the construction of the stopping times can be done purely on the path-space of the Brownian motion (i.e., without reliance on additional random variables).

Fig. 2.1. An example of the Random Conductance Model satisfying Assumptions 2.9(1,2), but not the uniform ellipticity requirement in part (3). Here, for each horizontal or vertical line of edges in Z², we independently retain, resp., drop all edges with probability p, resp., 1 − p. The resulting random subgraph of Z² is almost-surely connected and the conclusion of Lemma 2.10 holds for almost every sample thereof.
Returning to the above Martingale CLT, condition (LF1) guarantees that T n /n → σ 2 which means that the time change between the martingale and the Brownian motion is asymptotically linear. The condition (LF2) ensures tightness in the space of continuous paths (i.e., the Brownian motion will not wiggle too far from the piece-wise linear path interpolating the martingale values). The Skorohod representation only applies to R-valued martingales, hence our restriction to those.
We can now finish the proof of: Proposition 2.13. For any shift-ergodic environment law P satisfying (Toy Model) Assumptions 2.9 and for P-a.e. sample from it, the law of t → X ⌊tn⌋ / √ n, linearly interpolated into a continuous path, tends to Brownian motion.
Proof. We already know that X_n is a martingale for the filtration F_n := σ(X_0, ..., X_n), so we need to verify the conditions of the above theorem. This will be done, again, by using the point of view of the particle. By the Cramér-Wold device it suffices to prove the convergence for the projections onto all vectors in R^d. Fix a unit vector ê ∈ R^d, consider the function f(ω) := E⁰_ω((ê·X_1)²) and define M_n := ê·X_n. The Markov property guarantees E((M_{k+1} − M_k)² | F_k) = f(τ_{X_k}ω), and since f is bounded and the environment is elliptic, (LF1) follows with σ² := E_Q f by (2.19). The condition (LF2) is trivially satisfied and so we have the result.
Notice the (somewhat counterintuitive) fact that we prove a CLT type of result by invoking a LLN type of result. But this is not so strange when we realize that for convergence to Brownian motion we need three things: asymptotically independent increments, their zero mean/second-moment property and their stationarity. The former two properties can be safely attributed to the use of martingales, but for the last one -and, in this setting, the most difficult one -we need to use the Ergodic Theorem and thus the machinery originally developed for the LLN.
Exercise 2.14. Consider the example of a random environment in Fig. 2.1. Show that, for almost every realization of this environment, the Martingale CLT applies. Characterize the variance of the limiting Brownian motion.

Martingale approximations and other tricks
The derivations in the preceding sections, however elegant, hinge on the crucial assumption of vanishing drift. Unfortunately, this is not something one can (or wants to) ask of a generic Random Conductance Model. Historically, this puts us somewhere in the first half of the 1980s, when the first successful attempts to address the CLT at this level of generality were made. We will follow Kipnis and Varadhan [86], where the following strategy was taken: (1) Represent X_n as the sum of a martingale and an additive functional of (a single state of) the Markov chain on environments. (2) Approximate the additive functional by a martingale with an error that can be controlled at the level of the CLT.
The first step can be achieved trivially: X_n = ∑_{k=0}^{n−1} [X_{k+1} − X_k − E(X_{k+1} − X_k | F_k)] + ∑_{k=0}^{n−1} E(X_{k+1} − X_k | F_k). The first sum on the right is clearly a martingale -- call it M_n -- while E(X_{k+1} − X_k | F_k) = V(τ_{X_k}ω) makes the second part an additive functional of the Markov chain (τ_{X_k}ω). (Note that we already know how to write X_n as an additive functional of two consecutive environments, but for the application of the Martingale Functional CLT the dependence on a single environment is much easier.) Now we need to write ∑_{k=0}^{n−1} V(τ_{X_k}ω) = M'_n + E_n, where (M'_n) is a martingale and max_{k≤n}|E_k|/√n tends to zero in probability. This can be done under proper conditions, but one then faces the (rather extreme) difficulty that M_n and M'_n are not independent.

To see how an additive functional of a Markov chain can be approximated by a martingale, consider a Markov chain on a state space Ω with transition kernel P. Suppose g: Ω → R is a function such that g ∈ Ran(id − P). In other words, we require g = (id − P)h (2.42) for some function h: Ω → R. If ω_0, ω_1, ... denote the successive states of the Markov chain, then a trick similar to the one used above yields ∑_{k=0}^{n−1} g(ω_k) = h(ω_0) − h(ω_n) + ∑_{k=1}^{n} [h(ω_k) − Ph(ω_{k−1})], and we define M'_n to be the last sum. By the Markov property, E(h(ω_k) − Ph(ω_{k−1}) | ω_0, ..., ω_{k−1}) = 0, which implies that (M'_n) is a martingale. Of course, in order to have a useful statement, we need this martingale to be properly integrable, which means that the Poisson equation (2.42) must be solved with h in, say, L². As we will comment in a minute, this may be quite a challenge to prove (and in fact, it is often too much to ask). However, such considerations are entirely unnecessary for finite-state Markov chains:

Exercise 2.15. Consider a Markov chain with a finite state space Ω and a stationary measure Q. Let g: Ω → R satisfy E_Q g = 0. Show that, for Q-a.e. initial state ω_0, the law of n^{−1/2} ∑_{k=0}^{n−1} g(ω_k) tends to a mean-zero normal random variable. Characterize its variance.
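A solution sketch for Exercise 2.15 by direct linear algebra (the reversible 3-state kernel and the observable g below are arbitrary illustrative choices): it solves the Poisson equation (2.42) and checks that the martingale-approximation variance E_Q h² − E_Q (Ph)² agrees with the covariance-series expression E_Q g² + 2∑_{k≥1} E_Q[g · P^k g].

```python
import numpy as np

# A reversible 3-state chain built from symmetric conductances c_xy
# (an arbitrary illustrative choice).
c = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])
pi = c.sum(axis=1)                    # pi(x) = sum_y c_xy
P = c / pi[:, None]                   # P(x, y) = c_xy / pi(x)
Q = pi / pi.sum()                     # stationary (reversible) measure

g = np.array([1.0, -1.0, 2.0])
g = g - Q @ g                         # center g so that E_Q g = 0

I = np.eye(3)
# Poisson equation (2.42): g = (id - P) h.  The matrix I - P is singular
# (constants lie in its kernel), but g is in its range since E_Q g = 0,
# so lstsq returns an exact solution.
h = np.linalg.lstsq(I - P, g, rcond=None)[0]

# Variance of the martingale approximation: sigma^2 = E_Q h^2 - E_Q (Ph)^2.
sigma2_mart = Q @ h**2 - Q @ (P @ h) ** 2

# Covariance series: sigma^2 = E_Q g^2 + 2 sum_{k>=1} E_Q[g P^k g], where
# s = sum_{k>=1} P^k g solves (I - P) s = P g (modulo constants, which do
# not matter since E_Q g = 0).
s = np.linalg.lstsq(I - P, P @ g, rcond=None)[0]
sigma2_series = Q @ g**2 + 2 * (Q @ (g * s))
```

That the two expressions agree is exactly the computation behind the Gordin-Lifšic argument.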
This statement is actually one of the main results of a note due to Gordin and Lifšic [69]. It will be easy to see that the result generalizes to arbitrary state spaces under the condition that g ∈ Ran(id − P) -- which we take to mean that (2.42) has a solution h ∈ L²(Q); the error terms h(ω_0) − h(ω_n) are then negligible on the scale √n. Kipnis and Varadhan [86] realized that, for reversible chains, a weaker condition suffices:

Theorem 2.16 (Kipnis-Varadhan). Let (ω_n)_{n≥0} be a stationary, reversible Markov chain with stationary measure Q and let g ∈ L²(Q) be such that sup_{n≥1} (1/n) E[(∑_{k=0}^{n−1} g(ω_k))²] < ∞. (2.47) Then the law of n^{−1/2} ∑_{k=0}^{n−1} g(ω_k) (2.45) tends to a (zero-mean, finite-variance) normal random variable. Moreover, the supremum in (2.47) equals σ²_g and the convergence extends (with the limit given by Brownian motion) even to paths (linearly) interpolating the values of t ↦ n^{−1/2} ∑_{k=0}^{⌊tn⌋} g(ω_k).

Note that the claim concerns the averaged law; no statement about a typical starting point ω_0 is made. This is one of the deficiencies we will have to address in detail when proving the quenched invariance principle in the next two sections. The original method of proof in [86] was to consider the spectral measure µ_g associated with the function g and the operator P on L²(Q). This measure has the property that, for any F ∈ L¹(µ_g), E_Q[g F(P)g] = ∫ F(λ) µ_g(dλ). (2.48) The Kipnis-Varadhan condition (2.47) can then be written as ∫ (1+λ)/(1−λ) µ_g(dλ) < ∞. (2.49) Notice that the spectrum of P, and thus the support of µ_g, is contained in [−1, 1].

Exercise 2.17. Show that if (ω_n) is a stationary Markov chain with ω_0 distributed according to Q, and g ∈ L²(Q), then the variance of (2.45) tends to the quantity in (2.49).
The spectral measure is a very interesting object in its own right due to the connection with the area of random Schrödinger operators. What is quite puzzling is that we do not have any substantive information to report on:

Problem 2.18. Describe the connection between the spectral properties of the generator L_ω of the random walk among conductances ω -- many of which, as is well known, are the same for a.e. ω -- and the generator L := P − id of the Markov chain on environments.
Let us make some remarks on how the history of the above ideas seems to have evolved. First, the idea to decompose additive functionals (of general stationary ergodic processes) into a martingale and an error is presumably due to Gordin [68] who also had the insight to characterize the objects in terms of their functional-analytic (rather than mixing) properties. Gordin and Lifšic [69] then applied this idea in the specific context of finite-state Markov chains.
The understanding that martingale approximations can be the ultimate passage to limit laws for random walks in random environment seems to have grown out of the work of Papanicolaou and Varadhan [112]; the predecessors of this work were mostly focused on periodic environments. An alternative approach based on resolvent methods was devised by Künnemann [91]. The above (Kipnis-Varadhan) Theorem 2.16 more or less closed the matter for the annealed law in reversible cases. Two natural ways to generalize Theorem 2.16 are as follows: one is to go beyond the annealed law and the other is to extend beyond reversible Markov chains. Both of these directions are far from settled and both constitute a subject of intense research.
We will expound on how to go from annealed to quenched laws in the rest of these notes. Concerning departures from reversible situations, two lines of thought are generally being followed: One approach, drawing on the functional-analytic ideas, goes by imposing (and checking) various sector conditions (e.g., Olla [110], Sethuraman, Varadhan and Yau [121], Horváth, Tóth and Vető [80]). The role of these conditions is to control the antisymmetric ("non-reversible") part of the generator by the symmetric one. Another approach goes by imposing decay-rate conditions on time-correlations (e.g., Maxwell and Woodroofe [103], Derriennic and Lin [47], Peligrad and Utev [113], Klicnarová and Volný [88], Volný [133], etc.). However, unlike in the reversible situations, it does not seem likely that a single condition will eventually cover all cases of interest.

Harmonic embedding and the corrector
Although the subject of martingale approximations is very attractive and useful, in the sequel we will adopt a different approach that emphasizes the geometric component of the problem over its analytic component. To motivate this approach, consider the explicit example of the simple random walk on the two-dimensional supercritical percolation cluster. When the local drift V(ω) is non-zero, this is because there is an odd number of neighbors of the origin and the origin thus no longer lies in the barycenter of its neighbors. The martingale defect can therefore be thought of as arising from the use of the geometric embedding of the graph from before the edges were removed.
This suggests an idea that one might instead try to look for a different, harmonic embedding for which V would trivially vanish. A moment's thought shows that such an embedding is easy to find in any finite box using a computer -- just freeze the positions on the boundary and then ask the computer to sequentially pass through all vertices and always put them at the center of mass of their (graph-theoretic) neighbors. It turns out that this procedure converges rapidly and leads to a picture as in Fig. 3.1. How such an embedding is generated without recourse to finite volume is a slightly more complicated, though not unsolvable, problem. The main new ingredient will be the reliance on homogenization theory. Here and henceforth we will make repeated use of this notion:

Definition 3.1. We will say that P obeys the "usual conditions" if it satisfies the conditions (1-3) in Proposition 2.3.
These are exactly the conditions that guarantee the existence and ergodicity of the Markov chain on the space of environments.
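The finite-volume relaxation described above -- freeze the boundary of a box, then repeatedly move every interior vertex to the barycenter of its neighbors -- can be sketched as follows. The bond dilution with retention probability 0.8 is an arbitrary stand-in for the percolation-type graph behind Fig. 3.1:

```python
import numpy as np

rng = np.random.default_rng(3)

# A small box in Z^2 with bonds kept independently with probability 0.8.
N = 10
sites = [(i, j) for i in range(N) for j in range(N)]
edges = {s: [] for s in sites}
for (i, j) in sites:
    for (di, dj) in [(1, 0), (0, 1)]:
        t = (i + di, j + dj)
        if t in edges and rng.random() < 0.8:
            edges[(i, j)].append(t)
            edges[t].append((i, j))

pos = {s: np.array(s, dtype=float) for s in sites}
boundary = {s for s in sites if 0 in s or N - 1 in s}

# Gauss-Seidel relaxation: sweep the interior, moving each vertex to the
# barycenter of its neighbors; boundary positions stay frozen.
for _ in range(500):
    for s in sites:
        if s not in boundary and edges[s]:
            pos[s] = sum(pos[t] for t in edges[s]) / len(edges[s])

# Worst distance from an interior vertex to the barycenter of its neighbors.
residual = max(
    float(np.linalg.norm(pos[s] - sum(pos[t] for t in edges[s]) / len(edges[s])))
    for s in sites if s not in boundary and edges[s]
)
```

After a few hundred sweeps every interior vertex sits (up to numerical error) at the barycenter of its neighbors, i.e., the embedding is harmonic.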

Minimizing Dirichlet energy
We begin with some motivational observations for general reversible Markov chains that will explain in more detail how Fig. 3.1 was generated. Suppose a countable set V is given along with the collection of (non-negative) conductances (ω_{xy})_{x,y∈V} subject to the restrictions (1.1-1.2). Suppose in addition the irreducibility condition: for each x, y ∈ V, there is an n ≥ 0 with P^n_ω(x, y) > 0. For a finite set A ⊂ V we then define E_A(f) to be the Dirichlet energy in A for the potential f. The following is well known:

Lemma 3.2. Given g: V \ A → R, the infimum of E_A(f) over all f that agree with g outside A (3.3) is achieved by the unique solution to the Dirichlet problem f = P_ω f on A, f = g on V \ A.

Proof. Pick x ∈ V and any function f. Let f_x be defined by f_x := f off x and f_x(x) := P_ω f(x). We claim that whenever x ∈ A, the "move" f → f_x demonstrably lowers the Dirichlet energy, E_A(f_x) ≤ E_A(f). This is seen from an identity which is proved by optimizing the left-hand side over the possible values of f(x) -- this shows that the minimum is achieved at P_ω f(x). The same computation shows that applying the averaging f(x) → P_ω f(x) keeps lowering the Dirichlet energy as long as f ≠ P_ω f somewhere on A, and so any minimizing sequence of E_A(f) in (3.3) is bounded. Reducing to subsequences if needed, we extract a limit which then obeys f = P_ω f on A and thus solves the Dirichlet problem. To see that the solution is unique, note that f = P_ω f on A implies that f cannot have (strict) local extrema inside A; in particular, we have the maximum principle. Linearity guarantees that the difference between two solutions to (3.3) solves (3.3) with g := 0. The maximum principle then ensures that the difference must be zero.
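The key monotonicity step in the proof -- replacing f(x) by P_ω f(x) cannot increase the Dirichlet energy -- is easy to verify numerically. A minimal sketch on a path graph with random positive conductances (an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(4)

# A path graph 0 - 1 - ... - 9 with random positive conductances on its edges.
n = 10
omega = rng.uniform(0.5, 2.0, size=n - 1)   # omega[x] = conductance of {x, x+1}

def dirichlet_energy(f):
    """Sum of omega_xy (f(y) - f(x))^2 over the edges of the path."""
    return float(np.sum(omega * np.diff(f) ** 2))

f = rng.normal(size=n)                       # an arbitrary starting potential

# The move f -> f_x at x = 5: replace f(5) by P_omega f(5), the
# conductance-weighted average of its neighbors (endpoints stay fixed).
x = 5
g = f.copy()
g[x] = (omega[x - 1] * f[x - 1] + omega[x] * f[x + 1]) / (omega[x - 1] + omega[x])

E_before, E_after = dirichlet_energy(f), dirichlet_energy(g)
```

Since the energy is a convex quadratic in f(x), the weighted average is the unique minimizer; any perturbation of g(x) increases the energy again.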
The above proof suggests that we could perhaps use the Dirichlet energy as a kind of measure of distance from a harmonic function. We will explore this very soon in a more general context. However, the argument also highlights a difficulty associated with attempts to "harmonize" the linear function f (x) = x in infinite volume. Indeed, the full-lattice Dirichlet energy of such an f is infinity and so the procedure does not make sense.
This problem is not unknown from other situations and it naturally leads us to a guiding principle of homogenization theory.

Fig. 3.1. The graph from Fig. 1.3 based on a harmonic deformation of the underlying graph. The electrostatic potential changes linearly in the height (more precisely, the y-coordinate) of the point. In particular, the potential at the vertex marked by the star -- originally, the origin of coordinates -- is proportional to the ratio between its distance to the top and the bottom plates.

The principle is this: Instead of trying to find the
deformation of the linear function f (x) = x that is harmonic with respect to L ω at all locations for one given ω, we will solve the problem at one specific location -namely the origin -but simultaneously for all ω. Technically, this amounts to replacing the space ℓ 2 (π ω ) associated with the Markov chain (X n ) by the space L 2 (Q) associated with the chain (τ X k ω). The advantage of working on L 2 (Q) is that, unlike π ω , the measure Q is finite.

Weyl decomposition and the corrector
To motivate the forthcoming definitions, recall that the process of substituting P_ω f(x) for f(x) applied to the function f(x) := x would replace the value x by x + E⁰_{τ_xω}(X_1). From the point of view of the particle it makes sense to shift this so that the origin of coordinates will not be moved under this action, and so we may in fact want to replace x by x + E⁰_{τ_xω}(X_1) − E⁰_ω(X_1). We are thus led to minimizing the functional ϕ ↦ E_Q E⁰_ω(|X_1 + ϕ(τ_{X_1}ω) − ϕ(ω)|²) (3.9) over all, say, local functions ϕ = ϕ(ω). Here we recall that ϕ = ϕ(ω) is said to be local if it is a bounded, continuous function of a finite number of the ω_{0,x}'s. To see how homogenization translates finite-volume quantities to functionals over the space of environments, it is instructive to solve:

Exercise 3.3. Assuming ω is a sample from an ergodic measure P, carefully check this correspondence between the finite-volume Dirichlet energies and (3.9).

For technical reasons it will be advantageous to interpret (3.9) as a quadratic form on vector fields. Let N denote the set of admissible jumps of the Markov chain, N := {x ∈ Z^d : P(ω_{0,x} > 0) > 0}. (3.11) By a vector field we will then mean a (measurable) map u: Ω × N → R^d, i.e., a vector-valued function u = u(ω, x) indexed by environments and points in N. We will always set u(ω, 0) := 0 (3.12) by definition. An example of a vector field is a potential field ∇ϕ where ∇ϕ(ω, x) := ∇_x ϕ(ω) := ϕ(τ_x ω) − ϕ(ω). Any potential field is curl-free in the sense that it obeys the cycle conditions. These conditions state that for any sequence x_0, x_1, ..., x_n := x_0 of vertices in Z^d such that x_{i+1} − x_i ∈ N for all i we have ∑_{i=0}^{n−1} u(τ_{x_i}ω, x_{i+1} − x_i) = 0. (3.13) In light of our convention (3.12), whenever N generates all of Z^d (as an additive group), this turns out to be equivalent to u(ω, x + y) = u(ω, x) + u(τ_x ω, y). (3.14) The vector fields that obey this property (for all ω) will be called shift-covariant (sometimes they are called stationary). Note that from (3.14) we automatically have u(ω, x) = −u(τ_x ω, −x).
As already alluded to, all potential fields are shift-covariant. Another example of a shift-covariant field is the position field, x(ω, z) := z. As we shall see later, the position field and the potential fields generate the vector space of all shift-covariant fields. The reason for singling out shift-covariant fields is that they correspond to gradients of lattice functions. The following exercise details this connection:

Exercise 3.4. Assume the irreducibility condition P(sup_{n≥1} P^n_ω(0, x) > 0) = 1, for all x ∈ Z^d. Show that for any shift-covariant u there is a (P-a.s.) unique function U = U(ω, x) on Z^d with U(ω, 0) = 0 (3.15) and U(ω, y) − U(ω, x) = u(τ_x ω, y − x), whenever y − x ∈ N. (3.16)

To indicate that the vector field u(ω, x) and the function U(ω, x) are related as in (3.15-3.16), we will sometimes write u = grad U or say that U is an extension of u. Next, ⟨v, w⟩ := E ∑_{x∈N} ω_{0,x} v(ω, x)·w(ω, x) (3.17) defines a natural inner product on the set of vector fields; the dot in v(ω, x)·w(ω, x) stands for the usual (Euclidean) dot product in R^d. This inner product defines a natural L²-norm; a minor technical problem -- which has often been overlooked in the literature -- is that ⟨u, u⟩ = 0 does not imply that u = 0, only that ω_{0,x} u(ω, x) = 0 for all x ∈ N. A standard approach would be to factor the space of vector fields by the equivalence relation u ∼ u' whenever ⟨u − u', u − u'⟩ = 0. However, this is unnecessary once we restrict attention to shift-covariant fields (and impose a proper non-degeneracy condition). Indeed, define the set L²_cov := {u: shift covariant, ⟨u, u⟩ < ∞} (3.18) and set ‖u‖_{L²_cov} := ⟨u, u⟩^{1/2}. It is then not too hard to solve:

Exercise 3.5. Assume the irreducibility condition P(sup_{n≥1} P^n_ω(0, x) > 0) = 1, x ∈ Z^d. If ‖u‖_{L²_cov} = 0 then u(ω, x) = 0 for all x ∈ N and P-a.e. ω.

Once the L²-structure is in place, we note that potential fields define a natural closed subspace L²_∇ of L²_cov. With this space comes the orthogonal decomposition L²_cov = L²_∇ ⊕ (L²_∇)^⊥. (3.20) It turns out that the vector fields from (L²_∇)^⊥ can be quite well characterized.
To see that explicitly, define the divergence div(ωu) by the formula div(ωu)(ω) := ∑_{x∈N} ω_{0,x} [u(ω, x) − u(τ_x ω, −x)], where the bracket simplifies to 2u(ω, x) once u is shift-covariant. Thinking of u(ω, x) as the flux from 0 to x, the first term on the right (including ω_{0,x}) corresponds to the total flux out of the origin and the second one to the flux into the origin.
Lemma 3.6. For u ∈ L²_cov, we have u ∈ (L²_∇)^⊥ if and only if div(ωu) = 0 for P-a.e. ω. In particular, if U is a function such that u = grad U, then L_ω U(ω, x) = 0 at all x and P-a.e. ω.
Proof. Pick a local function ϕ and note that ⟨u, ∇ϕ⟩ = −E(ϕ(ω) div(ωu)(ω)), where we used u ∈ L²_cov to split the second expectation into two terms and then relabeled x for −x. It follows that if ⟨u, ∇ϕ⟩ = 0 for all local functions, then div(ωu) = 0 P-a.s. and vice versa.
For u = grad U, a simple calculation shows div(ωu) = 2L_ω U. With the help of shift covariance, the condition div(ωu) = 0 then forces L_ω U(ω, ·) = 0.

Lemma 3.6 shows that the fields in (L²_∇)^⊥ are, after multiplication by ω, necessarily divergence-free -- and are thus sometimes referred to as solenoidal fields. The orthogonal decomposition (3.20) is thus an analogue of the Weyl decomposition from differential geometry. For readers familiar with basic electrostatics, the function U -- associated to a shift-covariant field u -- can be thought of as an electrostatic potential while ωu plays the role of an electric current. The fact that potential difference and current are related by way of a multiplication by ω is a demonstration of Ohm's law of electrostatics. See Doyle and Snell [48] and/or Sect. 6.
A natural next question to ask now is whether there are any solenoidal fields at all. For nearest-neighbor, constant conductances, a perfect candidate for a solenoidal field is the position field which simply assigns x(ω, x) := x. (Indeed, this function is discrete harmonic with respect to the homogeneous Laplacian on Z^d and so it obeys the conclusion of the previous lemma.) Of course, once the conductances are not constant, div(ωx) -- which equals twice the local drift V(ω) -- is generally non-zero, but one can still hope that x has a non-trivial projection onto the subspace (L²_∇)^⊥. This is all expressed in:

Proposition 3.7. Suppose P obeys the "usual conditions" and, in addition, assume that E(∑_{x∈N} ω_{0,x}|x|²) < ∞. (3.23) Then there is a function Ψ = Ψ(ω, x) defined for all x ∈ Z^d with the properties: (1) Harmonicity: L_ω Ψ(ω, x) = 0 for all x ∈ Z^d and P-a.e. ω.
(2) Shift covariance: Ψ(ω, 0) = 0 and Ψ(ω, y) − Ψ(ω, x) = Ψ(τ_x ω, y − x) whenever y − x ∈ N. (3) Square integrability: E_Q E⁰_ω|Ψ(ω, X_1)|² < ∞. In addition, for any minimizing sequence ϕ_n of the functional (3.9), we have ∇ϕ_n → χ(ω, x) in L²_cov, where χ is the corrector that is given by χ(ω, x) := lim_{n→∞} ∇ϕ_n(ω, x) (3.25) and Ψ(ω, x) := x + χ(ω, x). (3.26) The infimum of (3.9) over all ∇ϕ ∈ L²_∇ is exactly ‖Ψ‖²_{L²_cov}.

Proof. The proof could be simply started by defining Ψ via (3.26) and then checking the stated properties based on facts from the theory of abstract Hilbert spaces. However, it will be more instructive to prove some of those claims directly in the present setting. First note that the object in (3.9) can be interpreted as ‖x + ∇ϕ‖²_{L²_cov}. The condition (3.23) then guarantees that (3.9) takes a finite value for all local functions. Since it is also positive, we can pick a sequence ϕ_n for which it tends to its infimum. The parallelogram law then yields (1/2)‖∇ϕ_n − ∇ϕ_m‖²_{L²_cov} = ‖x + ∇ϕ_n‖²_{L²_cov} + ‖x + ∇ϕ_m‖²_{L²_cov} − 2‖x + ∇((ϕ_n + ϕ_m)/2)‖²_{L²_cov}. The first two terms on the right both tend to the infimum while the last term is bounded by twice the infimum. It follows that ∇ϕ_n is Cauchy in L²_cov and so it converges to a vector field that we denote by χ. This is the corrector in (3.25).
Since χ is a limit of gradients, it is shift-covariant and so it extends to a unique function on Z^d. Now we define Ψ := x + χ and note that ‖Ψ‖²_{L²_cov} is the infimum of (3.9). This implies that, for all local functions ϕ and all ǫ, ‖Ψ + ǫ∇ϕ‖²_{L²_cov} ≥ ‖Ψ‖²_{L²_cov}. Expanding the left-hand side and taking ǫ → 0 yields ⟨Ψ, ∇ϕ⟩ = 0 for all local functions, i.e., Ψ ∈ (L²_∇)^⊥. By Lemma 3.6(1), Ψ is L_ω-harmonic.

Obviously, the conditions (1-3) in the above proposition can be satisfied by Ψ := 0; it is thanks to (3.26) that this can generally be excluded. (However, we could still have that Ψ is identically zero; see Exercise 4.3.) A question might also arise whether the function Ψ is uniquely determined by the above properties. Biskup and Spohn [19] showed by fairly soft arguments that this is indeed the case. In fact, one even has a stronger statement: every u ∈ L²_cov can be written as u = AΨ + ∇ϕ with A a fixed d × d matrix and ∇ϕ ∈ L²_∇, (3.30) with AΨ(ω, x) denoting the vector whose i-th Cartesian coordinate is given by ∑_j a_{ij} ê_j·Ψ(ω, x) where A = (a_{ij}). The position function and the potential fields thus generate all shift-covariant square-integrable (R^d-valued) vector fields. (Notwithstanding, see Problem 4.18 for a very non-trivial generalization of this question.) Quastel [117] has derived a result similar to (3.30), albeit with the use of a Poincaré inequality and spectral-gap estimates.
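In d = 1 the harmonic coordinate Ψ = x + χ is explicit: its increments are Ψ(x+1) − Ψ(x) = C/ω_{x,x+1} with C the harmonic mean of the conductances -- a standard fact worth checking. The sketch below verifies the harmonicity L_ω Ψ = 0 for a periodic sample (periodicity only serves to keep the example finite; it is not part of the construction above):

```python
import numpy as np

rng = np.random.default_rng(5)

# A periodic 1D environment: omega[x] is the conductance of the edge {x, x+1}.
N = 50
omega = rng.uniform(0.5, 3.0, size=N)

# Harmonic coordinate: increments Psi(x+1) - Psi(x) = C / omega[x], with C the
# harmonic mean of the conductances, so that Psi(x + N) = Psi(x) + N and the
# corrector chi = Psi - x is periodic.
C = N / np.sum(1.0 / omega)
Psi = np.concatenate([[0.0], np.cumsum(C / omega)])   # Psi(0), ..., Psi(N)

def L_Psi(x):
    """(L_omega Psi)(x) = omega_{x-1,x}(Psi(x-1) - Psi(x)) + omega_{x,x+1}(Psi(x+1) - Psi(x))."""
    return omega[x - 1] * (Psi[x - 1] - Psi[x]) + omega[x] * (Psi[x + 1] - Psi[x])

residual = max(abs(float(L_Psi(x))) for x in range(1, N))   # harmonicity check
```

Each term of L_Psi contributes −C and +C respectively, so the residual vanishes up to rounding; the corrector here simply stretches each edge inversely to its conductance.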
It should be emphasized at this point that the above constructions have been quite standard -- albeit perhaps in different contexts and using different notations -- in various contributions dealing with homogenization theory. An application of these techniques to random walks in random environment was carried out somewhat independently in the Western school by Varadhan, Papanicolaou and coauthors and in the Russian school by Kozlov. In particular, Kozlov's well-known paper [89] contains an extended version of the Weyl decomposition of vector fields -- which he calls forms -- into the sum of a gradient field, a harmonic field and a constant field, which applies even in non-reversible situations. Apart from strong ellipticity, the main requirements for this decomposition in [89] are: (1) There is a measure Q which is invariant for the Markov chain on environments and absolutely continuous with respect to P. (2) The reciprocal value of the Radon-Nikodym derivative dQ/dP is in L¹(P). While the absolute continuity of an invariant measure is usually somewhat challenging, it is the second condition that is invariably nearly impossible to check directly in any realistic (non-reversible) situation. We note that although Kozlov's paper is known to contain inconsistencies, it puts forward a number of good ideas and is thus highly recommended reading for anyone with an interest in this subject.
The construction of the harmonic deformation can be performed rather seamlessly even in the case when π ω (x) is zero at some vertices. What we need to assume is that there is a P-a.s. unique infinite component C ∞ of vertices with π ω (x) > 0 such that the conditional measure P 0 (−) = P(−|0 ∈ C ∞ ), (3.31) with expectation denoted by E 0 , satisfies the following variant of the "usual conditions": (1') P 0 (π ω (0) > 0) = 1 (which holds trivially) and E 0 π ω (0) < ∞.
(2') P 0 is irreducible in the sense that, for every x ∈ Z d with P 0 (x ∈ C ∞ ) > 0, (3.32) (Condition (3) for the measure P is not needed for now; the translation invariance of P suffices.) Exercise 3.8. Suppose that C ∞ and P 0 are well defined and assume conditions (1') and (2') above. Suppose also (3.23). If ϕ n is any minimizing sequence of the functional, show that ∇ϕ n (ω, ·) still tends to some χ(ω, ·) in L 2 cov . Use this to define Ψ = Ψ(ω, x) with x ∈ C ∞ which is harmonic with respect to L ω .
The function Ψ constructed in this Exercise is the harmonic embedding of C ∞ that we discussed at the beginning of this section. A construction along the above lines can be found in the paper of Mathieu and Piatnitski [101] for the problem of the supercritical percolation cluster and in Biskup and Prescott [18] at the current level of generality. Berger and Biskup [11] give a construction which is based on the spectral representation method of Kipnis and Varadhan (see end of Sect. 2.4). Another way to define the corrector is suggested by: Exercise 3.9. Show that the limit exists and equals Ψ(ω, x) for P-a.e. ω.
It would be of much interest to find a solution to this problem without recourse to the functional-analytic methods discussed above.

Quenched Invariance Principle on deformed graph
Let us now turn attention back to the problem of a random walk among random conductances. A simple consequence of the above constructions is: Corollary 3.10. Suppose P satisfies the "usual conditions" and, in addition, (3.23) holds. Define M n := Ψ(ω, X n ). Then for P-a.e. ω and each T > 0, the law of the rescaled process induced by P 0 ω on the space C([0, T ], R d ) tends to the Brownian motion B t with EB t = 0 and the covariance structure determined by (3.36). Proof. By the Cramér-Wold device it suffices to prove the convergence in law for the projection of the process onto any vector λ. We will denote this projection (with some abuse of notation) also by M n := λ · Ψ(ω, X n ). The filtration is as before: F n = σ(X 0 , . . . , X n ).
First, the L ω -harmonicity of Ψ guarantees that M n is a martingale, so we just need to verify the conditions (LF1-LF2) of the Martingale Functional CLT. We will take care of both of these by considering the truncated function f K . Indeed, by property (3) in Proposition 3.7, f K ∈ L 2 (Q) for all K ≥ 0. Next, the shift-covariance of Ψ implies M k+1 − M k = λ · Ψ(τ X k ω, X k+1 − X k ) and so, by the Markov property, the left-hand side of (LF1) equals the expression (3.39) for K := 0, while the left-hand side of the expression in (LF2) is bounded by this term from above as soon as n is so large that ǫ √ n > K. Ergodicity of P with respect to translations ensures via (2.19) that the expression (3.39) tends to E Q f K (ω) as n → ∞. This verifies (LF1) with σ 2 given by the right-hand side of (3.36), and it also proves (LF2) because, thanks to the Dominated Convergence Theorem, the corresponding K → ∞ limit vanishes. The result now follows by applying Theorem 2.11.
The above argument can be pushed through even in the case when the walk is restricted to an infinite connected component C ∞ , as described above. One just needs to carefully check that the current proof of Proposition 2.3 still applies (details are spelled out in Berger and Biskup [11]). However, later arguments might be seriously hampered by the fact that P 0 is no longer shift invariant. This can be circumvented by the introduction of an induced shift. Namely, for each i = 1, . . . , d, let θ i denote the induced shift along the i-th coordinate direction. The collection of maps (θ 1 , . . . , θ d ) defines shifts which preserve P 0 and, in fact, make P 0 ergodic. To see why these are well defined and the last property is true, consider the following exercise from abstract ergodic theory: Exercise 3.11. Let (X , F , µ) be a probability space and let A ∈ F be such that µ(A) > 0. Let τ : X → X be a µ-preserving bijection and suppose that µ is ergodic with respect to τ . Let n A (x) := inf{n ≥ 1 : τ n (x) ∈ A} for each x ∈ X . Do the following: (1) Show that n A < ∞ µ-a.s.
(2) Define θ(x) := τ n A (x) (x) and show that θ maps A to itself and preserves the conditional measure µ A := µ(· | A). (3) Prove that µ A is ergodic with respect to θ.
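The induced-map construction of Exercise 3.11 can be illustrated on a finite example. The following sketch uses the standard (Kac) construction θ(x) := τ applied n A (x) times; the space (a rotation on Z/NZ, which is ergodic for the uniform measure) and the subset A are illustrative choices:

```python
# Induced transformation on a finite cyclic system (Kac's construction).
N = 12
A = {0, 3, 4, 7, 10}           # subset of positive measure
tau = lambda x: (x + 1) % N    # measure-preserving ergodic bijection of Z/NZ

def n_A(x):
    """First hitting time of A: inf{n >= 1 : tau^n(x) in A}."""
    n, y = 1, tau(x)
    while y not in A:
        n, y = n + 1, tau(y)
    return n

# induced map theta = tau^{n_A} restricted to A
theta = {x: (x + n_A(x)) % N for x in A}

# theta is a bijection of A, hence it preserves the uniform measure mu_A on A
assert sorted(theta.values()) == sorted(A)
# Kac's formula: the return times over A sweep out the whole space exactly once
assert sum(n_A(x) for x in A) == N
```

The second assertion is Kac's return-time formula specialized to the uniform measure; it is the finite-space shadow of the fact that n A < ∞ a.s. in item (1).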
We will close this section with an exercise that illustrates the above abstract setting in one situation where explicit calculations are possible.
Exercise 3.12. Suppose d = 1 and consider only nearest-neighbor conductances. Assume that P is ergodic with respect to the canonical shift on Z and suppose that E(1/ω 0,1 ) < ∞. Verify that the formula in (3.44) defines a function satisfying properties (1-3) in Proposition 3.7. Conclude that the random walk (Ψ(ω, X n )) n≥0 satisfies the (quenched) invariance principle.
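The harmonicity claimed in this Exercise can be checked numerically. The sketch below assumes the standard one-dimensional harmonic coordinate Ψ(ω, x) = Σ j=0..x−1 1/ω j,j+1 for x ≥ 1 (with the analogous negative sum for x ≤ −1); the normalizing constant only rescales Ψ and does not affect harmonicity, so it is dropped, and the conductance values are illustrative:

```python
import random

random.seed(0)
L = 50                                                     # sites -L..L
w = {j: random.uniform(0.5, 2.0) for j in range(-L, L)}    # w[j]: conductance of edge (j, j+1)

def Psi(x):
    """Harmonic coordinate (unnormalized): Psi(0) = 0, increments 1/w[j]."""
    if x >= 0:
        return sum(1.0 / w[j] for j in range(0, x))
    return -sum(1.0 / w[j] for j in range(x, 0))

# L_omega Psi(x) = w[x](Psi(x+1)-Psi(x)) + w[x-1](Psi(x-1)-Psi(x)) = 1 - 1 = 0
for x in range(-L + 1, L):
    lap = w[x] * (Psi(x + 1) - Psi(x)) + w[x - 1] * (Psi(x - 1) - Psi(x))
    assert abs(lap) < 1e-9
```

The point of the design is visible in the cancellation: each increment of Ψ is the reciprocal of the conductance of the edge it crosses, so every edge contributes exactly ±1 to the generator.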

Taming the deformation
In this section our main goal is to finish the discussion of the essential steps of the proof of the quenched invariance principle. We will do this while leaving the most technically involved part, heat-kernel estimates, to the next section. Most of the material discussed here is quite standard; a possible exception is Theorem 4.7 which has not appeared in this generality before.

Remaining issues
Let us quickly review what we have accomplished so far. First, we used the examples of the balanced environments to isolate the martingale property as the key vehicle that will get us to the CLT (Section 2.3). Then, in the situations which are not balanced, we introduced a new embedding of Z d -described by the function Ψ above -that again makes the random walk into a martingale (Proposition 3.7). On this embedding we succeeded in proving the convergence to Brownian motion (Corollary 3.10). However, two issues remained unresolved: (1) The limiting Brownian motion may be degenerate to a point. (2) The harmonic embedding may be quite distorted from the original lattice.
Although the answer to (1) is ultimately related to the answer to (2), we will first focus on (1) as it is easier. We will start by solving Exercise 3.12.
It is easy to check that the function Ψ from (3.44) is harmonic with respect to L ω ; this follows from a direct calculation. The shift-covariance is a consequence of the additive form of the expressions in (3.44), while integrability follows from an expression which is finite and positive by our assumptions. Applying the arguments in the proof of Corollary 3.10, Ψ(ω, X n ) satisfies an invariance principle with a non-degenerate limiting Brownian motion. The remainder of the Exercise is now embedded into: Proposition 4.1. Suppose the "usual conditions" and (3.23) hold and assume, in addition, condition (4.3). Then the limiting Brownian motion in Corollary 3.10 is non-degenerate.
Proof. We need to show that the right-hand side of (3.36) is bounded below by c|λ| 2 for some c > 0 and all λ ∈ R d . To this end we write the right-hand side in the form (4.5). By (3.23) and (4.3) we conclude that this exceeds c|λ| 2 for some c > 0.
Note that the same argument would apply whenever the relevant set of jump directions generates all of Z d (as an additive group). This still does not cover the case of supercritical percolation (which can nonetheless be covered by an alternate argument), so we pose this as a problem. Note that we have a pointwise bound R(0, x) ≤ 1/ω 0,x with R(0, 1) = 1/ω 0,1 in d = 1 with nearest-neighbor conductances. This suggests also: Exercise 4.3. Suppose d = 1 and let P be a measure on i.i.d. positive and nearest-neighbor conductances such that E(ω 0,1 ) < ∞ and E(1/ω 0,1 ) = ∞. Show that the infimum of (3.9) over local functions is zero. Conclude that we must have χ(ω, x) = −x.
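The series law behind the d = 1 identity R(0, 1) = 1/ω 0,1 can be verified via the standard electrical formula: ground one endpoint, inject a unit current at the other, and read off the potential from the (restricted) weighted Laplacian. A sketch with illustrative conductance values on a path:

```python
def solve(A, b):
    """Solve A v = b by Gauss-Jordan elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(m):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][m] / M[i][i] for i in range(m)]

c = [2.0, 0.5, 4.0, 1.0]        # conductances of edges (0,1), (1,2), (2,3), (3,4)
n = len(c) + 1

def R(x, y):
    """Effective resistance: ground y, inject unit current at x, read the potential at x."""
    nodes = [v for v in range(n) if v != y]       # drop the grounded node
    L = [[0.0] * (n - 1) for _ in range(n - 1)]
    for i, ci in enumerate(c):                    # restricted weighted Laplacian
        for a, b in ((i, i + 1), (i + 1, i)):
            if a != y:
                ia = nodes.index(a)
                L[ia][ia] += ci
                if b != y:
                    L[ia][nodes.index(b)] -= ci
    rhs = [1.0 if v == x else 0.0 for v in nodes]
    return solve(L, rhs)[nodes.index(x)]

# series law on the path: R(0, n-1) = sum of 1/c_i; in particular R(0, 1) = 1/c_0
assert abs(R(0, n - 1) - sum(1.0 / ci for ci in c)) < 1e-9
assert abs(R(0, 1) - 1.0 / c[0]) < 1e-9
```

On a general graph the same routine gives only the upper bound R(0, x) ≤ 1/ω 0,x quoted above, since adding parallel paths can only decrease the resistance.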
A proof of (an analogue of) Proposition 4.1 appeared in Kozlov [89] and in de Masi, Ferrari, Goldstein and Wick [43,44]. With a bit more effort one can develop a variational characterization of the inverse of the limiting covariance matrix by minimizing a (version of) Dirichlet energy over nearly linear flows (Biskup [14]). This in principle allows one to numerically approximate the covariance matrix with arbitrary precision from above and below.
Approximation arguments for the diffusion constants are at the core of the Kipnis-Varadhan approach sketched in Sect. 2.4. Caputo and Ioffe [27] studied periodized versions of the Random Conductance Model and the convergence of the effective diffusion coefficient to the infinite volume object; related work in a continuum context can be found in Owhadi [111].

Sublinearity of the corrector
Having addressed non-degeneracy of the limiting Brownian motion, we are ready to move to the second -and considerably more involved -issue. The important thing is to realize that for our purposes it would suffice to show that Ψ(ω, X k ) − X k = o(|X k |) asymptotically along a typical path of the random walk. Indeed, once we know this we can use the martingale CLT to get Ψ(ω, X k ) = O( √ n) for all k ≤ n, which then implies that also X k = O( √ n). But then we will have Ψ(ω, X k ) − X k = o( √ n) for all k ≤ n, which means that the change of embedding of the graph has a vanishing effect at the diffusive scale. A more general version of (4.7) would be to require this for all positions in the lattice, not just those visited by the path. In d = 1, this is not hard to get: Exercise 4.4. Suppose that P is an ergodic law on nearest-neighbor conductances in d = 1. Assume E(ω 0,1 ) < ∞ and E(1/ω 0,1 ) < ∞. Show that χ(ω, x) = o(|x|) as |x| → ∞, (4.8) and prove that the corresponding random walk satisfies a quenched invariance principle. (Compare also with Exercise 4.3.) However, the situation in higher dimensions is considerably more subtle. While the technical details of derivations in the paper of Kipnis and Varadhan follow a different route, their methods can be used to show that the corrector is sublinear on average (Theorem 4.5). This statement will imply the so-called Annealed Invariance Principle, sometimes also called a functional CLT in probability. We will choose to formulate this in the form of a coupling. Here we recall that, given two probability measures P and P ′ , their coupling is a probability measure Q on the product space whose first, resp., second marginal is given by P , resp., P ′ . Theorem 4.6. For P-a.e. ω there is a coupling Q 0 ω of the law of t → W t induced by P 0 ω and a Brownian motion t → B t with mean zero and covariance (3.36) so that, for each T > 0 and each ǫ > 0, E[ Q 0 ω ( sup t≤T |W t − B t | > ǫ ) ] → 0 as n → ∞. (4.11) Proof. (Sketch) First let us note that both (4.9) and (4.11) hold equivalently with expectation E Q or expectation E. (This is because Q and P are equivalent and the quantity under expectation is bounded.)
To prove (4.11), we will use the fact, implied by the Skorohod embedding, that such a coupling exists between the Brownian motion and the analogue of t → W t defined using the martingale M n := Ψ(ω, X n ). Let Z (n) t denote the expression on the right of (3.35). Then we have the corresponding coupling bound, where Q 0 ω is induced by the Skorohod embedding. As to (4.11), we note that the difference between W t and Z (n) t is controlled by the corrector along the path. Since the event on the right does not depend on the second marginal of Q 0 ω , we thus have a bound which tends to zero as n → ∞ by Theorem 4.5. Combining (4.12-4.14), the result follows.
We remark that when the supremum is dropped from (4.9), we talk about an annealed CLT. The averaging over the invariant measure Q in Theorem 4.5 is not a mere technical convenience, as the statement is not strong enough to infer (4.11) without the expectation over the environment. It actually took nearly 20 years after the Kipnis-Varadhan result before this issue was first successfully addressed and a proper quenched invariance principle proved. This was done in the work of Sidoravicius and Sznitman [124], who realized that one can get further with the help of heat-kernel estimates. However, Berger and Biskup [11] were later able to avoid the use of these in their argument for the two-dimensional supercritical percolation cluster. We will present a sketch of their argument in a slightly more general, albeit non-percolative, setting (Theorem 4.7). Before we set out to prove this, we note that there is a small technical subtlety that arises from the distinction between ergodicity and directional ergodicity. To make this distinction clearer, we invite the reader to first solve: Exercise 4.9. Construct a law P on nearest-neighbor conductances that is (jointly) ergodic with respect to translations -i.e., P(A) ∈ {0, 1} for all A with τ x (A) = A for all x ∈ Z d -but not separately ergodic in the sense that there is a set B of environments which is invariant under translations in the first coordinate direction and for which 0 < P(B) < 1.
The first two items will follow from the construction of the corrector. Recall that we are guaranteed that ∇ϕ n → χ in L 2 cov -which is a kind of weighted L 2 -space. Thanks to a bound valid for all ϕ ∈ L ∞ (P), it suffices to show that χ(·, ê) ∈ L 1 and ∇ ê ϕ n → χ(·, ê) in L 1 . (The former actually follows from the latter, but we find this order more instructive.) And, indeed, by the Cauchy-Schwarz inequality we get the required integrability, and similarly we derive an L 1 -bound on ∇ ê ϕ n − χ(·, ê) which tends to zero as n → ∞ because ∇ϕ n → χ in L 2 cov . Finally, in order to link the limit to the expectation, we also need to show that f ê is translation invariant. To that end pick another lattice direction ê ′ and note the identity supplied by translation covariance. Dividing by n, the L 1 -limit of the last two terms is zero and so from the above L 1 -inclusions we conclude that f ê (τ ê ′ ω) = f ê (ω) for P-a.e. ω. Putting all the pieces together, the claim follows.
We remark that the fact that the conditions in Lemma 4.8 are the same as in Proposition 4.1 is not a coincidence. Indeed we have: Our next goal is to boost the directional sublinearity -which we may assume for both lattice directions under the conditions of Theorem 4.7 -into a corresponding statement over a box of side n. To this end, let us say that the origin is (K, ǫ)-good in ω if for all ê ∈ {±ê i : i = 1, 2} and all n ≥ 1, (4.23) holds. A point x is then called (K, ǫ)-good in ω if 0 is (K, ǫ)-good in τ x ω. By Lemma 4.8 we know that P(0 is (K, ǫ)-good) → 1 as K → ∞.
These observations permit us to define a good grid as follows. Take the two lines {n ê i : n ∈ Z}, i = 1, 2, and add to them all vertices of the form n 1 ê 1 + n 2 ê 2 with n 1 , n 2 ∈ Z such that either n 1 ê 1 or n 2 ê 2 is (K, ǫ)-good. Call the resulting (random) set of vertices G K,ǫ (ω). Then we note the following bound on the corrector over the good grid: Proof. Let x := n 1 ê 1 + n 2 ê 2 be a vertex in G K,ǫ (ω). This means that, e.g., n 1 ê 1 is (K, ǫ)-good in ω. Since the origin is (K, ǫ)-good as well, we can write the corrector at x as a telescoping sum of increments along the two coordinate directions and bound each increment using (4.23). But |x| ∞ ≤ n implies |n 1 |, |n 2 | ≤ n and so the claim follows.
We now know how to control the corrector at the vertices of the good grid -which can be made arbitrarily dense -but we still have to worry about those in the complement thereof. An important fact is that the connected components of Z 2 \ G K,ǫ (ω) are finite and, in fact, that any such component intersecting the box [−n, n] 2 has diameter o(n). This can be justified by way of an exercise which we leave to the reader. We can now finish the proof of sublinearity of the corrector: Proof of Theorem 4.7. Pick x ∈ Z 2 \ G K,ǫ (ω) with |x| ≤ n. Let C(x) denote the component containing x. We claim that |χ(ω, x)| is bounded by the maximum of |χ(ω, ·)| over ∂C(x), up to an error of order diam C(x). This is a consequence of the L ω -harmonicity of Ψ and the maximum principle. Indeed, define the first hitting time T := inf{n ≥ 0 : X n ∉ C(x)} (4.28) of the complement of C(x). Then Ψ(ω, x) = E x ω Ψ(ω, X T ) by optional stopping, which we can rewrite as the corresponding identity for χ. To finish the argument, we recall that diam C(x) = o(n) and so we may assume that n is so large that C(x) ⊂ [−2n, 2n] 2 . In that case max z∈∂C(x) |χ(ω, z)| is bounded by the maximum from Lemma 4.12 with n replaced by 2n. We get a bound of order o(n), thus proving the claim.

Above two dimensions
The above reasoning can be boosted to cover all ergodic two-dimensional environments with a finite range of jumps that satisfy the condition E(1/ω 0,ê i ) < ∞. However, there is an inherent problem with this approach in higher dimensions; indeed, one can still define a good grid but this grid will no longer partition Z d into finite components. In an attempt to adapt the argument based on (4.29-4.30), one thus has to worry about two things: How long does it take to hit the good grid and how far will X T be from x. This can be done but (so far) only with the help of heat-kernel technology. We paraphrase a theorem from Biskup and Prescott [18]: Theorem 4.14.
Fix ω such that π ω (x) ∈ (0, ∞) for all x and suppose χ = χ(x) is a function and θ > 0 is a number such that the following holds: We remark that most of the proof of this theorem goes through even when the variable-speed random walk is replaced by the constant-speed walk (for which the bounds (4.34-4.35) may be easier to prove). This is because Ψ(X t ) is a martingale for both walks. The sole point where the variable-speed walk seems to be used is formula (5.13) on page 1338 of [18].
In an earlier work (e.g., Berger and Biskup [11, Appendix A2]) the same conclusion as given by Theorem 4.14 could be achieved -although perhaps in a less transparent way -by using full heat-kernel upper bounds of Gaussian form. The point of reducing the heat-kernel input to the statements (4.34-4.35) is that these are easier to verify than the actual heat-kernel upper bounds. We also note that Sidoravicius and Sznitman [124] have used the heat-kernel bounds mainly to control the tightness of the limiting process, while here we are using them to control the deformations of the harmonic embedding. (Tightness follows in our case from the Martingale Functional CLT.) A key input in Theorem 4.14 is the sublinearity-on-average claim which we formalize as Proposition 4.15. The proof is based on the commutative structure of Z d and a bootstrapping of the one-dimensional sublinearity established in Lemma 4.8 by induction along dimension. Recall the notion of a good grid G K,ǫ introduced (in d = 2) earlier.
The induction argument is contained in the following deterministic "pigeonhole-principle" lemma: there exists a set A ⊂ Λ n of the asserted density such that x, y ∈ A ⇒ |χ(y, ω) − χ(x, ω)| ≤ 2d( K + ǫ(2n + 1) ). (4.40) Proof. We will prove this by induction on dimension. Fix ω and, for ν ∈ {1, . . . , d}, let Λ (ν) n denote a ν-dimensional set of the above form which contains the maximum number of good sites. Note that if η is as in the statement, the required density bound holds because the ratio on the left decreases in ν.
To prove also (4.45), we pick two sites x, y ∈ A (ν+1) and let x̄ = Π(x) and ȳ = Π(y). The claim for ν then bounds the difference of χ at x̄ and ȳ, while the fact that x is a good site bounds the difference of χ at x and x̄, and similarly for the pair y and ȳ. Combining these bounds and using the triangle inequality then implies (4.45) for x and y -with, of course, 2ν replaced by 2(ν + 1).

Lemma 4.16 now implies that the corrector is sublinear on average:
Proof of Proposition 4.15. Suppose without loss of generality that δ < 8 −d , fix ǫ < δ/(32d) and note that we can choose K so large that P(0 ∈ G K,ǫ ) ≥ 1 − δ/2. By the Spatial Ergodic Theorem and ergodicity of P we thus have the corresponding density of good sites once n ≥ n 0 for some a.s. finite n 0 = n 0 (ω). We will assume that n 0 is so large that also δn > 16d( K + ǫ(2n + 1) ) (4.51) holds for all n ≥ n 0 . By Lemma 4.16, for each n ≥ n 0 there exists A n = A n (ω) ⊂ Λ n of the asserted density with (4.40) valid for all x, y ∈ A n . In particular, A n ∩ A 2n ≠ ∅ for each n ≥ n 0 . Let k 0 be the smallest integer such that 2 k0 ≥ n 0 and let us pick a site x k ∈ A 2 k ∩ A 2 k+1 for each k ≥ k 0 . The bounds (4.40) and (4.51) then give us (4.53). Choosing k 1 = k 1 (ω) ≥ k 0 so that |χ(x k0 , ω)| < δ2 k1−2 , this and (4.40) imply a corresponding bound on |χ(x k , ω)| for all k ≥ k 1 . But this means that, for n ∈ {2 k : k ≥ k 1 }, the maximum of |χ(·, ω)| over A n is at most of order δn. As δ was arbitrary, this proves (4.38) for n increasing along powers of two. A moment's thought now reveals that the same then holds for the unrestricted limit as well.
As for Theorem 4.14, we refer the reader to Biskup and Prescott [18]. It should be emphasized that, although the assumptions in all the above are those of the annealed invariance principle, we in addition require the validity of the diffusive bounds (4.34-4.35). These are by no means guaranteed for a general ergodic P, so the problem of whether the annealed and quenched invariance principles hold simultaneously remains open.
We close this subsection with a simple exercise concerning the invariance principle for the variable-speed continuous-time version of our random walk.
Exercise 4.17. Suppose the "usual assumptions" and assume that (X n ) obeys the Quenched Invariance Principle with the limiting Brownian motion having covariance (3.36). Show that the variable-speed continuous-time walk X t obeys a Quenched Invariance Principle with the limiting Brownian motion having the covariance obtained from (3.36) by replacing E Q with E. Note that the quantity on the right-hand side is closely related to the infimum of (3.9), which was used to define the corrector. The appearance of the expectation E instead of E Q is due to the fact that P is invariant for the point of view of the particle induced by the VSRW. As to the constant-speed walk, here the quenched invariance principle follows from the discrete-time case by a strong asymptotic concentration of a sum of i.i.d. exponential times.

Known results and open problems
The following sums up the principal steps in the progress towards proving the quenched invariance principle in the class of Random Conductance Models: • Strongly elliptic, ergodic P: proved by Sidoravicius and Sznitman [124].
We remark that the condition E(ω 0,e ) < ∞ is essentially necessary; indeed Barlow and Černý [6] (d ≥ 3) and Černý [30] (d = 2) proved that for i.i.d. nearest-neighbor conductances with α-stable upper tail, α < 1, the law of X nt is under proper scaling described by B W t , where B t is a Brownian motion and W t is the inverse of an independent stable subordinator with index α. In other words, the paths are still Brownian but the heavy edges introduce non-trivial trapping effects, thus rendering the time parametrization non-linear and, in fact, stochastic. We remark that in physics, the limiting process is referred to as the fractional kinetics process. An important open problem concerns the rate of convergence and quantification of errors in martingale approximations. Although optimal results are probably far from reach, interesting ideas have been developed and quantitative results derived by Mourrat [106] and Gloria and Mourrat [66]. The aforementioned work of Gloria and Otto [67] gives integrability estimates on the corrector in d ≥ 3 under strong ellipticity.
The Random Conductance Model has also been studied over base graphs other than just Z d . For instance, Caputo, Faggionato and Prescott [26] have investigated random walks over various point processes in R d . Independent studies for random walks on Voronoi/Delaunay triangulations have been announced by Buckley [24]. Ferrari, Grisi and Groisman [58] have constructed the harmonic coordinates on such triangulations by means of an interacting particle system; namely, a harness process, which is basically a full-space stochastic version of the algorithm described for the finite boxes in Sect. 3.1. The methods of Kipnis and Varadhan can be applied even to some deterministic quasiperiodic structures; see, e.g., Telcs [128], who recently established an annealed invariance principle for the simple random walk on Penrose tilings.
Although we are able to control the corrector to the level required for the quenched invariance principle, the object itself remains rather mysterious and many questions about it are open. For instance, notwithstanding what has been said at the end of Section 3.2, the following problem remains of great interest both from the perspective of probability and analysis: Problem 4.18. Is it true that a.e. realization of random conductances satisfying the "usual conditions" admits no non-constant, sublinear harmonic functions?
Recently, Benjamini, Duminil-Copin, Kozma and Yadin [9] have shown that on the supercritical percolation cluster in Z d , the space of harmonic functions with at most linear growth is exactly (d + 1)-dimensional. In particular, a typical supercritical percolation cluster supports no non-constant sublinear harmonic functions. We expect this to hold for all i.i.d. nearest-neighbor Random Conductance Models; for general environments the problem remains open.
Another open question concerns the scaling limit of the corrector: one expects that the properly rescaled corrector scales, as ǫ ↓ 0, to a Gaussian with mean zero and variance proportional to an explicitly conjectured expression. Progress in the uniformly elliptic case has been achieved in recent work of Gloria and Otto [67], who have been able to prove that the corrector is in L q (P) for all q < ∞, and thus a tight random variable, in all dimensions d ≥ 3. This settled an open problem from [11].
Another, perhaps somewhat related, question is that of the very definition of the corrector. Indeed, the corrector is defined almost surely for every ergodic law on environments P. However, as different ergodic laws are singular with respect to one another, it is not clear how to mesh the various correctors together. And yet it seems this should be possible: Problem 4.20 (Universal corrector). Consider the set of nearest-neighbor environments Ω := [a, b] B(Z d ) where 0 < a < b < ∞. Define a function χ : Ω × Z d → R d such that, for every ergodic law P on Ω, it agrees with the corrector corresponding to measure P.
We remark that this would be solved if one could find a sequence of local functions ϕ n such that ∇ϕ n → χ almost surely for every P. Note that, although one may find functions ϕ n for which the convergence takes place in L 2 cov for any given P, almost sure convergence requires reduction to subsequences which may be strongly P-dependent.
The understanding of the Markov chain permits one to consider more complicated questions. One such question concerns the typical number of points visited by the random walk in a given time. This was recently addressed by Rau [118]. Another question is the Law of the Iterated Logarithm; this was established by Duminil-Copin [50]. Next is the question of the behavior of the random walk on very thin percolation clusters. This can be studied directly in the case when p = p c where, technically speaking, the percolation cluster does not exist but one can still enforce it by conditioning. For the resulting incipient infinite cluster (IIC), Kozma and Nachmias [108] proved the Alexander-Orbach conjecture in all dimensions d ≥ 7 -modulo caveats regarding the existing level of lace-expansion technology. This conjecture, due to Alexander and Orbach [2], states that, on the IIC, the spectral dimension of the random walk equals 4/3; cf. (4.60). Notably, this is expected to be false in low spatial dimensions. Related to this would be the decay of the diffusive constant for the simple random walk on the supercritical cluster for parameter p, as p ↓ p c . Here we pose the problem of determining this rate of decay; cf. (4.61). This problem is closely related to the existence of effective conductivity which was studied in, e.g., Grimmett and Kesten [74], Chayes and Chayes [31] and Kesten's monograph [85] on percolation. See also Sect. 6. A rather convincing argument for (4.61) can be obtained by analyzing the formula (3.36) and making plausible assumptions on the structural properties of the percolation cluster. Resorting to the electrostatic interpretation, the electric current should be carried only by the backbone of the cluster -which, in the limit p ↓ p c , becomes a "net" of fractal curves. The exponent in (4.61) then comes from realizing that in d ≥ 7, these fractals have Hausdorff dimension 2 (although the relation is not so straightforward as a simple equality of these numbers).
This intuition seems to be confirmed by observations made in the physics literature; see, e.g., Schrøder and Dyre [120]. A main puzzle that remains is whether, and how exactly, the exponent 2 in (4.61) should be related to the exponent 4/3 in (4.60).
We remark that the amount of physics literature written on this and related subjects is absolutely overwhelming; just see the articles citing the review by Dyre and Schrøder [52].
Another very interesting class of applications of the above techniques is the random walk in dynamical (albeit still reversible) random environments. We will not go into details here, but let us just say that much of the Kipnis-Varadhan theory carries over to this case and so annealed limit theorems are available. However, the understanding of quenched invariance principles is far less evolved. Much can be said when the dynamics of the environment is Markovian and there is enough mixing; one can then get enough control via regeneration arguments. However, even here it is far from clear how to formulate convenient, and very general, conditions under which invariance principles can be obtained.
From the perspective of this text, one specific class of dynamical random environments is of special interest. Consider a function V : R → R which is twice continuously differentiable and define a collection of coupled diffusions dφ t (x) = − Σ y:|y−x|=1 V ′ (φ t (x) − φ t (y)) dt + √ 2 dB t (x), where B t (x) are independent standard Brownian motions. As it turns out, any gradient Gibbs measure for the potential V is stationary under this dynamics. Assuming that V is convex, and thus V ′′ ≥ 0, we can now define a random walk X = (X t ) which at time t at position X t = x takes a jump to a neighbor y at rate V ′′ (φ y (t) − φ x (t)). An attractive feature of this setting is that it permits us to analyze gradient models with convex interactions. For instance, we have a formula (4.63) for the covariance of the (static) field at two locations with respect to a gradient Gibbs measure µ in terms of the expected number of visits to x by the above random walk started at 0 -we expect this to be finite only in d ≥ 3, but other formulas exist in d = 1, 2. Obviously, this generalizes the well-known formula from the Gaussian case, which is distinguished by the fact that the random walk is not coupled to the evolution of the fields. The formula (4.63) is one instance of the Helffer-Sjöstrand random walk representation of correlation functions for the gradient model. These have been indispensable in the study of gradient models with convex interactions (e.g., Naddaf and Spencer [107], Giacomin, Olla and Spohn [65], Funaki [55], etc).
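These coupled diffusions can be sketched via an Euler-Maruyama discretization. The sketch below assumes the standard Ginzburg-Landau form dφ t (x) = −Σ y∼x V ′ (φ t (x) − φ t (y)) dt + √ 2 dB t (x); the circle geometry, the quadratic (Gaussian) potential, and the step size are illustrative choices, not part of the original text:

```python
import math, random

random.seed(1)
N, dt, steps = 16, 0.01, 2000
Vp = lambda eta: eta          # V(eta) = eta^2/2, so V'(eta) = eta (Gaussian case)
phi = [0.0] * N               # field on the discrete circle Z/NZ

for _ in range(steps):
    # drift: -sum over neighbors of V'(phi(x) - phi(y)); for quadratic V this
    # is exactly the discrete Laplacian of phi
    drift = [-(Vp(phi[x] - phi[(x - 1) % N]) + Vp(phi[x] - phi[(x + 1) % N]))
             for x in range(N)]
    phi = [phi[x] + drift[x] * dt + math.sqrt(2 * dt) * random.gauss(0.0, 1.0)
           for x in range(N)]

grads = [phi[(x + 1) % N] - phi[x] for x in range(N)]
assert abs(sum(grads)) < 1e-9          # gradients around the circle telescope to zero
assert all(math.isfinite(g) for g in grads)
```

Only the gradients of φ equilibrate; the spatial mean of the field performs a free Brownian motion, which is why gradient (rather than pointwise) Gibbs measures are the right stationary objects.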

Heat-kernel decay and failures thereof
As discussed at length in the previous section, our current strategy of the proof of the quenched invariance principle seems to generally require the use of rather precise estimates on the probability that the Markov chain moves from x to y in n steps. We emphasize that this is conceptually flawed because we seem to need a local-CLT type of result to finish a plain CLT. Notwithstanding, the study of the heat kernel is interesting in its own right. We will only review the techniques that are ultimately relevant for the applications at hand and refer to, e.g., the upcoming textbook by Kumagai [90] for a more in-depth treatment of that well-developed area.

Some general observations
To set the vocabulary straight, let us first remark that by the heat kernel one usually means the quantity q n (x, y) := P n ω (x, y) / π ω (y). (5.1) As one can expect, P n ω (x, ·) will for large n approach (a multiple of) the stationary measure π ω . So q n , being in fact the Radon-Nikodym derivative of P n ω (x, ·) with respect to π ω , is a very natural object to consider. Note that reversibility implies q n (x, y) = q n (y, x).
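The symmetry q n (x, y) = q n (y, x) is easy to check numerically on a small conductance network; a sketch with illustrative conductances on a cycle:

```python
import random

random.seed(3)
n = 6
# symmetric conductances on the cycle 0-1-...-5-0
w = [[0.0] * n for _ in range(n)]
for i in range(n):
    j = (i + 1) % n
    w[i][j] = w[j][i] = random.uniform(0.5, 2.0)

pi = [sum(row) for row in w]                       # pi_omega(x) = sum_y omega_xy
P = [[w[x][y] / pi[x] for y in range(n)] for x in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Pn = P
for _ in range(6):                                 # Pn = P^7
    Pn = matmul(Pn, P)

q = [[Pn[x][y] / pi[y] for y in range(n)] for x in range(n)]   # q_n(x,y) = P^n(x,y)/pi(y)

# detailed balance pi(x)P(x,y) = pi(y)P(y,x) propagates to P^n and forces symmetry of q
assert all(abs(q[x][y] - q[y][x]) < 1e-12 for x in range(n) for y in range(n))
assert all(abs(sum(row) - 1.0) < 1e-12 for row in Pn)
```

The same computation with a non-reversible kernel (e.g., a biased cycle) would break the symmetry, which is one way to see that reversibility is genuinely used here.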
Theorem 4.14 required in (4.35) that the return probability generally decays as n −d/2 . It turns out that, should the CLT hold, we cannot hope for a faster decay than this: Suppose that the walk satisfies a CLT with non-degenerate diffusion constant σ 2 . Assume that π ⋆ := sup x π ω (x) < ∞. Then there is c = c(d, σ 2 , π ⋆ ) > 0 such that, for n sufficiently large, P 2n ω (0, 0) ≥ cn −d/2 . Proof. We use reversibility and simple estimates to get a lower bound on P 2n ω (0, 0) in terms of the sum of P n ω (0, x) 2 over |x| ≤ √ n. The sum on the right-hand side can be further bounded using the Cauchy-Schwarz inequality. But the CLT ensures that P 0 ω (|X n | ≤ √ n) ≥ 1/2 P(|B 1 | ≤ 1/σ) for n large, where B t is the standard d-dimensional Brownian motion, and |{x : |x| ≤ √ n}| ≤ c ′ n d/2 for some c ′ = c ′ (d) < ∞.
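The elided displays in this proof sketch can be written out; the following reconstruction is consistent with the surrounding text, using reversibility in the form π ω (0)P n ω (0, x) = π ω (x)P n ω (x, 0):

```latex
\begin{aligned}
P^{2n}_\omega(0,0)
  &= \sum_{x} P^n_\omega(0,x)\,P^n_\omega(x,0)
   \;\ge\; \sum_{|x|\le\sqrt{n}} \frac{\pi_\omega(0)}{\pi_\omega(x)}\,P^n_\omega(0,x)^2
   \;\ge\; \frac{\pi_\omega(0)}{\pi^\star}\sum_{|x|\le\sqrt{n}} P^n_\omega(0,x)^2 \\
  &\ge \frac{\pi_\omega(0)}{\pi^\star}\,
       \frac{\bigl(\sum_{|x|\le\sqrt{n}} P^n_\omega(0,x)\bigr)^2}{\#\{x\colon |x|\le\sqrt{n}\}}
   \;=\; \frac{\pi_\omega(0)}{\pi^\star}\,
       \frac{P^0_\omega\bigl(|X_n|\le\sqrt{n}\bigr)^2}{\#\{x\colon |x|\le\sqrt{n}\}}
   \;\ge\; \frac{c}{n^{d/2}} .
\end{aligned}
```

The second line is the Cauchy-Schwarz step, and the last inequality uses the two facts quoted at the end of the proof.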
We remark that a general method of getting such (including "near-diagonal") lower bounds in elliptic random environments has been put forward by Nash [109] and Fabes and Stroock [57].
For reasons discussed earlier, the main technical problem is to find natural conditions on the Markov chain so that an n −d/2 upper bound can be guaranteed. This problem has been studied for over half a century, starting from proofs of regularity of elliptic PDEs with irregular coefficients (De Giorgi [40], Nash [109], Aronson [4]) and validity and consequences of Faber-Krahn, Sobolev and Nash inequalities for diffusions on manifolds and Markov chains (e.g., Varopoulos [130], Carlen, Kusuoka and Stroock [28]). A method to get off-diagonal bounds -i.e., for q n (x, y) with x ≠ y -has been put forward by Davies [39] based on the Carne-Varopoulos bound (Carne [29], Varopoulos [129]).
In the course of time it has been realized that there is a close connection between the desired upper bound and the geometric properties of the underlying state space. The key property to check is the validity of an isoperimetric inequality (Cheeger [32]) or, more generally, the character of the isoperimetric profile (Grigor'yan [70]). This connection was later transferred to the context of (discrete-space) Markov chains by Lawler and Sokal [94] and Jerrum and Sinclair [82] (invoking the isoperimetric inequality) and, later, by Lovász and Kannan [97] and Morris and Peres [99] (based on the isoperimetric profile).
We will not try to delve deeper into the details of the historical developments of the subject; instead, the reader should consult the many texts that have been written on it (e.g., by Coulhon and Grigor'yan [36], Davies [38], Kumagai [90], Montenegro and Tetali [105], Varopoulos [131], Varopoulos, Saloff-Coste and Coulhon [131], Woess [135], etc). For us the key fact is that with many Markov chains we may associate a natural graph structure - simply put an edge between any two states in the state space that have a positive transition probability of a jump from one to the other. This permits us to connect the mixing properties of the chain with the geometry of this graph.
To illustrate this on an example, consider a graph that consists of two bulky components connected only by a few edges. Clearly, it will take quite a long time to exit one component and discover the other. Naturally, one is thus led to comparing the size of a set with the size of its boundary, which is expressed very well in terms of the aforementioned isoperimetric inequalities.
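The slowdown caused by such a bottleneck can be quantified by the spectral gap of the walk. In the following sketch (a toy construction of ours, not from the text) two complete graphs joined by a single edge are compared to a single complete graph:

```python
import numpy as np

def srw_matrix(adj):
    """Transition matrix of the simple random walk on a graph with 0/1 adjacency matrix adj."""
    return adj / adj.sum(axis=1, keepdims=True)

def spectral_gap(P):
    # For a reversible chain the eigenvalues are real; the gap 1 - lambda_2 controls mixing.
    ev = np.sort(np.linalg.eigvals(P).real)
    return 1.0 - ev[-2]

m = 10
K = np.ones((m, m)) - np.eye(m)              # complete graph on m vertices

dumbbell = np.zeros((2 * m, 2 * m))          # two complete graphs ...
dumbbell[:m, :m] = K
dumbbell[m:, m:] = K
dumbbell[m - 1, m] = dumbbell[m, m - 1] = 1  # ... joined by a single bridging edge

print(spectral_gap(srw_matrix(K)))           # order one
print(spectral_gap(srw_matrix(dumbbell)))    # much smaller: the walk mixes slowly
```

The gap of the dumbbell is smaller by orders of magnitude, in line with Cheeger-type bounds that sandwich the gap between $\Phi^2/2$ and $2\Phi$, with $\Phi$ the bottleneck (surface-to-volume) ratio.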
In what follows we will rely on a result from a recent work by Morris and Peres [99] which we find particularly attractive for its probabilistic flavor. Consider a countable-state Markov chain with state space $V$, transition kernel $P$ and a stationary reversible measure $\pi$. For a finite set $A \subset V$, we will measure the boundary via
$$Q(A, A^c) := \sum_{x \in A}\sum_{y \notin A} \pi(x) P(x,y)$$
and define
$$\phi(r) := \inf\Bigl\{ \frac{Q(A, A^c)}{\pi(A)} : A \subset V \text{ finite},\ \pi(A) \le r \Bigr\}, \qquad (5.7)$$
which expresses the least possible surface-to-volume ratio for all sets with volume less than $r$. We can call this function the isoperimetric profile. Its computation is often facilitated by the following fact:

Exercise 5.2. Show that in (5.7) we can restrict to $A$ that are connected - in the sense that for every $x, y \in A$ there is a time $n$ with $P^n(x,y) > 0$.
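To get a feel for the profile, consider the simple random walk on $\mathbb Z^2$ with $\pi(x) := \deg(x) = 4$, so that $\pi(x)P(x,y) = 1$ for each edge. The ratio in (5.7) evaluated on an $L \times L$ box is then exactly $1/L = |A|^{-1/2}$, so boxes alone force $\phi(r) \le c\, r^{-1/2}$. A small script checking this (our own illustration):

```python
# Cheeger ratio Q(A, A^c) / pi(A) for the SRW on Z^2, evaluated on L x L boxes.
# Here pi(x) = 4 (the degree) and pi(x) P(x, y) = 1 per edge, so the ratio is
# |boundary edges| / (4 |A|).

def box_ratio(L):
    A = {(x, y) for x in range(L) for y in range(L)}
    boundary = sum(1 for (x, y) in A
                   for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                   if nb not in A)
    return boundary / (4 * len(A))

for L in (4, 16, 64):
    print(L, box_ratio(L), 1 / L)   # the ratio equals 1/L = |A|^{-1/2} exactly
```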
We now quote Theorem 2 of [99]:

Theorem 5.3 (Morris and Peres). Suppose $P(x,x) \ge \gamma$ for all $x \in V$ and some $\gamma > 0$. Then for all $\epsilon > 0$, all $x, y \in V$ and all $n$ satisfying
$$n \ge 1 + \frac{(1-\gamma)^2}{\gamma^2} \int_{4[\pi(x) \wedge \pi(y)]}^{4/\epsilon} \frac{4}{u\,\phi(u)^2}\, du \qquad (5.8)$$
we have
$$P^n(x,y) \le \epsilon\, \pi(y).$$

The restriction to uniformly positive holding probability, $P(x,x) \ge \gamma$, is a technical nuisance in applications that often requires analyzing a modified chain that has this property.

Heat kernel on supercritical percolation cluster
It is quite instructive to check how Theorem 5.3 implies the usual bound for the simple random walk and/or elliptic nearest-neighbor environments. However, we will instead do something far less trivial; namely, we will show how this theorem applies in the case of the random walk on the supercritical percolation cluster.
Theorem 5.4. Suppose $d \ge 2$ and $p > p_c(d)$. There is a constant $c = c(d,p) < \infty$ and a random variable $n_0 = n_0(\omega)$ such that, for almost every sample $\omega$ of bond percolation with the infinite cluster $C_\infty$ containing the origin,
$$P_\omega^n(0,0) \le c\, n^{-d/2}, \qquad n \ge n_0(\omega). \qquad (5.10)$$

For a finite set $A \subset C_\infty(\omega)$, let $\partial_\omega A$ denote the set of open edges in $\omega$ with exactly one endpoint in $A$. A simple observation yields
$$\frac{Q(A, A^c)}{\pi(A)} \ge \frac{|\partial_\omega A|}{2d\,|A|}. \qquad (5.12)$$
If $\omega_b := 1$ for all edges $b$, then $\partial_\omega A = \partial A$. In such circumstances, one has the isoperimetric inequality of the form: there is a constant $c = c(d) > 0$ such that
$$|\partial A| \ge c\,|A|^{\frac{d-1}{d}} \quad \text{for every finite } A \subset \mathbb Z^d. \qquad (5.13)$$
This inequality cannot hold on $C_\infty$ because the infinite component contains arbitrarily long one-dimensional (and other) pieces. However, we can have this for connected sets that are not too small compared to their distance to the origin:

Proposition 5.5. Let $d \ge 2$ and $p > p_c(d)$. There are constants $c_1 = c_1(d,p)$ and $c_2 = c_2(d,p)$ and an a.s. finite random variable $R_0 = R_0(\omega)$ such that for each $R \ge R_0$ and each $\omega$-connected $A \subset C_\infty \cap [-R,R]^d$ satisfying $|A| \ge c_2 (\log R)^{d/(d-1)}$ we have
$$|\partial_\omega A| \ge c_1 |A|^{\frac{d-1}{d}}. \qquad (5.14)$$

There have been a number of proofs of this and/or related results, see e.g. Benjamini and Mossel [10], Heicklen and Hoffman [78], Mathieu and Remy [102], Barlow [5], Berger-Biskup-Hoffman-Kozma [12], Pete [115]. We will not prove this claim here for all $p > p_c(d)$ as the proof uses non-trivial facts from percolation theory. However, for $p$ very close to 1 there is a much simpler argument due to Benjamini and Mossel:

Exercise 5.6. Show that once $p$ is sufficiently close to one, there are a constant $c_1 \in (0,\infty)$ and an a.s. finite random variable $R_0 = R_0(\omega)$ for which the conclusion (5.14) holds for all $R \ge R_0$.

Note that from here we immediately have (5.14) and thus, via (5.12), a lower bound on the surface-to-volume ratio of such sets $A$. In order to see how (5.14) feeds into Theorem 5.3, note that the Markov chain started at the origin will not leave the box $[-2n, 2n]^d$ by time $2n$. Thus set $R := 2n+1$, pick $\theta \in (0, 1/2)$ and, for $A \subset [-R,R]^d \cap \mathbb Z^d$ connected, estimate the ratio in the definition of $\phi(r)$ by $c|A|^{-1/d}$ when $|A| \ge R^\theta$ and by $cR^{-\theta}$ when $|A| \le R^\theta$. (In the second step we used that $|\partial_\omega A| \ge 1$.) It follows that
$$\phi(r) \ge c\bigl( r^{-1/d} \wedge R^{-\theta} \bigr).$$
Plugging this into (5.8), the integral is at most $cR^{2\theta}\log R + c\epsilon^{-2/d}$. This will be less than $n$ for $\epsilon := cn^{-d/2}$.
The inequality (5.10) then follows by applying Theorem 5.3. A natural consequence of Theorem 5.4 is the result that was first proved by Grimmett, Kesten and Zhang [75] by rather different methods (see also Problem 1.16):

Corollary 5.7. The simple random walk on (a.e. realization of) the supercritical percolation cluster is recurrent in dimension d = 2 and transient in dimensions d ≥ 3.
Proof. As explained in Sect. 1.3, it suffices to resolve the d = 3 case, but we can cover all d's just as well. From Lemma 5.1 and Theorem 5.4 we know that $P_\omega^{2n}(0,0) \asymp n^{-d/2}$. This is summable in dimensions d ≥ 3 and non-summable in d = 1, 2. The summability is in turn equivalent to the finiteness of the full-lattice Green's function which, via (1.41), is equivalent to transience.

Anomalous decay
From the perspective of nearest-neighbor random walks on Z^d, the case of supercritical percolation is a prototype of a non-elliptic situation. However, when we think of this walk as the simple random walk on the graph C ∞, it is as elliptic as one can ever hope for. Indeed, every edge in the graph C ∞ has conductance one and the ellipticity contrast - the difference between the maximal and minimal possible value of the conductance over the edges - is zero. The difficulty in understanding this walk on C ∞ is thus not the lack of ellipticity but the intricacies of its random geometry.
From this point of view it is natural to ask what happens when ellipticity gets violated in a robust way. This naturally leads to the consideration of i.i.d. nearest-neighbor environments where the law of the individual conductances permits values arbitrarily close to zero, or arbitrarily large, or both. The point is that both situations can lead to trapping effects, although each of them for a slightly different reason. We will henceforth focus on the former case and refer to Barlow and Deuschel [7] for the latter.
Suppose, from now on, that the ω's are nearest-neighbor, i.i.d. with
$$\mathbb P(0 < \omega_b \le 1) = 1 \quad\text{and}\quad \operatorname{essinf}(\omega_b) = 0. \qquad (5.17)$$
Our assumption implies that $\mathbb P(\omega_b \in \cdot)$ has no atom at zero. Thus all nearest-neighbor jumps on Z^d are allowed for the random walk, but some of them may be very unlikely.
It is easy to check that for i.i.d. distributions with these properties, the isoperimetry methods sketched above yield a vacuous conclusion. The situation becomes even more suspicious after an inspection of the work of Fontes and Mathieu [59], in which they design a family of models - not with i.i.d. conductances but close enough - in which the expected diagonal heat kernels, $\mathbb E\,P_\omega^{2n}(0,0)$, decay arbitrarily slowly with n. Of course, this could be just a result of taking an average over the environment (remember that we are talking about events whose probabilities decay to zero) so one is naturally intrigued by what the typical (quenched) decay of $P_\omega^{2n}(0,0)$ might be. It will not be too surprising that in d = 1 the trapping can be quite severe even for typical ω. Indeed, the following is an interesting exercise:

Exercise 5.8. Show that in d = 1 the conductance law can be chosen so that $P_\omega^{2n}(0,0)$ decays much more slowly than $n^{-1/2}$ for n large, along a deterministic subsequence $n_k \to \infty$.
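The d = 1 trapping mechanism is easy to observe numerically. In the following toy example (our own, with deterministic rather than i.i.d. conductances) the origin sits on an edge of conductance one that is reachable only through edges of conductance $10^{-4}$; the walk then rattles across the strong edge and the return probability stays near one for times of order the inverse weak conductance, while the homogeneous chain decays like $n^{-1/2}$:

```python
import numpy as np

def return_probs(ws, steps, start):
    """P^n(start, start) for the walk among conductances ws on a path graph."""
    N = len(ws) + 1                      # vertices 0..N-1; edge i joins i and i+1
    P = np.zeros((N, N))
    for i in range(N):
        left = ws[i - 1] if i > 0 else 0.0
        right = ws[i] if i < N - 1 else 0.0
        tot = left + right
        if i > 0:
            P[i, i - 1] = left / tot
        if i < N - 1:
            P[i, i + 1] = right / tot
    dist = np.zeros(N)
    dist[start] = 1.0
    out = []
    for _ in range(steps):
        dist = dist @ P
        out.append(dist[start])
    return out

L = 100
uniform = [1.0] * (2 * L)                # homogeneous environment
trap = [1e-4] * (2 * L)                  # weak edges everywhere ...
trap[L] = 1.0                            # ... except one strong edge at the start

n = 400
print(return_probs(uniform, n, L)[-1])   # roughly n^{-1/2} decay
print(return_probs(trap, n, L)[-1])      # stays near one: the walk rattles on the strong edge
```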
A moment's thought - and a right idea - then shows that interesting new behavior may actually occur in high-enough dimensions. Consider the following example from the paper of Berger, Biskup, Hoffman and Kozma [12]: Fix a sequence $\lambda_n \to \infty$ and define a trap of order n to be a configuration in which an edge of conductance one is attached to the rest of the lattice only by edges of conductance $1/\lambda_n$; once the walk enters such a trap, it rattles across the strong edge for a time of order $\lambda_n$. Writing $\ell_n$ for the distance from the origin to the nearest trap of order n, one just beefs up the lower tail of P so that, along a deterministic subsequence $n_k \to \infty$, we have $\ell_n = o(\log \lambda_n)$. We then have a proof of:

Theorem 5.9. Suppose d ≥ 5. For each $\lambda_n \to \infty$ there exists an i.i.d. conductance law P satisfying $\mathbb P(0 < \omega_b \le 1) = 1$, a deterministic sequence $n_k \to \infty$ and a P-a.s. positive random variable C = C(ω) > 0 such that for each $n \in \{n_k\}$,
$$P_\omega^{2n}(0,0) \ge \frac{C(\omega)}{\lambda_n\, n^2}. \qquad (5.20)$$

Notice that the above argument yields a similar bound in all dimensions d ≥ 2, but this bound has no significant value in dimensions d = 2, 3, 4 since (by the CLT proved by Mathieu [100] and, independently, Biskup and Prescott [18]) $P_\omega^{2n}(0,0)$ decays no faster than $n^{-d/2}$; cf. Lemma 5.1. But in d ≥ 5 this shows that the heat kernel may decay more slowly than $n^{-d/2}$ and, in particular, there is no way that a diffusive heat-kernel upper bound would generally hold.
An interesting question is whether (5.20) is the worst one can do. The answer turns out to be, more or less, in the affirmative:

Theorem 5.10. For any nearest-neighbor, i.i.d. conductance law P with $\mathbb P(0 < \omega_{0,\hat e} \le 1) = 1$ there is a random variable C = C(ω) < ∞ such that
$$P_\omega^{2n}(0,0) \le C(\omega) \begin{cases} n^{-d/2}, & d = 2, 3, \\ n^{-2}\log n, & d = 4, \\ n^{-2}, & d \ge 5. \end{cases} \qquad (5.22)$$
In addition, in d ≥ 5 we have
$$P_\omega^{2n}(0,0) = o(n^{-2}), \qquad n \to \infty. \qquad (5.23)$$
All except (5.23) in this result is due to Berger, Biskup, Hoffman and Kozma [12]; the property (5.23) was derived only recently in Biskup, Louidor, Rozinov and Vandenberg-Rodes [16]. The latter group has also shown that, in many cases where the heat kernel decays subdiffusively, the trapping phenomenon described in the example above actually occurs: the path spends n − o(n) of its time in a very small spatial region.
Notice that (5.20) and (5.22) nicely complement each other: anything up to, but no worse than, $o(n^{-2})$ decay can occur in d ≥ 5. A question remained whether the log n factor in d = 4 is an artifact of the proof or a real phenomenon. This was solved recently by Biskup and Boukhadra [15] who constructed, for each sequence $\lambda_n \to \infty$, an environment such that
$$P_\omega^{2n}(0,0) \ge \frac{\log n}{n^2\, \lambda_n} \qquad (5.24)$$
eventually, along a deterministic subsequence $n_k \to \infty$. The construction is quite involved because in d = 4 the trapping occurs more or less equally likely over a whole range of exponentially growing spatial scales (hence the log n factor).

Conclusions
The upshot of the above results and derivations is that with the random conductance models we find ourselves in a somewhat unusual situation where the path distribution satisfies a non-degenerate functional CLT and yet the heat kernel decays anomalously; i.e., we have a CLT without a local CLT. Although this may contradict intuition, there is nothing wrong with it: a CLT is a statement about the bulk of the distribution while a local CLT is a statement about its tails. There is no particular reason why these should match one another.

Applications
In this section we will try to address some aspects of the applications that were introduced in the first section of these notes. Specifically, we will discuss homogenization of discrete parabolic (random) problems, scaling limit of associated Green's functions, convergence of random Gaussian gradient models to Gaussian Free Field and, finally, applications to electrostatics.

Some homogenization theory
The phrase "homogenization theory" usually refers to a diverse set of methods and ideas that address one of the fundamental problems of material science: the computation of macroscopic material constants and characteristics (e.g., heat or electric conductivity, resistivity, etc) from microscopic properties. One of the typical mathematical issues resolved by homogenization theory concerns differential equations: although the microscopic quantities evolve according to a differential equation with rapidly varying coefficients, properly rescaled macroscopic versions thereof are governed by equations with smooth coefficients. We will not go into the subject and history of homogenization theory in any further detail; these can be found in the literature, e.g., in the monograph by Jikov, Kozlov and Oleinik [83]. Instead, we will attempt to demonstrate the conclusions on an example of heat conduction.
Suppose that some material with a rapidly varying microscopic internal structure - described at the lattice level of spacing ε by a configuration of conductances ω - is put in a macroscopic temperature profile at time 0. At the lattice level, the evolution of the temperature profile with time is described by the Cauchy problem
$$\frac{\partial}{\partial t}\, u(t,x) = (L_\omega u)(t,x), \qquad u(0,x) = f(x), \qquad (6.1)$$
where $L_\omega$ is the operator (1.13) (acting only on the x coordinate) that represents the microscopic diffusive properties of the material and f is the initial temperature profile. Our first question concerns the existence and uniqueness of the solution. We note the classical fact:

Theorem 6.1. Suppose ω is a sample from an ergodic measure P with $\mathbb E\,\pi_\omega(0) < \infty$. Let $X_t$ denote the variable-speed continuous-time Markov chain on Z^d with generator $L_\omega$ and let $E_\omega^x$ denote expectation with respect to its law started at x. Pick $f \colon \mathbb Z^d \to \mathbb R$ bounded. Then
$$u(t,x) := E_\omega^x\bigl[f(X_t)\bigr] \qquad (6.2)$$
is the unique solution to (6.1) which is bounded in both t and x.
Proof. By Exercise 2.8 and the general theory expounded in, e.g., Liggett [95], the conditions on ω guarantee that a stochastic solution to the backward Kolmogorov equations (1.14-1.15) exists and the semigroup for the VSRW is well defined. The fact that (6.2) is a solution is then a consequence of a direct calculation. Indeed, we have
$$u(t,x) = \sum_z \mathrm P_\omega^t(x,z)\, f(z),$$
where $\mathrm P_\omega^t$ is the transition kernel of the VSRW, and the boundedness of f and finiteness of $\pi_\omega$ permit us to exchange the sum over z with the time derivative and with $L_\omega$. Hence, u satisfies (6.1).

The remaining issue is thus a proof of uniqueness among bounded solutions. Let $\tilde u(t,x)$ be such a solution and, for 0 ≤ s ≤ t, consider the random variable
$$M_s := \tilde u(t - s, X_s)$$
and let $\mathcal F_s := \sigma(X_r \colon 0 \le r \le s)$. Then $\{M_s, \mathcal F_s\}_{0 \le s \le t}$ is a martingale. Indeed, by the Markov property, on the event $\{X_s = z\}$ we have
$$\frac{d}{ds'}\, E\bigl(M_{s'} \,\big|\, \mathcal F_s\bigr)\Big|_{s' = s} = -\frac{\partial \tilde u}{\partial t}(t-s, z) + \bigl(L_\omega \tilde u(t-s, \cdot)\bigr)(z) = 0$$
almost surely for every s. Integrating over finite time intervals and applying the Bounded Convergence Theorem proves that $\{M_s, \mathcal F_s\}_{0 \le s \le t}$ is a martingale. (At s = t we apply continuity from the left.) The Optional Stopping Theorem then yields
$$\tilde u(t,x) = E_\omega^x[M_0] = E_\omega^x[M_t] = E_\omega^x\bigl[\tilde u(0, X_t)\bigr] = E_\omega^x\bigl[f(X_t)\bigr] = u(t,x).$$
The uniqueness is proved as well.
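On a finite state space, the formula (6.2) is simply $u(t,\cdot) = e^{tL_\omega} f$, and the Cauchy problem (6.1) can be verified by a finite difference in t. A sketch on a small ring, with conductances of our own choosing:

```python
import numpy as np

# VSRW generator on a ring of N sites: (L f)(x) = sum_{y ~ x} w_xy (f(y) - f(x)).
rng = np.random.default_rng(1)
N = 8
w = rng.uniform(0.5, 2.0, size=N)              # conductance of edge (i, i+1 mod N)
L = np.zeros((N, N))
for i in range(N):
    L[i, (i + 1) % N] = w[i]
    L[i, (i - 1) % N] = w[i - 1]
    L[i, i] = -(w[i] + w[i - 1])

def heat_semigroup(t):
    # L is symmetric here, so e^{tL} can be built from its eigendecomposition.
    vals, vecs = np.linalg.eigh(L)
    return vecs @ np.diag(np.exp(t * vals)) @ vecs.T

f = rng.standard_normal(N)                     # initial temperature profile
t, h = 0.7, 1e-5
u = heat_semigroup(t) @ f                      # u(t, .) = E^x f(X_t)

# Check the Cauchy problem: d/dt u = L u, via a central difference in t.
dudt = (heat_semigroup(t + h) @ f - heat_semigroup(t - h) @ f) / (2 * h)
print(np.allclose(dudt, L @ u, atol=1e-6))     # True
```

The rows of $e^{tL}$ sum to one (the chain conserves probability), which is the discrete counterpart of the boundedness assertion in the theorem.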
Exercise 6.2. Construct a configuration of nearest-neighbor conductances on Z for which there is a non-zero solution to (6.1) with u(0, ·) := 0.
Our next goal is to describe the asymptotics of the solution in the situation when f is a macroscopic profile over a lattice of spacing ε. Fix a function $f \colon \mathbb R^d \to \mathbb R$ in $L^1_{\mathrm{loc}}(dx)$ and let $u^{(\epsilon)}(t,x)$ denote the unique bounded solution to (6.1) with initial data
$$u^{(\epsilon)}(0,x) := f(\epsilon x), \qquad x \in \mathbb Z^d.$$
Under the diffusive scaling of space and time, we get the quantity
$$u_\epsilon(t,x) := u^{(\epsilon)}\bigl(t\epsilon^{-2}, \lfloor x\epsilon^{-1}\rfloor\bigr), \qquad t \ge 0,\ x \in \mathbb R^d. \qquad (6.10)$$

Theorem 6.3. Suppose $f \colon \mathbb R^d \to \mathbb R$ obeys $\|f\|^2_{L^2(\mathbb R^d)} + \|\nabla f\|^2_{L^2(\mathbb R^d)} < \infty$ and let P be a law on the conductances satisfying the "usual conditions" (3.23) (plus the assumptions guaranteeing the Annealed CLT). Then for each t ≥ 0,
$$u_\epsilon(t,\cdot) \;\underset{\epsilon \downarrow 0}{\longrightarrow}\; \bar u(t,\cdot) \quad\text{in } L^2(dx) \otimes L^2(\mathbb P),$$
where $\bar u$ solves the continuum Cauchy problem $\partial_t \bar u = Q\bar u$, $\bar u(0,\cdot) = f$, with Q the generator of the limiting Brownian motion.

Proof. By translation-invariance of P and the Cauchy-Schwarz inequality, the claim reduces to a bound of the form (6.15), where $x_\epsilon(z) := \epsilon\lfloor x\epsilon^{-1}\rfloor + \epsilon z$. Our first step is to replace $x_\epsilon(z)$ by x in the argument of the first f on the right-hand side. The difference tends to zero when ε ↓ 0 because translations act continuously on $L^2(\mathbb R^d)$, with a modulus controlled by $\|\nabla f\|_{L^2}$. To control the remaining difference, we note that, by the Annealed CLT (in analogy with Corollary 4.6), there exists a coupling $Q_\omega^0$ of the random walk and the Brownian motion such that, for any δ > 0 and any t > 0,
$$Q_\omega^0\bigl( |\epsilon X_{t\epsilon^{-2}} - B_t| > \delta \bigr) \;\underset{\epsilon \downarrow 0}{\longrightarrow}\; 0.$$
Picking an arbitrary δ > 0, the bound

M. Biskup
then shows that the expectation on the left tends to zero as ε ↓ 0 (followed by δ ↓ 0). The proof is then finished by noting the bound (6.19), i.e., that the squared $L^2(dx)$-distance of $u_\epsilon(t,\cdot)$ and $\bar u(t,\cdot)$ is at most the left-hand side of (6.18), as implied by using Cauchy-Schwarz one last time.

Theorem 6.3 exemplifies a statement in homogenization theory. Indeed, a solution to the parabolic problem with rapidly varying coefficients does behave, at large scales, as a solution to a parabolic problem with constant coefficients. As is seen from Exercise 4.17, the coefficients in the equation, namely, the entries of the symmetric, positive semi-definite matrix $(q_{ij})$ in
$$(Qf)(x) = \sum_{i,j=1}^{d} q_{ij}\, \frac{\partial^2 f}{\partial x_i\, \partial x_j}(x), \qquad (6.20)$$
are given by
$$q_{ij} = \mathbb E\Bigl( \sum_{k=1}^{d} \omega_{0,\hat e_k}\, \Psi_i(\omega, \hat e_k)\, \Psi_j(\omega, \hat e_k) \Bigr), \qquad (6.21)$$
where $\Psi(\omega, x)$ is the harmonic coordinate discussed at length in Section 3. Notice that these are characterized by a variational problem,
$$\lambda \cdot q\lambda = \inf_{\varphi}\ \mathbb E\Bigl( \sum_{k=1}^{d} \omega_{0,\hat e_k} \bigl[ \lambda_k + \varphi(\tau_{\hat e_k}\omega) - \varphi(\omega) \bigr]^2 \Bigr), \qquad (6.22)$$
where $\lambda = (\lambda_1, \dots, \lambda_d) \in \mathbb R^d$ and where $\varphi \colon \Omega \to \mathbb R$ runs over all local functions. This is the same variational problem that defines the corrector. This is the desired formula that, at least in principle, allows us to compute the material coefficients from the material's microscopic properties.
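In d = 1 the variational problem can be evaluated in closed form and yields the classical answer: the homogenized coefficient is the harmonic mean $(\mathbb E[1/\omega_b])^{-1}$, i.e., the conductances act as resistors in series. A numerical check of the series formula (our own setup, with uniformly distributed conductances) via the discrete Dirichlet problem:

```python
import numpy as np

# N resistors in series have effective conductance (sum of 1/w_i)^{-1}; per unit
# length this is the harmonic mean of the w_i, the d = 1 homogenized coefficient.
# We verify by solving L_w u = 0 with u(0) = 0, u(N) = 1 and reading off the current.
rng = np.random.default_rng(2)
N = 200
w = rng.uniform(0.1, 1.0, size=N)        # conductance of edge (i, i+1)

# Kirchhoff's equations for the interior potentials u_1 .. u_{N-1}.
A = np.zeros((N - 1, N - 1))
b = np.zeros(N - 1)
for i in range(1, N):
    A[i - 1, i - 1] = w[i - 1] + w[i]
    if i > 1:
        A[i - 1, i - 2] = -w[i - 1]
    if i < N - 1:
        A[i - 1, i] = -w[i]
b[-1] = w[N - 1] * 1.0                   # boundary value u(N) = 1
u = np.linalg.solve(A, b)

current = w[0] * (u[0] - 0.0)            # current through the first edge
print(current, 1.0 / np.sum(1.0 / w))    # both equal the series-conductance formula
```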

Green's functions and gradient fields
The arguments in the previous section can be cast in a more symmetric form provided we are willing to invoke some functional analysis. Given an operator O on $\ell^2(\mathbb Z^d)$ with matrix coefficients $O(x,y) := \langle \delta_x, O\delta_y\rangle$, we can interpret it as an operator on $L^2(\mathbb R^d)$ by way of the quadratic form
$$\langle g, Of\rangle_{L^2(\mathbb R^d)} := \int\!\!\int O\bigl(\lfloor x\rfloor, \lfloor y\rfloor\bigr)\, f(x)\, g(y)\, dx\, dy. \qquad (6.23)$$
For any $f \in L^2(\mathbb R^d)$ define
$$f_\epsilon(x) := \epsilon^{d/2+1} f(x\epsilon). \qquad (6.24)$$
In this notation, the statement of Theorem 6.3 implies:

Corollary 6.4. For any smooth functions $f, g \colon \mathbb R^d \to \mathbb R$ of compact support,
$$\epsilon^{-2}\,\bigl\langle g_\epsilon,\, e^{t\epsilon^{-2} L_\omega} f_\epsilon \bigr\rangle \;\underset{\epsilon \downarrow 0}{\longrightarrow}\; \bigl\langle g,\, e^{tQ} f \bigr\rangle_{L^2(\mathbb R^d)}, \quad\text{in } L^2(\mathbb P). \qquad (6.25)$$

Proof. Just note that, in the notation of Theorem 6.3, $\epsilon^{-2}\langle g_\epsilon, e^{t\epsilon^{-2}L_\omega} f_\epsilon\rangle = \langle g, u_\epsilon(t,\cdot)\rangle$ while $\langle g, e^{tQ} f\rangle = \langle g, \bar u(t,\cdot)\rangle$. These tend to each other because $u_\epsilon(t,\cdot) \to \bar u(t,\cdot)$ in $L^2(dx) \otimes L^2(\mathbb P)$.
Corollary 6.4 supplies the core idea underlying the proof of our next result:

Theorem 6.5. Consider any ergodic law P on nearest-neighbor elliptic conductances and pick any $f, g \colon \mathbb R^d \to \mathbb R$ that are smooth and of compact support. In d = 1, 2 assume in addition that the integrals of f and g over $\mathbb R^d$ vanish. Then
$$\bigl\langle g_\epsilon,\, (-L_\omega)^{-1} f_\epsilon \bigr\rangle \;\underset{\epsilon \downarrow 0}{\longrightarrow}\; \bigl\langle g,\, (-Q)^{-1} f \bigr\rangle_{L^2(\mathbb R^d)} \quad\text{in } L^2(\mathbb P). \qquad (6.26)$$

Proof. (Sketch) We only sketch the main ideas; details for this setting can be found in the work of Biskup and Spohn [19]. All inner products will be those in $L^2(\mathbb R^d)$ so we will not make this notationally explicit. First let us note that both inner products are well defined. Indeed, $-L_\omega$ is self-adjoint and positive semi-definite with empty kernel (in $\ell^2(\mathbb Z^d)$). Moreover, it is invertible on all functions of finite support in Z^d subject to - in dimensions d = 1, 2 - the condition of a vanishing total sum. Uniform ellipticity gives us the following inequality between norms:
$$\bigl\langle f, (-L_\omega)^{-1} f \bigr\rangle \le c\, \bigl\langle f, (-\Delta)^{-1} f \bigr\rangle,$$
where ∆ is a continuum Laplacian and where the passage from the discrete to the continuum Laplacian is due to [19, Lemma 2.2]. As is not hard to check, replacing f by $f_\epsilon$ on the left and using that $\|f_\epsilon\|^2 = \epsilon^2 \|f\|^2$ while $\langle f_\epsilon, (-\Delta)^{-1} f_\epsilon\rangle = \langle f, (-\Delta)^{-1} f\rangle$, the bound still holds for all ε ≤ 1. The family in (6.26) is thus uniformly bounded.
By the polarization identity, it suffices to prove the claim for g := f. To this end we note the representation
$$\bigl\langle f_\epsilon, (-L_\omega)^{-1} f_\epsilon \bigr\rangle = \int_0^\infty \bigl\langle f_\epsilon,\, e^{s L_\omega} f_\epsilon \bigr\rangle\, ds = \int_0^\infty \epsilon^{-2}\, \bigl\langle f_\epsilon,\, e^{t\epsilon^{-2} L_\omega} f_\epsilon \bigr\rangle\, dt, \qquad (6.28)$$
where we scaled t by $\epsilon^{-2}$ in the second line. By Corollary 6.4, the integrand on the right-hand side tends to $\langle f, e^{tQ} f\rangle$ so, ignoring the important issue whether we are able to interchange the limit and the integral, we should have
$$\bigl\langle f_\epsilon, (-L_\omega)^{-1} f_\epsilon \bigr\rangle \;\underset{\epsilon \downarrow 0}{\longrightarrow}\; \int_0^\infty \bigl\langle f,\, e^{tQ} f \bigr\rangle\, dt.$$
The right-hand side is again bounded by the fact that Q is uniformly elliptic, and it equals the term $\langle f, (-Q)^{-1} f\rangle$. The key technical point of the proof is thus the control of the tails of the integral in (6.28). This is a non-trivial problem where we will have to invoke, once again, heat-kernel estimates. This is easier in dimensions d ≥ 3 where it suffices to invoke the result of Delmotte [41]:
$$\mathrm P_\omega^t(x,y) \le \frac{c}{t^{d/2}}\, e^{-\frac{|x-y|^2}{c\,t}}, \qquad x, y \in \mathbb Z^d,\ t \ge 0, \qquad (6.30)$$
with some constant c ∈ (0,∞), uniformly in ω - subject to the strong-ellipticity condition. For $f \in L^1(dx)$ this yields
$$\epsilon^{-2}\, \bigl\langle f_\epsilon,\, e^{t\epsilon^{-2} L_\omega} f_\epsilon \bigr\rangle \le \frac{c}{t^{d/2}}\, \|f\|^2_{L^1(dx)}. \qquad (6.31)$$
This is integrable on t ∈ (1, ∞), uniformly in ε, in all dimensions d ≥ 3.
In dimensions d = 1, 2 one needs a corresponding bound on the gradient of the heat kernel. Such bounds were proved in the annealed setting by Delmotte and Deuschel [42]. See Corollary 4.3 in [19] for details.
The above conclusions permit a statement on the random Gaussian field introduced in Problem 1.22. Indeed, let $(\phi_x)$ be a sample from the Gaussian measure with zero mean and covariance $(-L_\kappa)^{-1}$, for a collection of nearest-neighbor elliptic conductances κ. Recall the notation $\phi_\epsilon(f)$ from (1.42). Then we have:

Corollary 6.6. Suppose f is smooth with compact support and (in d = 1, 2) of zero total integral. As ε ↓ 0, the law of $\phi_\epsilon(f)$ tends to that of a Gaussian with mean zero and limiting variance
$$\operatorname{Var} \phi_\epsilon(f) \;\underset{\epsilon \downarrow 0}{\longrightarrow}\; \bigl\langle f,\, (-Q)^{-1} f \bigr\rangle_{L^2(\mathbb R^d)} \qquad (6.32)$$
in P-probability, where Q is the generator of the limiting Brownian motion.
Proof. As $\phi_\epsilon(f)$ is Gaussian, it suffices to prove the convergence of the variances, i.e., (6.32). This is (6.26) in disguise.
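The identity behind the proof, $\operatorname{Var}\phi(f) = \langle f, (-L_\kappa)^{-1} f\rangle$, can be checked by direct sampling on a finite chain. The sketch below (our own setup: Dirichlet boundary conditions and conductances drawn uniformly) compares the empirical variance with the exact quadratic form:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 30
kappa = rng.uniform(0.5, 2.0, size=N + 1)    # conductances of the N + 1 edges of a path

# Dirichlet operator -L_kappa on interior vertices 1..N (field pinned to 0 outside).
A = np.zeros((N, N))
for i in range(1, N + 1):
    A[i - 1, i - 1] = kappa[i - 1] + kappa[i]
    if i > 1:
        A[i - 1, i - 2] = -kappa[i - 1]
    if i < N:
        A[i - 1, i] = -kappa[i]

cov = np.linalg.inv(A)                       # covariance (-L_kappa)^{-1}
f = rng.standard_normal(N)                   # a test function

# Sample phi ~ N(0, cov) and compare the empirical variance of phi(f) = <f, phi>
# with the exact value <f, (-L_kappa)^{-1} f>.
samples = rng.multivariate_normal(np.zeros(N), cov, size=50000) @ f
print(samples.var(), f @ cov @ f)            # agree to within sampling error
```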
The key point is that, since the limit is non-random, the same remains true even if the law of the φ's is further averaged over κ. This permits the main conclusion of the paper of Biskup and Spohn [19], which we rephrase as follows:

Theorem 6.7 (Scaling to GFF). Suppose V is as in (1.44) with ̺ compactly supported in (0, ∞). Let µ be a gradient Gibbs measure for the potential V which we assume to be ergodic with respect to the translations of Z^d and to have zero tilt. Then for every $f \in \mathrm{Dom}((-\Delta)^{-1/2})$, the law of $\phi_\epsilon(f)$ tends to that of a Gaussian with mean zero and variance
$$\bigl\langle f,\, (-Q)^{-1} f \bigr\rangle_{L^2(\mathbb R^d)},$$
where $Q^{-1}$ is the inverse of the operator (6.20).
We note that the above Gaussian field with random (ergodic) covariance structure has been (probably first introduced and) studied by Caputo and Ioffe [27, Section 4.5]. Their motivation was to provide a link between the derivative of the exponential rate function for changing the tilt of the field - the so-called surface tension - and the diffusivity of the corresponding random walk among random conductances. For the above Gaussian case, this link is verified by a direct calculation, but for general uniformly-convex interactions - for which one still has a random-walk representation (Naddaf and Spencer [107], Giacomin, Olla and Spohn [65]) - it remains conjectural despite serious effort.
Passing to m → ∞ we thus construct a minimizer $\varphi_\omega$ on Z^d with $E(\varphi_\omega) < \infty$. By adding small perturbations, we find that $\varphi_\omega$ solves (6.37). The identity (6.40) and the fact that $E(\varphi) = 0$ only for constants then show that $\varphi_\omega$ with a prescribed value at one lattice site is unique.
Having settled the questions of existence and uniqueness, let us now investigate what happens when we scale the lattice to have spacing ε and scale the charge density so as to maintain a fixed macroscopic profile. As we will see, the following is just a rewrite of results proved earlier: (This is well defined as $f \in \mathrm{Dom}((-Q)^{-1})$.)