Full-path localization of directed polymers

Certain polymer models are known to exhibit path localization in the sense that at low temperatures, the average fractional overlap of two independent samples from the Gibbs measure is bounded away from $0$. Nevertheless, the question of where along the path this overlap takes place has remained unaddressed. In this article, we prove that on linear scales, overlap occurs along the entire length of the polymer. Namely, we consider time intervals of length $\varepsilon N$, where $\varepsilon>0$ is fixed but arbitrarily small. We then identify a constant number of distinguished trajectories such that the Gibbs measure is concentrated on paths having, with one of these distinguished paths, a fixed positive overlap simultaneously in every such interval. This result is obtained in all dimensions for a Gaussian random environment by using a recent non-local result as a key input.


Introduction
In statistical physics, the phenomenon of localization refers to the tendency of disordered systems, especially at low temperatures, to settle into one of a small number of energetically favorable states, even as the size of the system diverges. Beginning with Anderson's formative work [2], it has been a general goal to describe conditions (e.g. the presence of random impurities, random interaction strengths, or random geometry) under which localization occurs. Often, for a given model, a main challenge is to characterize free energy non-analyticities (which may already be difficult to detect rigorously) as separators between non-localized and localized phases. If this can be done, it gives rise to the task of more precisely quantifying the system's behavior in either regime, the localized phase being more physically anomalous and thus harder to predict. This paper focuses on such questions for directed polymers in random environment. Defined in the next section, this model was introduced by Huse and Henley [22] to study interfaces of the Ising model subject to random impurities, and later adopted in the mathematics literature by Imbrie and Spencer [23] as a model for polymer growth in random media. At low temperatures, directed polymers exhibit localization properties which have been a frequent object of study over the last forty years; a nearly complete survey is provided in the book of Comets [12], and related models are discussed in [19].
Most of the literature on localization has focused on the polymer's endpoint distribution, but very recently there has been progress in proving pathwise localization.
For certain random environments at sufficiently low temperature, it is now known that if two polymers are sampled independently under the same environment, then with non-vanishing probability they will intersect for a non-vanishing fraction of their length. However, owing to the global nature of this property, the results to date provide little information on the local structure needed to produce this effect; for instance, where these intersections occur. The main purpose of this paper is to provide a first result in this direction, stated as Theorem 1.3 in Section 1.3.
Interestingly, the central input to the proof is a recent non-local path localization result from [7]. This plan of attack is natural from the perspective of random walks (from which polymers are defined), whose structure of i.i.d. increments frequently allows one to translate between local and global information. For directed polymers, however, there is no obvious renewal feature to function in the same way. Fortunately, we identify as a weak surrogate a multi-temperature free energy expression (4.3) that permits one to analyze isolated segments of the polymer. This technique is summarized in Section 1.4 and may be of independent interest.
1.1. The model: directed polymers in Gaussian environment. Let $\sigma = (\sigma_i)_{i\ge0}$ denote simple random walk on $\mathbb{Z}^d$ starting at the origin. We will write $P$ to denote the law of $\sigma$ in the space
$$\Sigma := \{(\sigma_i)_{i\ge0} : \sigma_0 = 0,\ \|\sigma_i - \sigma_{i-1}\|_1 = 1 \text{ for all } i \ge 1\}, \quad (1.1)$$
equipped with the standard cylindrical sigma-algebra. Expectation with respect to $P$ will be denoted by $E(\cdot)$. Next let $g = (g(i,x) : i \ge 1, x \in \mathbb{Z}^d)$ be a collection of i.i.d. standard normal random variables, supported on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Expectation according to $\mathbb{P}$ will be denoted by $\mathbb{E}(\cdot)$. The infinite collection $g$ is called the disorder or random environment, and defines a family of Hamiltonians on $\Sigma$,
$$H_N(\sigma) := \sum_{i=1}^N g(i, \sigma_i), \quad N \ge 1.$$
At inverse temperature $\beta \ge 0$, the associated Gibbsian polymer measure is given by
$$\mu_{N,\beta}(\mathrm{d}\sigma) := \frac{1}{Z_N(\beta)}\, e^{\beta H_N(\sigma)}\, P(\mathrm{d}\sigma), \quad (1.2)$$
where $Z_N(\beta) := E(e^{\beta H_N(\sigma)})$ is the random normalization constant known as the partition function. As a function of $N$, the partition function grows exponentially with a limiting rate
$$p(\beta) := \lim_{N\to\infty} \frac{1}{N}\, \mathbb{E} \log Z_N(\beta) \quad (1.3)$$
called the free energy. Moreover, Gaussian concentration gives, for any $u > 0$,
$$\mathbb{P}\Big(\frac{1}{N}\big|\log Z_N(\beta) - \mathbb{E}\log Z_N(\beta)\big| \ge u\Big) \le 2e^{-Nu^2/(2\beta^2)}. \quad (1.4)$$
Consequently, the following limit holds for every $\beta \ge 0$:
$$\lim_{N\to\infty} \frac{\log Z_N(\beta)}{N} = p(\beta) \quad \text{$\mathbb{P}$-a.s. and in $L^\alpha(\mathbb{P})$ for all $\alpha \in [1,\infty)$.} \quad (1.5)$$
Given this paper's methods, the following observation will help avoid some technical concerns.
Remark 1.1. A priori, the validity of (1.5) might depend on the fact that the random variables defining $H_N(\cdot)$ also appear in $H_M(\cdot)$ for $M \ge N$. On the contrary, because of (1.4), the statement (1.5) is still true if one takes
$$H_N(\sigma) := \sum_{i=1}^N g_N(i, \sigma_i), \quad (1.6)$$
where the collections $(g_N(i,x) : i \ge 1, x \in \mathbb{Z}^d)$, $N \ge 1$, consist of i.i.d. standard normal variables that are independent even across different values of $N$. Henceforth, we will take (1.6) as the definition of $H_N$. The distribution of $\mu_{N,\beta}$ does not change; only the joint law of $(\mu_{N,\beta})_{N\ge1}$ is affected, and we will not be concerned with the latter object.
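To make these definitions concrete, the following toy computation (our own illustration, not part of the paper; all function names and the tiny system size are ours) evaluates $Z_N(\beta) = E(e^{\beta H_N(\sigma)})$ in $d = 1$ by brute-force enumeration of all $2^N$ walk paths:

```python
import itertools
import math
import random

def partition_function(N, beta, g):
    """Z_N(beta) = E[exp(beta * H_N(sigma))], averaging over all 2^N
    one-dimensional simple random walk paths started at the origin.
    Here g maps (i, x) to the environment value g(i, x), and
    H_N(sigma) = sum_{i=1}^N g(i, sigma_i)."""
    total = 0.0
    for steps in itertools.product((-1, 1), repeat=N):
        pos, H = 0, 0.0
        for i, s in enumerate(steps, start=1):
            pos += s
            H += g[(i, pos)]
        total += math.exp(beta * H)
    return total / 2**N

random.seed(0)
N = 10
# i.i.d. standard normal environment on every site the walk can reach
g = {(i, x): random.gauss(0.0, 1.0)
     for i in range(1, N + 1) for x in range(-i, i + 1)}

assert partition_function(N, 0.0, g) == 1.0  # at beta = 0, Z_N(0) = 1
assert partition_function(N, 0.5, g) > 0.0
```

At $\beta = 0$ the Gibbs measure (1.2) reduces to simple random walk and $Z_N(0) = 1$, which the first assertion checks.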
We will be interested in the relationship between $p(\beta)$ and the overlap function
$$R(\sigma^1, \sigma^2) := \frac{1}{N} \sum_{i=1}^N \mathbb{1}\{\sigma^1_i = \sigma^2_i\},$$
where the dependence of $R(\cdot,\cdot)$ on $N$ is understood. The degree to which the model localizes can be measured by the typical size of $R(\sigma^1,\sigma^2)$ when $\sigma^1$ and $\sigma^2$ are sampled independently from $\mu_{N,\beta}$. For instance, if $\beta = 0$, then $\mu_{N,0}$ returns the simple random walk $P$, and classical results give the overlap's rate of decay: the expected overlap of two independent walks is of order $N^{-1/2}$ when $d = 1$, of order $(\log N)/N$ when $d = 2$, and of order $N^{-1}$ when $d \ge 3$. Considering that $R(\sigma^1,\sigma^2) \to 0$ as $N \to \infty$ in any one of these cases, it is a striking fact that when disorder is introduced at sufficiently large $\beta > 0$, this overlap remains bounded away from $0$ (in various senses made precise in Section 1.2). As suggested earlier, the free energy provides an understanding of this dichotomy as a phase transition between high and low temperatures. In the following statements, the function $\beta^2/2$ appears because it is the logarithmic moment generating function of the standard normal distribution.
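For intuition about the $\beta = 0$ baseline just described, here is a short Monte Carlo sketch (our own illustration; all names and sample sizes are ours) estimating the expected overlap of two independent one-dimensional walks:

```python
import random

def overlap(s1, s2):
    """Fractional overlap R(s1, s2) = (1/N) * #{i : s1_i = s2_i}."""
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)

def srw_path(N, rng):
    """One-dimensional simple random walk path (sigma_1, ..., sigma_N)."""
    pos, path = 0, []
    for _ in range(N):
        pos += rng.choice((-1, 1))
        path.append(pos)
    return path

def mean_overlap(N, samples, rng):
    """Monte Carlo estimate of E[R] for two independent walks of length N."""
    return sum(overlap(srw_path(N, rng), srw_path(N, rng))
               for _ in range(samples)) / samples

rng = random.Random(1)
# In d = 1 the expected overlap decays like N**-0.5, so the estimate
# at N = 10000 should be substantially smaller than at N = 100.
assert mean_overlap(10000, 200, rng) < mean_overlap(100, 200, rng)
```

This only illustrates the disorder-free decay; the point of the paper is that at low temperature the overlap under $\mu_{N,\beta}$ does not decay at all.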
It is known that $p(\beta) \le \beta^2/2$ for all $\beta \ge 0$ (by Jensen's inequality, the quenched free energy is at most its annealed counterpart), and the critical inverse temperature is defined as $\beta_c := \sup\{\beta \ge 0 : p(\beta) = \beta^2/2\}$. The high temperature phase $0 \le \beta < \beta_c$ is thought to indicate a polymer measure still resembling simple random walk; a result to this effect is [17, Thm. 1.2]. On the other hand, in the low temperature phase $\beta > \beta_c$, the polymer measure is expected to be so attracted by favorable regions in the random environment that it concentrates near them. The question then is how to relate the condition $p(\beta) < \beta^2/2$ to this localization, as measured by the overlap function.
Remark 1.2. The function $\log Z_N(\cdot)$ is a logarithmic moment generating function and thus convex. It follows from (1.3) that $p(\cdot)$ is also convex and hence differentiable almost everywhere. It is believed (see [12, Conj. 6.1]) that there are actually no points of non-differentiability, and moreover that $p'(\beta) < \beta$ for all $\beta > \beta_c$. If this is true, then $p'(\beta) < \beta$ is equivalent to the low-temperature condition $p(\beta) < \beta^2/2$.

1.2. Background. The model we have defined makes sense if $g$ is replaced by any family of disorder variables. The i.i.d. assumption is completely standard, and it is only out of methodological necessity that we have assumed Gaussianity. The Gaussian case happens to be one of the few for which some version of path localization has been rigorously established, but the phenomenon is anticipated in much greater generality.
The first path localization result for (1.2) appeared in [12, Thm. 6.1], although the relevant computation was already present in the work of Carmona and Hu [10, Lem. 7.1]. Adopting a Gaussian-integration-by-parts idea used in continuous models [14,18] and earlier in the spin glass literature [1,15,26,24], one can show that if $p(\cdot)$ is differentiable at $\beta$, then
$$\lim_{N\to\infty} \mathbb{E}\langle R(\sigma^1,\sigma^2)\rangle = 1 - \frac{p'(\beta)}{\beta}, \quad (1.7)$$
where $\langle\cdot\rangle$ denotes expectation with respect to $\mu_{N,\beta}^{\otimes 2}$. In particular, when $p'(\beta) < \beta$ (by Remark 1.2, this is the presumed characterization of low temperature), the average overlap between independent polymer paths has a nonzero limiting expectation. In other words, if $\sigma^1$ and $\sigma^2$ are sampled independently from $\mu_{N,\beta}$, then there is a nonzero chance that their fractional overlap $R(\sigma^1,\sigma^2)$ is at least some fixed positive number. For continuous models, analogous results can be found in [14,18] as well as [13, Sec. 5.5].
Elegantly simple as (1.7) is to prove, it only tells us that the previous sentence is true on an event of nonzero $\mathbb{P}$-probability. One would like this probability to be asymptotically equal to $1$, meaning the specific realization of the disorder is irrelevant. Such was the advancement provided by Chatterjee [11], for sufficiently large $\beta$ and a certain class of bounded random environments. In [7], Bates and Chatterjee proved an analogous (but less quantitative) statement in the Gaussian case, and then bootstrapped that result to the following, stronger one. The key feature is that the number $J$ of distinguished paths has no dependence on the polymer length $N$, although the distinguished paths themselves, called $\sigma^1, \dots, \sigma^J$, are random and do depend on $N$.
Theorem C ([7]). Assume $\beta > 0$ is a point of differentiability for $p(\cdot)$ with $p'(\beta) < \beta$. Then for every $\varepsilon > 0$, there exist integers $J = J(\beta,\varepsilon)$, $N^* = N^*(\beta,\varepsilon)$ and a number $\delta = \delta(\beta,\varepsilon) > 0$ such that the following is true for all $N \ge N^*$. With $\mathbb{P}$-probability at least $1 - \varepsilon$, there are paths $\sigma^1, \dots, \sigma^J \in \Sigma$ satisfying
$$\mu_{N,\beta}\Big(\bigcup_{j=1}^J \{\sigma : R(\sigma^j, \sigma) \ge \delta\}\Big) \ge 1 - \varepsilon.$$
While this brief overview has mentioned essentially all that has been proved about path localization (at least for the discrete model considered in this paper), much more is known about localization of the endpoint distribution µ N,β (σ N ∈ ·). The state of the art goes well beyond the Gaussian case or even simple random walks (on the latter point, see [5,4,29] and references therein), and there is even a one-dimensional exactly solvable model [25] admitting an explicit limiting law for the endpoint distribution [16]. The reader is referred to [6] for a review of the literature.
Finally, a somewhat orthogonal direction of work considers directed polymers in heavy-tailed random environments, mostly in d = 1. In this setting, the degree of localization is much greater (e.g. [3, Thm. 2.1]), and so the interesting questions arise from taking β = β N → 0, where the rate of decay is determined by the index of the heavy tail [20,27,9,8]. Further discussion can be found in [12,Sec. 6.4]; see also [28].
1.3. Main result. The goal of this article is to go beyond the single statistic $R(\sigma^1,\sigma^2)$. Although it serves as a natural gauge for localization, it does little to illuminate the geometry of localized polymers. For instance, if we know $R(\sigma^1,\sigma^2)$ is bounded away from zero, can we say something about the set of $i \in \{1,2,\dots,N\}$ for which $\sigma^1_i = \sigma^2_i$? Our main result addresses this question. For integers $a \le b$, let $\llbracket a, b\rrbracket$ denote the integer interval $\{a, a+1, \dots, b\}$. Given $1 \le a \le b \le N$, consider the restricted overlap
$$R_{\llbracket a,b\rrbracket}(\sigma^1,\sigma^2) := \frac{1}{b-a+1} \sum_{i=a}^b \mathbb{1}\{\sigma^1_i = \sigma^2_i\}.$$
By examining these restricted overlaps, we will prove that the intersection set mentioned above is dense in $\llbracket 1, N\rrbracket$. Mirroring the language of Theorem C, we make this assertion precise as follows.
Theorem 1.3. Assume $\beta > 0$ is a point of differentiability for $p(\cdot)$ with $p'(\beta) < \beta$. Then for every $\varepsilon > 0$, there exist integers $J = J(\beta,\varepsilon)$, $N^* = N^*(\beta,\varepsilon)$ and a number $\delta = \delta(\beta,\varepsilon) > 0$ such that the following is true for all $N \ge N^*$. With $\mathbb{P}$-probability at least $1 - \varepsilon$, there are paths $\sigma^1, \dots, \sigma^J \in \Sigma$ satisfying
$$\mu_{N,\beta}\Big(\bigcup_{j=1}^J \bigcap_{\substack{\llbracket a,b\rrbracket \subset \llbracket 1,N\rrbracket \\ b-a \ge \varepsilon N}} \{\sigma : R_{\llbracket a,b\rrbracket}(\sigma^j, \sigma) \ge \delta\}\Big) \ge 1 - \varepsilon.$$
So at low temperatures and up to negligible events, a sample from the polymer measure localizes around one of a fixed number of distinguished paths, and this localization takes place along the entire length of the path; that is, in every interval of size at least $\varepsilon N$. It is this latter part that is the contribution of the present article.
An important comment is that the statement of Theorem 1.3 concerns fixed $\beta$, which can be arbitrarily close to $\beta_c$. Prior to this result, it was only possible to make guarantees about localization away from the polymer's endpoint if $\beta$ were sent to $\infty$ with $N$.

1.4. Outline of proof. The bulk of the work is to prove Theorem 1.3 when the intervals under consideration are the regular subintervals $(\frac{(\ell-1)N}{L}, \frac{\ell N}{L}]$, $\ell \in \llbracket 1, L\rrbracket$, where $L$ is some large integer. Our first observation is that Theorem 1.3 will be implied by this special case, which is stated as Theorem 2.1. Indeed, for a given $\varepsilon > 0$, we can choose $L$ large enough that the regular subintervals are somewhat smaller than $\varepsilon N$ and thus actually contained in any interval $I$ of size $\varepsilon N$. In this way, positive overlap in the regular subintervals will imply positive overlap in $I$. This is the content of Section 2.
Another difficulty of Theorem 1.3 is that we demand the same distinguished path $\sigma^j$ to be used in every subinterval $\llbracket a, b\rrbracket$ of appropriate size. The steps of the previous paragraph do not remove this requirement, and so our second reduction is to a version of Theorem 2.1 that allows the index $j$ to depend on which regular subinterval $(\frac{(\ell-1)N}{L}, \frac{\ell N}{L}]$ is considered. This yet weaker result is stated as Theorem 3.1, and the reduction argument is given in Section 3. The rough idea is to concatenate segments of distinguished paths in order to produce a larger set but still of $O(1)$ size, so that whenever a path $\sigma$ had intersected two distinct distinguished paths in consecutive subintervals, it will now intersect a single concatenated path in both subintervals. This procedure can be carried out by demanding slightly less overlap in each regular subinterval.
Having made these reductions, we are left to prove that for each regular subinterval, one can (with high probability) find a bounded number of paths such that a sample from the Gibbs measure will (with high probability) have non-vanishing overlap in the given subinterval with at least one of these paths. This statement could easily be proved if one were able to apply Theorem C within each subinterval and then take an appropriate union bound. The seeming obstruction is that the marginal of $\mu_{N,\beta}$ in a given subinterval is not a polymer measure of the same form as $\mu_{N,\beta}$. Moreover, this marginal depends on the environment at all times, not just those within the subinterval. Nevertheless, we can regard these marginals as polymer measures with respect to random reference measures. That is, we replace $P$ in (1.2) by a random measure which, crucially, is determined entirely by the environment outside the given subinterval. Correspondingly, the Hamiltonian $H_N(\cdot)$ is replaced by a sum depending only on the environment inside said subinterval, the remaining disorder having been absorbed into the random reference measure.
In this setup, Theorem C still does not quite apply because it assumes a specific reference measure $P$. Fortunately, we can appeal to a more general result of [7] from which Theorem C was derived. We recall this general result as Theorem D in Section 5. The only hypothesis to check is that $\mu_{N,\beta}$ still admits a limiting free energy with respect to the random reference measures. To prove this fact, we introduce in Section 4 a "multi-temperature free energy" that, as a special case, can ignore the disorder in a given subinterval. Convergence of this generalized free energy is stated in Theorem 4.1 and proved using modifications of standard techniques. Finally, Theorem D is invoked in Section 5, where further technical issues are addressed en route to proving Theorem 3.1.

Reduction to regular subintervals
Given positive integers $N$ and $L$, let $0 = n_0(N) \le n_1(N) \le \cdots \le n_L(N) = N$ be any sequence satisfying
$$\lfloor N/L \rfloor \le n_\ell(N) - n_{\ell-1}(N) \le \lceil N/L \rceil \quad \text{for all } \ell \in \llbracket 1, L\rrbracket. \quad (2.1)$$
We will think of $L$ as fixed throughout, and then such a sequence will be chosen and fixed for each $N$. In other words, we partition the integer interval $\llbracket 1, N\rrbracket$ into $L$ parts of the form $\llbracket n_{\ell-1}(N)+1, n_\ell(N)\rrbracket$, whose sizes are as close to equal as possible.
For the sake of exposition, let us call these parts regular subintervals. The fractional overlap between $\sigma^1, \sigma^2 \in \Sigma$ in the $\ell$th subinterval will be denoted by
$$R^{(\ell)}(\sigma^1, \sigma^2) := R_{\llbracket n_{\ell-1}+1,\, n_\ell\rrbracket}(\sigma^1, \sigma^2).$$
When we are not varying $N$, we will simply write $n_\ell$ in place of $n_\ell(N)$.
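A valid choice of the endpoint sequence in (2.1) can be sketched in a few lines (our own illustration; the function name is ours): taking $n_\ell = \operatorname{round}(\ell N / L)$ produces parts whose sizes differ by at most one.

```python
def regular_partition(N, L):
    """Endpoints 0 = n_0 <= n_1 <= ... <= n_L = N whose consecutive
    gaps are as close to N/L as possible (they differ by at most 1)."""
    return [round(ell * N / L) for ell in range(L + 1)]

n = regular_partition(103, 7)
gaps = [b - a for a, b in zip(n, n[1:])]
assert n[0] == 0 and n[-1] == 103
assert max(gaps) - min(gaps) <= 1  # sizes as close to equal as possible
```

Since each endpoint is within $1/2$ of $\ell N/L$, every gap lies in $[N/L - 1, N/L + 1]$, which is the form of the trivial bounds used in the proof below.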
The following special case of Theorem 1.3 will allow us to prove the general case.
Theorem 2.1. Assume $\beta > 0$ is a point of differentiability for $p(\cdot)$ with $p'(\beta) < \beta$. Then for every $\varepsilon > 0$ and positive integer $L$, there exist integers $J = J(\beta,\varepsilon,L)$, $N^* = N^*(\beta,\varepsilon,L)$ and a number $\delta = \delta(\beta,\varepsilon,L) > 0$ such that the following is true for all $N \ge N^*$. With $\mathbb{P}$-probability at least $1 - \varepsilon$, there are paths $\sigma^1, \dots, \sigma^J \in \Sigma$ satisfying
$$\mu_{N,\beta}\Big(\bigcup_{j=1}^J \bigcap_{\ell=1}^L \{\sigma : R^{(\ell)}(\sigma^j, \sigma) \ge \delta\}\Big) \ge 1 - \varepsilon.$$
Given this result, we now show that Theorem 1.3 readily follows by identifying regular subintervals lying within a given interval $\llbracket a, b\rrbracket$ of size at least $\varepsilon N$.
Let us first address the case when $b - a + 1 \le 2\varepsilon N + 1$. Asymptotically we know $(n_\ell - n_{\ell-1}) \sim N/L$ as $N \to \infty$, but let us just use the trivial bounds $N/L - 1 \le n_\ell - n_{\ell-1} \le N/L + 1$ coming from (2.1). For any $\sigma^1, \sigma^2 \in \Sigma$, whenever $\llbracket n_{\ell-1}+1, n_\ell\rrbracket \subset \llbracket a, b\rrbracket$, the definitions of the two overlaps give
$$R_{\llbracket a,b\rrbracket}(\sigma^1,\sigma^2) \ge \frac{n_\ell - n_{\ell-1}}{b-a+1}\, R^{(\ell)}(\sigma^1,\sigma^2),$$
so that overlap at least $\delta$ in a regular subinterval forces a comparable overlap in $\llbracket a, b\rrbracket$. If $b - a + 1 > 2\varepsilon N + 1$, then we can partition $\llbracket a, b\rrbracket$ into disjoint subintervals, all having sizes at least $\varepsilon N$ but no larger than $2\varepsilon N + 1$. The argument from above applies to each of these subintervals. We can now deduce Theorem 1.3 from Theorem 2.1 by replacing $\delta$ with $\delta/18$.

Reduction to independent subintervals
We continue using the notation of regular subintervals introduced in Section 2, where the task of proving Theorem 1.3 was reduced to showing Theorem 2.1. In this section, we reduce Theorem 2.1 to the following, yet weaker statement.
Theorem 3.1. Assume $\beta > 0$ is a point of differentiability for $p(\cdot)$ with $p'(\beta) < \beta$. Then for every $\varepsilon > 0$ and positive integer $L$, there exist integers $J = J(\beta,\varepsilon,L)$, $N^* = N^*(\beta,\varepsilon,L)$ and a number $\delta = \delta(\beta,\varepsilon,L) > 0$ such that the following is true for all $N \ge N^*$. With $\mathbb{P}$-probability at least $1 - \varepsilon$, there are paths $\sigma^1, \dots, \sigma^J \in \Sigma$ satisfying
$$\mu_{N,\beta}\Big(\bigcap_{\ell=1}^L \bigcup_{j=1}^J \{\sigma : R^{(\ell)}(\sigma^j, \sigma) \ge \delta\}\Big) \ge 1 - \varepsilon.$$
Assuming this result, it remains to show that the same overlapping distinguished path can be taken in all $L$ regular subintervals (i.e. exchanging the intersection and the union displayed above). We now argue that this can be done by increasing $J$ an $O(1)$ amount and choosing $\delta$ appropriately smaller.
Proof of Theorem 2.1. Let $\varepsilon > 0$ and a positive integer $L$ be given. Then take $J$, $N^*$, and $\delta \in (0,1]$ as in Theorem 3.1 so that for all $N \ge N^*$, the following event occurs with $\mathbb{P}$-probability at least $1 - \varepsilon$: there exists a random set of paths $D_1 = \{\sigma^1, \dots, \sigma^J\} \subset \Sigma$ realizing the conclusion of Theorem 3.1. We will henceforth assume this event occurs. Set $K := \lceil 12/\delta \rceil$. By possibly making $N^*$ larger, we may assume $N$ is large enough that each regular subinterval can be partitioned as below; we note for later that $\delta \le 1$ implies $K \le 13/\delta$. Given $D_1$, let us perform the following inductive procedure. For each $\ell = 1, \dots, L$, partition the interval $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$ into $K$ subintervals $\llbracket m^{(\ell)}_{k-1}+1, m^{(\ell)}_k\rrbracket$, $k = 1, \dots, K$, whose sizes are as close to equal as possible (following the same rule as (2.1)). For each ordered pair $(\sigma, \sigma') \in D_\ell \times D_\ell$ with $\sigma \ne \sigma'$, we include the concatenated paths, which we call $C^{(\ell)}_k(\sigma, \sigma')$, as elements of $D_\ell^+$ (see Figure 1). Once this procedure has been performed for all such ordered pairs, the construction of the set $D_\ell^+$ is complete, and we set $D_{\ell+1} := D_\ell \cup D_\ell^+$. Note that $|D_\ell^+| \le |D_\ell|(|D_\ell| - 1)(K-1)$, and so $|D_{\ell+1}| \le K|D_\ell|^2$, which leads to the upper bound
$$|D_L| \le K^{2^{L-1}-1}\, |D_1|^{2^{L-1}}.$$
In particular, $|D_L|$ is bounded by a constant independent of $N$. We now claim the containment (3.7). In light of (3.1) and the earlier observation regarding the cardinality of $D_L$, the containment (3.7) establishes the conclusion of Theorem 2.1 after replacing $J$ by $(13/\delta)^{2^{L-1}-1} J^{2^{L-1}}$ and $\delta$ by $\delta^2/104$. In order to prove (3.7), we reduce to the following claim.
If instead $\sigma' \ne \sigma''$, we recall the subintervals $\llbracket m^{(\ell)}_{k-1}+1, m^{(\ell)}_k\rrbracket$, $k \in \llbracket 1, K\rrbracket$, of $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$. There must be at least two distinct values of $k \in \llbracket 1, K\rrbracket$ for which $\sigma'$ and $\sigma$ intersect within the $k$th of these subintervals, since otherwise we would have a contradictory upper bound on $R^{(\ell)}(\sigma', \sigma)$. Let us call these two values $k_1$ and $k_2$, where $k_1 < k_2$. We will now argue that the switching time $t^{(\ell)}_{k_1}(\sigma', \sigma'')$ of the concatenated path can be controlled. Now, we know that $\sigma''_t = \sigma_t$ for some $t \in \llbracket n_\ell+1, n_{\ell+1}\rrbracket$, simply by the fact that $R^{(\ell+1)}(\sigma'', \sigma) > 0$. We claim that any such $t$ must satisfy $t \ge t^{(\ell)}_{k_1}(\sigma', \sigma'')$: by following $\sigma''$ to a site where $\sigma''_s = \sigma_s$, and then following $\sigma$ from $\sigma_s$ to $\sigma_t = \sigma''_t$, we will have constructed a nearest-neighbor path connecting the relevant points. Now, the argument of the previous paragraph showed that for any $t \in \llbracket n_\ell+1, n_{\ell+1}\rrbracket$ such that $\sigma''_t = \sigma_t$, we necessarily have $t \ge t^{(\ell)}_{k_1}(\sigma', \sigma'')$ and thus $C^{(\ell)}_{k_1}(\sigma', \sigma'')_t = \sigma''_t = \sigma_t$ by (3.12). In particular, we have $R^{(\ell+1)}(C^{(\ell)}_{k_1}(\sigma', \sigma''), \sigma) \ge R^{(\ell+1)}(\sigma'', \sigma) \ge \delta$, thus verifying (3.9c). On the other hand, (3.9a) follows from (3.11) because $m^{(\ell)}_{k_1} > n_{\ell-1}$. Finally, to obtain (3.9b), observe that $C^{(\ell)}_{k_1}(\sigma', \sigma'')$ agrees with $\sigma'$ until time $t^{(\ell)}_{k_1}(\sigma', \sigma'')$.
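The cardinality recursion appearing in the proof of Theorem 2.1 can be checked numerically. The sketch below (ours; `concat_bound` is a hypothetical name) iterates $|D_{\ell+1}| \le K|D_\ell|^2$ starting from $|D_1| = J$ and confirms the closed-form bound $K^{2^{L-1}-1} J^{2^{L-1}}$, which is what makes $|D_L|$ independent of $N$:

```python
def concat_bound(J, K, L):
    """Iterate the recursion a_{ell+1} = K * a_ell**2 with a_1 = J,
    which dominates |D_{ell+1}| <= K * |D_ell|**2."""
    size = J
    for _ in range(L - 1):
        size = K * size**2
    return size

# Closed form: a_L = K**(2**(L-1) - 1) * J**(2**(L-1)).
for J, K, L in [(3, 5, 1), (3, 5, 2), (3, 5, 4), (7, 13, 3)]:
    assert concat_bound(J, K, L) == K**(2**(L - 1) - 1) * J**(2**(L - 1))
```

The closed form follows by taking logarithms: $\log a_{\ell+1} = \log K + 2\log a_\ell$, a linear recursion solved by $\log a_L = (2^{L-1}-1)\log K + 2^{L-1}\log J$.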

Multi-temperature free energy
As outlined in Section 1.4, our proof strategy in Section 5 will require us to isolate the Hamiltonian on each regular subinterval $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$. Mechanically, this can be done by letting the inverse temperature $\beta$ depend on time and setting it equal to zero outside the interval $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$. As it turns out, it will be easier to take the complementary route of setting the inverse temperature to zero inside $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$, and keeping it unchanged outside. Either choice changes the free energy, of course, and we will need to show that a statement analogous to (1.5) still holds. It will not be any more difficult, however, to allow the inverse temperature to assume a different value on each interval $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$, $\ell \in \llbracket 1, L\rrbracket$.
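To make the time-dependent inverse temperature concrete, here is a toy numerical sketch (our own construction, not from the paper; `multi_Z` and the tiny system size are illustrative assumptions). It enumerates all one-dimensional walk paths and checks that when every block uses the same $\beta_\ell$, the multi-temperature partition function collapses to the single-temperature one:

```python
import itertools
import math
import random

def multi_Z(N, betas, n, g):
    """Partition function with inverse temperature betas[ell-1] on the
    block (n[ell-1], n[ell]]: average of exp(sum_ell beta_ell * sum of
    g(i, sigma_i) over i in block ell) over all 2^N paths."""
    L = len(betas)
    total = 0.0
    for steps in itertools.product((-1, 1), repeat=N):
        pos, H = 0, 0.0
        for i, s in enumerate(steps, start=1):
            pos += s
            ell = next(l for l in range(1, L + 1) if n[l - 1] < i <= n[l])
            H += betas[ell - 1] * g[(i, pos)]
        total += math.exp(H)
    return total / 2**N

random.seed(2)
N, n = 8, [0, 4, 8]
g = {(i, x): random.gauss(0.0, 1.0)
     for i in range(1, N + 1) for x in range(-i, i + 1)}

# With equal temperatures, the multi-temperature Z equals the usual Z_N(beta).
zA = multi_Z(N, [0.7, 0.7], n, g)
zB = multi_Z(N, [0.7], [0, 8], g)
assert abs(zA - zB) < 1e-9
```

Setting one entry of `betas` to zero discards the disorder inside that block, which is exactly the mechanism the text uses to isolate a regular subinterval.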
In what follows, we will use the notation $\mathcal{P}_{N,L}$ to denote the partition
$$\mathcal{P}_{N,L} : 0 = n_0(N) \le n_1(N) \le \cdots \le n_L(N) = N, \quad (4.1)$$
which is chosen to satisfy (2.1). Let $\boldsymbol\beta = (\beta_1, \dots, \beta_L) \in [0,\infty)^L$ and consider the Hamiltonian
$$H_{\mathcal{P}_{N,L}}(\sigma; \boldsymbol\beta) := \sum_{\ell=1}^L \beta_\ell \sum_{i=n_{\ell-1}+1}^{n_\ell} g_N(i, \sigma_i).$$
The associated partition function will be written as $Z_{\mathcal{P}_{N,L}}(\boldsymbol\beta) := E\big(e^{H_{\mathcal{P}_{N,L}}(\sigma;\boldsymbol\beta)}\big)$. The concentration result stated below shows that the multi-temperature expression $\frac{1}{N}\log Z_{\mathcal{P}_{N,L}}(\boldsymbol\beta)$ is well-approximated by an average of single-temperature free energies. Ultimately, we will need only the asymptotic statement (4.3), but with minimal extra effort we can prove (4.2) as an intermediate step.
Theorem 4.1. There exist constants $C_1, c_2 > 0$, depending only on $d$ and $L$, such that for every $u > 0$ the quantity $\frac{1}{N}\log Z_{\mathcal{P}_{N,L}}(\boldsymbol\beta)$ deviates from the weighted average $\sum_{\ell=1}^L \frac{n_\ell - n_{\ell-1}}{N}\, p(\beta_\ell)$ by more than $u$ with probability at most the bound in (4.2), where $\beta_{\max} := \max\{\beta_1, \dots, \beta_L\}$. Consequently,
$$\lim_{N\to\infty} \frac{1}{N}\log Z_{\mathcal{P}_{N,L}}(\boldsymbol\beta) = \frac{1}{L}\sum_{\ell=1}^L p(\beta_\ell) \quad \text{$\mathbb{P}$-a.s. and in $L^\alpha(\mathbb{P})$ for all $\alpha \in [1,\infty)$.} \quad (4.3)$$
In the proof, we will use the following simple lemma.
Lemma 4.2. Let $X$ be an integrable random variable and $f$ a monotone function such that $f(X)$ is integrable. If $|f(\mathbb{E}X) - \mathbb{E}f(X)| > 2u$ for some $u > 0$, then there is no $y \in \mathbb{R}$ for which $f(X) \in [y-u, y+u]$ with probability one.
Proof. We prove the contrapositive. That is, assume $f(X) \in [y-u, y+u]$ with probability one, for some $y \in \mathbb{R}$. Without loss of generality, we may assume $f$ is non-decreasing; if not, we apply the argument to $-f$. If $X$ is an almost sure constant, then $f(\mathbb{E}X) = \mathbb{E}f(X)$. Otherwise, there exists $\delta > 0$ such that each of $\{X \le \mathbb{E}X - \delta\}$ and $\{X \ge \mathbb{E}X + \delta\}$ occurs with positive probability. In this case, monotonicity forces $f(\mathbb{E}X) \in [y-u, y+u]$ as well. Of course, we also know $\mathbb{E}f(X) \in [y-u, y+u]$, and so $|f(\mathbb{E}X) - \mathbb{E}f(X)| \le 2u$.
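As a sanity check of the monotone-function lemma, under our reading that it bounds $|f(\mathbb{E}X) - \mathbb{E}f(X)|$ by $2u$ whenever $f(X)$ lies almost surely in an interval of radius $u$, here is a small numerical illustration (our own, with an arbitrarily chosen discrete distribution):

```python
import math

# Discrete X with uniform weights and a monotone f; check that
# |f(E X) - E f(X)| <= 2u for the smallest valid radius u, i.e. the
# half-width of the range of f over the support of X.
xs = [0.5, 1.0, 4.0, 9.0]   # support of X
f = math.log                 # monotone on (0, infinity)
EX = sum(xs) / len(xs)
Ef = sum(f(x) for x in xs) / len(xs)
u = (max(f(x) for x in xs) - min(f(x) for x in xs)) / 2

assert abs(f(EX) - Ef) <= 2 * u
```

The lemma is applied in the proof of Theorem 4.1 with $f(t) = \log t$, which is exactly the choice above.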
Proof of Theorem 4.1. We proceed by induction on $L$. Our inductive hypothesis is that for any $u > 0$, any integer $N \ge L^2$, and any partition $\mathcal{P}_{N,L} : 0 = n_0(N) < n_1(N) < \cdots < n_L(N) = N$ that is valid in the sense of (2.1), the concentration bound (4.4) holds, where (1.4) provides the base case of $L = 1$. Once we prove the inductive step, (4.4) will yield (4.2) with $C_1 = 3^d \cdot 4L$ and $c_2 = 1/(18L^2)$. Then, by standard arguments, the almost sure and $L^\alpha$ limits follow; this limit is seen to be equivalent to (4.3) once we recall (1.3). Therefore, the rest of the proof is establishing the induction. We thus assume (4.4) and consider any $\beta_{L+1} \in [0,\infty)$ and any partition $\mathcal{P}_{N,L+1} : 0 = n_0(N) < n_1(N) < \cdots < n_L(N) < n_{L+1}(N) = N$. Define the set $D_i := \{x \in \mathbb{Z}^d : P(\sigma_i = x) > 0\}$ for each integer $i \ge 1$. Observe that for any fixed realization of the disorder $g$, if we condition on the value of $\sigma_i$ for some $i \in \llbracket 1, N\rrbracket$, then by the Markov property of the random walk, the vectors $(g_N(1, \sigma_1), \dots, g_N(i, \sigma_i))$ and $(g_N(i+1, \sigma_{i+1}), \dots, g_N(N, \sigma_N))$ are conditionally independent with respect to $P$. Using this observation when $i = n_L$, we obtain the factorization (4.5). To condense notation, let us write

Note that the first factor is itself a multi-temperature partition function over $\hat{\mathcal{P}}_{n_L,L}$, where $\hat{\mathcal{P}}_{n_L,L}$ is the partition of $\llbracket 1, n_L\rrbracket$ into $L$ parts induced by $\mathcal{P}_{N,L+1}$. That is, $\hat{\mathcal{P}}_{n_L,L} : 0 = \hat n_0(n_L) \le \hat n_1(n_L) \le \cdots \le \hat n_L(n_L) = n_L$, where $\hat n_\ell(n_L(N)) = n_\ell(N)$.
We are thus interested in the limit of the first term in the final expression of (4.5), which is subject to the concentration inequality from (4.4); this yields (4.8). We are now left with the task of controlling the second term in the final expression of (4.5). Since all variables in $g$ are i.i.d., we have the equality in law (4.9) with respect to $\mathbb{P}$. In particular, $\mathbb{E}\log B(x)$ is constant among such $x$, and so taking this constant as the value of $y$, we conclude the following from Lemma 4.2 with $f(t) = \log t$ and $X$ having probability distribution given by $A(\cdot)/A$. Also note that the following holds for all $\ell \in \llbracket 1, L+1\rrbracket$, in particular $\ell = L+1$: the bound (4.10). Moreover, by rewriting the second term, we can repeat the previous estimate to obtain (4.11). Together, (4.10) and (4.11) yield (4.12). Putting together (4.5), (4.8), and (4.12), we conclude the inequality which verifies the inductive step needed for (4.4).

Proof of Theorem 3.1
In preparation for the proof, we introduce the main input, Theorem D, from [7]. The statement is exactly the same as Theorem C but holds for more general Gaussian disordered systems. So that there is no confusion caused by duplicate notation, let us introduce a generic setting.
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be an abstract probability space, and $(\Sigma_N)_{N\ge1}$ a sequence of Polish spaces equipped respectively with probability measures $(\nu_N)_{N\ge1}$. For each $N$, we consider a centered Gaussian field $(H_N(\sigma))_{\sigma\in\Sigma_N}$ defined on $\Omega$. Regarding this field as a Hamiltonian, we denote the associated Gibbs measure by (5.1). We make the following assumptions:
• There is a deterministic function $\mathcal{P} : [0,\infty) \to \mathbb{R}$ and a deterministic sequence $(a_N)_{N\ge1}$ tending to infinity,² such that $\lim_{N\to\infty} \frac{\log Z_N(\beta)}{a_N} = \mathcal{P}(\beta)$ $\mathbb{P}$-a.s. and in $L^1(\mathbb{P})$, for every $\beta \ge 0$. (A1)
• For every $\sigma \in \Sigma_N$, we have (A2).
• For any $\sigma^1, \sigma^2 \in \Sigma_N$, we have (A3).
² Strictly speaking, [7] considers only the case $a_N = N$, although this is just for purposes of exposition. Even so, this single case would be enough for our purposes, since we will ultimately apply Theorem D with $a_N = n_\ell(N) - n_{\ell-1}(N)$. Indeed, the associated sequence of partitions $(\mathcal{P}_{N,L})_{N\ge1}$ from (4.1) is contained in the union of finitely many sequences of the form $(\mathcal{P}_{N_i,L})_{i\ge1}$, where $n_\ell(N_i) - n_{\ell-1}(N_i) = i$. Therefore, one can safely apply Theorem D along each one of these sequences, and then the proof of Corollary 5.2 goes through by choosing the maximum $J$, maximum $N^*$, and minimum $\delta$ resulting from these applications.
• For each $N$, there exist measurable real-valued functions $(\varphi_{i,N})_{i=1}^\infty$ on $\Sigma_N$ and i.i.d. standard normal random variables $(g_{i,N})_{i=1}^\infty$ defined on $\Omega$ such that for each $\sigma \in \Sigma_N$, the representation (A4) holds, where the series on the right converges in $L^2(\mathbb{P})$.
Theorem D. [7, Thm. 1.2] Assume (A1)-(A4). If $\beta > 0$ is a point of differentiability for $\mathcal{P}(\cdot)$ with $\mathcal{P}'(\beta) < \beta$, then for every $\varepsilon > 0$, there exist integers $J = J(\beta,\varepsilon)$ and $N^* = N^*(\beta,\varepsilon)$ and a number $\delta = \delta(\beta,\varepsilon) > 0$ such that the following is true for all $N \ge N^*$. With $\mathbb{P}$-probability at least $1 - \varepsilon$, there exist $\sigma^1, \dots, \sigma^J \in \Sigma_N$ for which the analogous localization statement holds. Returning to the polymer setting, we consider the following modifications to the random environment: restricting to times inside the interval $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$, and restricting to times outside the interval. The resulting Hamiltonians will be written as $H^{(\ell)}_N$ and $\bar H^{(\ell)}_N$, respectively. Recall that $(g_N(i,x) : i \ge 1, x \in \mathbb{Z}^d)$ is an i.i.d. collection even across $N$ (recall Remark 1.1). Therefore, we can partition $g$ into the following pair of independent sub-collections: $g^{(\ell)}$, the variables at times in $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$, and $\bar g^{(\ell)}$, the variables at all other times; then $H^{(\ell)}_N$ is a function of $g^{(\ell)}$. We will write $\mathbb{P}_{\bar g^{(\ell)}}$ to denote the probability measure obtained by conditioning $\mathbb{P}$ on $\bar g^{(\ell)}$, and $\mathbb{E}_{\bar g^{(\ell)}}$ will denote expectation with respect to $\mathbb{P}_{\bar g^{(\ell)}}$ (i.e. integrating over just $g^{(\ell)}$). While the law of $g^{(\ell)}$ is no different under $\mathbb{P}_{\bar g^{(\ell)}}$ than under $\mathbb{P}$, these notational devices will make clearer how we invoke Theorem D and avoid the slightly more cumbersome $\mathbb{P}(\,\cdot\,|\,\bar g^{(\ell)})$.
Next, for each $\ell \in \llbracket 1, L\rrbracket$ we introduce the probability measure on $\Sigma$ given in (5.2), and we record the accompanying identity (5.3).
We will ultimately apply Theorem D to this "restricted" setting, using the identifications in (5.4). Note that $\nu_N$ and $P$ are now random measures depending on $\bar g^{(\ell)}$, but since the Hamiltonian $H^{(\ell)}_N$ is independent of this randomness, Theorem D will still apply. With these choices, we first need to verify (A1).
By Theorem 4.1, we know the relevant multi-temperature free energies converge, where the limits are $\mathbb{P}$-almost sure and in $L^\alpha(\mathbb{P})$ for every $\alpha \in [1,\infty)$. By definition (5.3) and the fact that $(n_\ell - n_{\ell-1}) \sim N/L$, the limit (5.7) thus holds in the same senses. In particular, since (5.7) holds $\mathbb{P}$-almost surely, Fubini's theorem guarantees the following: for almost every realization of $\bar g^{(\ell)}$, (5.7) holds $\mathbb{P}_{\bar g^{(\ell)}}$-almost surely. This proves the first part of (5.5).
Meanwhile, for any $\varepsilon > 0$ and $\alpha \ge 1$, Markov's inequality gives a tail bound in which the constant $C$ depends on $\alpha$ and $\beta$ but not on $N$. By taking $\alpha > 2$, we can apply Borel-Cantelli to determine that with $\mathbb{P}$-probability one, the corresponding deviations occur only finitely often. By taking a countable sequence $\varepsilon_k \downarrow 0$, we further deduce (5.8). Since (1.3) gives the deterministic limit $\frac{1}{N}\mathbb{E}\log Z_N(\beta) \to p(\beta)$, it follows from (5.8) that with $\mathbb{P}$-probability one we have (5.9). Moreover, given that $\alpha$ can be taken arbitrarily large, this convergence occurs in $L^\alpha(\mathbb{P}_{\bar g^{(\ell)}})$ simultaneously for all $\alpha \in [1,\infty)$. On the other hand, from (5.6) we know the corresponding limit for $\bar Z^{(\ell)}_N(\beta)$, also with $\mathbb{P}$-probability one. Furthermore, since $\bar Z^{(\ell)}_N(\beta)$ is determined entirely by $\bar g^{(\ell)}$, this last limit is a deterministic statement with respect to $\mathbb{P}_{\bar g^{(\ell)}}$; in particular, it holds in $L^\alpha(\mathbb{P}_{\bar g^{(\ell)}})$. From (5.3), (5.9), and the fact that $(n_\ell - n_{\ell-1}) \sim N/L$, we now have the second required limit. We have thus verified both parts of (5.5).
Given Proposition 5.1, we can make a statement approaching Theorem 3.1. The following result asserts that once the system size becomes large enough, the "external" disorder $\bar g^{(\ell)}$ becomes sufficiently well behaved so that when only the "internal" disorder $g^{(\ell)}$ is regarded as random, the polymer along the subinterval $\llbracket n_{\ell-1}+1, n_\ell\rrbracket$ admits the same localization statement as in Theorem C.
Corollary 5.2. Let $\ell \in \llbracket 1, L\rrbracket$, and assume $\beta > 0$ is a point of differentiability for $p(\cdot)$ with $p'(\beta) < \beta$. Then with $\mathbb{P}$-probability one, the following is true. For every $\varepsilon > 0$, there exist integers $J = J(\beta, \varepsilon, \bar g^{(\ell)})$ and $N^* = N^*(\beta, \varepsilon, \bar g^{(\ell)})$ and a number $\delta = \delta(\beta, \varepsilon, \bar g^{(\ell)}) > 0$ such that for all $N \ge N^*$, the following event has $\mathbb{P}_{\bar g^{(\ell)}}$-probability at least $1 - \varepsilon$: Proof. Because $g^{(\ell)}$ and $\bar g^{(\ell)}$ are independent, the law of $g^{(\ell)}$ given $\bar g^{(\ell)}$ remains i.i.d. standard normal. Therefore, (5.2) is a representation of $\mu_{N,\beta}$ in the form of (5.1). Proposition 5.1 verifies that in this representation, the assumption (A1) holds. Also, it is trivial to check that (5.10) holds. Thus (A2)-(A4) also hold, and we can apply Theorem D with the identifications in (5.4) to obtain the result.
Recall that $g$ is supported on a probability space we denote $(\Omega, \mathcal{F}, \mathbb{P})$. Following Corollary 5.2, we can define the event $E^{(\ell)}_{N,J,\delta,\varepsilon}$ in (5.11). Let us postpone verification that such an event is measurable, and proceed directly to the proof of Theorem 3.1.
Proof of Theorem 3.1. Let $\varepsilon > 0$ and $L$ be given. In the notation of Corollary 5.2, it suffices to find $J$, $N^*$, and $\delta$ such that (5.12) holds. The conclusion of Theorem 3.1 then follows by replacing $J$ with $JL$. The remainder of the proof is thus establishing (5.12). Let $\varepsilon' = \varepsilon'(\varepsilon, L)$ be a positive number to be specified later. Take $(\delta_k)_{k\ge1}$ to be any decreasing sequence tending to $0$ as $k \to \infty$. From Corollary 5.2, we know the relevant events have high probability in the limit. Since $E^{(\ell)}_{N,J,\delta_k,\varepsilon'} \subset E^{(\ell)}_{N,J+1,\delta_k,\varepsilon'}$, we can choose $J = J(\beta, \varepsilon', L)$ sufficiently large that the corresponding probability is suitably large. By the assumption $\delta_k > \delta_{k+1}$, we also have $E^{(\ell)}_{N,J,\delta_k,\varepsilon'} \subset E^{(\ell)}_{N,J,\delta_{k+1},\varepsilon'}$, and so we can choose $k = k(\beta, \varepsilon', L, J)$ sufficiently large that the desired bound holds. Henceforth we simply write $\delta = \delta_k$. Finally, we choose $N^* = N^*(\beta, \varepsilon', L, J, \delta)$ sufficiently large that $E^{(\ell)}_{N,J,\delta,\varepsilon'}$ has $\mathbb{P}_{\bar g^{(\ell)}}$-probability at least $1 - \varepsilon'$. Therefore, by our choice of $N^*$, we have the required estimate. To complete the proof, we set $\varepsilon' = \varepsilon/(4L)$ and combine the preceding bounds to obtain (5.12). To conclude the section, we return to the technical issue of measurability for the event $E^{(\ell)}_{N,J,\delta,\varepsilon}$ defined in (5.11). Let $\bar{\mathcal{F}}^{(\ell)}$ denote the sub-sigma-algebra of $\mathcal{F}$ generated by $\bar g^{(\ell)}$.
Lemma 5.4. If r is a non-constant rational function in n real variables, then for any t ∈ R, the set {x ∈ R n : r(x) = t} has zero Lebesgue measure.
Proof. Let us write $r = f/g$, where $f$ and $g$ are polynomials. Then $r - t = (f - tg)/g$, which vanishes if and only if $f - tg = 0$. By hypothesis, $f - tg$ is not identically equal to $0$, and so this polynomial may only vanish on a set of Lebesgue measure zero [21].
Lemma 5.5. Let $X \in \mathbb{R}^n$ be a random vector supported on $(\Omega, \mathcal{F}, \mathbb{P})$. Suppose that $f : \mathbb{R}^{n+m} \to \mathbb{R}$ is a continuous function such that $\mathbb{P}(f(X, y) = t) = 0$ for any $y \in \mathbb{R}^m$, $t \in \mathbb{R}$. Then the map $y \mapsto \mathbb{P}(f(X, y) \ge t)$ is continuous from $\mathbb{R}^m$ to $[0,1]$, for any $t \in \mathbb{R}$.
As $\varepsilon$ is arbitrary, the continuity of the map $y \mapsto \mathbb{P}(f(X,y) \ge t)$ has been proved. As before, let us partition this collection into the following disjoint sub-collections: $g^{(\ell)} = \{g_N(i,x) : i \in \llbracket n_{\ell-1}+1, n_\ell\rrbracket, x \in D_i\}$ and $\bar g^{(\ell)} = \{g_N(i,x) : i \in \llbracket 1, N\rrbracket \setminus \llbracket n_{\ell-1}+1, n_\ell\rrbracket, x \in D_i\}$. We will restrict our attention to these finite collections and thus regard all subsequent statements as concerning only finite-dimensional vectors. In keeping with this finite-dimensional perspective, we will write $\Sigma_N$ to denote the set of the $(2d)^N$ possible simple random walk paths starting at the origin and consisting of $N$ steps. It is natural to regard $\mu_{N,\beta}$ and the events $\{\sigma \in \Sigma_N : R^{(\ell)}(\sigma^j, \sigma) \ge \delta\}$, $\sigma^1, \dots, \sigma^J \in \Sigma_N$, $\delta > 0$, $\ell \in \llbracket 1, L\rrbracket$, as functions of these finite-dimensional vectors.

Acknowledgments
I am indebted to Sourav Chatterjee for suggesting the consideration of a time-dependent inverse temperature. This idea was the inspiration leading to the present article. I also thank Francis Comets for valuable discussion during the workshop on "Self-interacting random walks, Polymers and Folding" held at Centre International de Rencontres Mathématiques, and Hubert Lacoin for directing my attention to [10, Sec. 7]. Finally, I am very grateful to the referees for their corrections and suggestions, which improved the exposition.