An almost sure invariance principle for renormalized intersection local times

Let \beta_k(n) be the number of self-intersections of order k, appropriately renormalized, for a mean zero random walk X_n in Z^2 with 2+\delta moments. On a suitable probability space we can construct X_n and a planar Brownian motion W_t such that for each k\geq 2, |\beta_k(n)-\gamma_k(n)|=O(n^{-a}), a.s. for some a>0 where \gamma_k(n) is the renormalized self-intersection local time of order k at time 1 for the Brownian motion W_{nt}/\sqrt n.


1. Introduction.
If $\{W_t; t \ge 0\}$ is a planar Brownian motion with density $p_t(x)$, set $\gamma_{1,\epsilon}(t) = t$ and for $k \ge 2$ and $x = (x_2, \ldots, x_k) \in (\mathbb{R}^2)^{k-1}$ let When $x_i \ne 0$ for all $i$ the limit $\gamma_k(t, x) = \lim_{\epsilon \to 0} \gamma_{k,\epsilon}(t, x)$ exists and for any bounded continuous function $F(x)$ on $\mathbb{R}^{2(k-1)}$ we have (1.1) (Here we may arbitrarily specify that $\gamma_k(t, x) = \infty$ if any $x_i = 0$.) When $x_i \ne 0$ for all $i$ define the renormalized intersection local times as $\tilde\gamma_k(t, x) = \sum_{A \subseteq \{2,\ldots,k\}}$ where $x_{A^c} = (x_{i_1}, \ldots, x_{i_{k-|A|}})$ with $i_1 < i_2 < \cdots < i_{k-|A|}$ and $i_j \in \{2, \ldots, k\} - A$ for each $j$, that is, the vector $(x_2, \ldots, x_k)$ with all terms that have indices in $A$ deleted. It is known that the $\tilde\gamma_k(t, x)$ have a continuous extension to all of $\mathbb{R}^1_+ \times \mathbb{R}^{2(k-1)}$; see [3].
Renormalized self-intersection local time was originally studied by Varadhan [20] for its role in quantum field theory. In Rosen [18] it is shown that $\tilde\gamma_k(t, 0)$ can be characterized as the continuous process of zero quadratic variation in the decomposition of a natural Dirichlet process. Renormalized intersection local time turns out to be the right tool for the solution of certain "classical" problems such as the asymptotic expansion of the area of the Wiener sausage in the plane and the range of random walks; see [5], [9], [10]. For further work on renormalized self-intersection local times see Dynkin [7], Le Gall [11], Bass and Khoshnevisan [3], Rosen [17] and Marcus and Rosen [14].
Let $\xi_i$ be i.i.d. random variables with values in $\mathbb{Z}^2$ that are mean 0, with covariance matrix equal to the identity, and with $2 + \delta$ moments. Let us suppose the $\xi_i$ are symmetric and strongly aperiodic. Let $X_n$ be the random walk, that is, $X_n = \sum_{i=1}^n \xi_i$. Let $p(n, x, y)$ be the transition probabilities. Let $B_1(n, x) = n$ and for $x \in \mathbb{Z}^2$ set $B_2(n, x) = \sum_{0 \le i_1 < i_2 \le n} 1_{(X_{i_2} = X_{i_1} + x)}$.
Note that B k (n, x) = 0 for all n < k − 1.
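As a concrete illustration of the counting functional $B_2(n, x)$ defined above, the following is a minimal sketch that simulates a planar walk and counts the pairs $0 \le i_1 < i_2 \le n$ with $X_{i_2} = X_{i_1} + x$. The step law, and the names `STEP_VALUES`, `STEP_WEIGHTS`, `sample_walk` and `count_b2`, are assumptions made for illustration and do not come from the paper. The coordinates are taken independent with mean 0 and variance 1, so the covariance is the identity; the remaining hypotheses of the paper are not verified here.

```python
import random
from collections import Counter

# Hypothetical one-coordinate step law (an illustrative assumption, not from the paper):
# values -2..2 with probabilities 1/12, 1/6, 1/2, 1/6, 1/12, so mean 0 and variance 1.
STEP_VALUES = [-2, -1, 0, 1, 2]
STEP_WEIGHTS = [1, 2, 6, 2, 1]


def sample_walk(n, rng):
    """Return the path X_0 = (0, 0), X_1, ..., X_n as a list of integer pairs."""
    path = [(0, 0)]
    for _ in range(n):
        # Draw the two coordinates of one step independently.
        dx, dy = rng.choices(STEP_VALUES, weights=STEP_WEIGHTS, k=2)
        px, py = path[-1]
        path.append((px + dx, py + dy))
    return path


def count_b2(path, x):
    """Count pairs 0 <= i1 < i2 <= n with X_{i2} = X_{i1} + x."""
    seen = Counter()  # positions visited strictly before the current time
    total = 0
    for (px, py) in path:
        # Every earlier visit to X_{i2} - x contributes one pair (i1, i2).
        total += seen[(px - x[0], py - x[1])]
        seen[(px, py)] += 1
    return total


if __name__ == "__main__":
    walk = sample_walk(1000, random.Random(0))
    print(count_b2(walk, (0, 0)), count_b2(walk, (1, 0)))
```

The single pass with a counter keeps the cost linear in $n$, rather than quadratic as in a direct double loop over $(i_1, i_2)$.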
With $e_1 = (1, 0)$, let In particular we have Finally we define the renormalized intersection local times for our random walk by $\beta_k(n, x) = \frac{1}{n} B_k(n, x\sqrt{n})$. (1.4) In particular we have (1.5) We note from P12.3 of [19] that for $x \ne 0$ We know we can find a version of our random walk and a Brownian motion $W_t$ such that $\sup_{s \le 1} |X^n_s - W^n_s| = o(n^{-\zeta})$, a.s. (1.7) for some $\zeta > 0$, where $X^n_t = X_{[nt]}/\sqrt{n}$ and $W^n_t = W_{nt}/\sqrt{n}$; see [6], Theorem 3, for example. Let $\gamma_k(1, x, n)$ and $\tilde\gamma_k(1, x, n)$ be the intersection local times and renormalized intersection local times up to time 1 of order $k$, respectively, for the Brownian motion $W^n_t$.
In this paper we prove the following theorem.
Theorem 1.1. Let $X_n = \xi_1 + \cdots + \xi_n$ be a random walk in $\mathbb{Z}^2$, where the $\xi_i$ are i.i.d., mean 0, with covariance matrix equal to the identity, with $2 + \delta$ moments for some $\delta > 0$, symmetric, and strongly aperiodic. On a suitable probability space we can construct $\{X_n; n \ge 1\}$ and a planar Brownian motion $\{W_t; t \ge 0\}$ and we can find $\eta > 0$ such that for each $k \ge 2$, $|\beta_k(n, 0) - \tilde\gamma_k(1, 0, n)| = o(n^{-\eta})$, a.s.
For related work see [4], [5], [16].
We give a brief overview of the proof. There is an equation similar to (1.1) when $\gamma_k$ is replaced by $\tilde\gamma_k$, and also when it is replaced by $\beta_k$. Since by (1.7) $X^n_s$ is close to $W^n_s$ for $n$ large, we are able to conclude that $\int F(x) \tilde\gamma_k(1, x)\,dx$ is close to $\int F(x) \beta_k(n, x)\,dx$ for $n$ large. If $F$ is smooth and has integral 1, then by the continuity of $\tilde\gamma_k(t, x)$ in $x$, which is proved in [3], we see that $\int F(x) \tilde\gamma_k(1, x)\,dx$ is not far from $\tilde\gamma_k(1, 0)$. If we had a similar result for $\beta_k$, we would then have that $\int F(x) \beta_k(n, x)\,dx$ is not far from $\beta_k(n, 0)$, and we would have our proof. So our strategy is to obtain good estimates on $|\beta_k(n, x) - \beta_k(n, 0)|$. Because of the rate of convergence in (1.7), it turns out that we do not need the sharpest possible estimates on this difference, which simplifies the proof considerably.
Our main tool in obtaining the desired estimates is Proposition 3.2. This proposition may be of independent interest. It has been known for a long time that one way of proving L p estimates for a continuous increasing process is to prove corresponding estimates for the potential. It is not as well known that one can do the same for continuous processes of bounded variation provided one has some control on the total variation; see, e.g., [1] or [3]. Proposition 3.2 is the discrete time analogue of this result, and is proved in a similar way. Unlike the continuous time version, here it is also necessary to have control on the differences of successive terms. Section 2 has some estimates on the potential kernel for random walks in the plane, while Section 3 has the proof of the stochastic calculus results we need. Theorem 1.1 in the case when k = 2 is proved in Section 4, with the proofs of some lemmas postponed to Section 5. We treat the case k = 2 separately for simplicity of exposition. The description of the potentials of intersection local times of random walks in the k > 2 case is a bit different than in the k = 2 case and this is described in Section 6. Theorem 1.1 in the k > 2 case is proved in Section 7, with the proofs of some lemmas given in Sections 8 and 9. Finally in Section 10 we give an extension of our results to L 2 convergence, and more importantly, make a correction to the proof of one of the propositions in [3]. An Appendix contains the detailed proof of that correction. Throughout this paper we use the letter c to denote finite positive constants whose exact value is unimportant and which may vary from line to line.
In this section we prove some estimates for the potential kernel of a random walk. See the forthcoming book by Lawler [13] for further information. Let $G$ be the potential kernel for $X$. Recall that in 2 dimensions, since $X$ is recurrent, the potential kernel is defined somewhat differently than in higher dimensions, and is defined by where $e_1 = (1, 0)$. (Note $e_1$ can be replaced by any fixed point.) For us it will be more convenient to work with which, since $p(0, 0, 0) = 1$ and $p(0, 0, e_1) = 0$, differs from $G(x)$ by $1_{\{0\}}(x)$. By Spitzer [19] is finite. The rest of the assertions follow from Proof. By [19], P7.10, Since $p(j, 0, 0) \le 1$, then we have (2.4) Suppose $0 < |x| \le |y|$. We will choose $R$ in a moment. Using (2.4) for $j \le R$ and (2.2) for $j > R$, we have that If we select $R$ so that the result follows. Since $G(0)$ is finite and $|G(x)| \le c \log(1 + |x|) \le c|x|^{2/3}$, the result holds when either $x$ or $y$ is 0, as well.
Lemma 2.3. For some constant κ and any ρ < δ/2, Proof. Let us begin with the proof of Proposition 3.1 in [2]. We have for δ > 0 So if φ is the characteristic function of a random vector with finite 2 + δ moments, mean 0, and the identity as its covariance matrix, then Applying this also for the characteristic function of a standard normal vector, where E 2 (α, n) has the same bound as E 1 (α, n). If we use this in place of the display in the middle of page 473 of [2], we obtain So if E(n, x) = |p(n, 0, x) − (2πn) −1 e −|x| 2 /2n |, following the proof in [2] we obtain Let us choose δ ′ < δ. We then have . It is shown in the proof of Theorem 1.6.2 in [12] where q(k, x, y) = (2πk) −1 e −|x−y| 2 /2k . Thus, to prove the lemma, it suffices to prove for any ρ < δ/2. To establish (2.7), use [15], p. 60 to observe that and a similar estimate is easily seen to hold for q(k, x, 0). Therefore, using (2.6) and setting R = |x|,
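The quantity $E(n, x) = |p(n, 0, x) - (2\pi n)^{-1} e^{-|x|^2/2n}|$ used in the proof above can be computed exactly for a small example. The sketch below does this for a hypothetical walk with independent coordinates, each with mean 0 and variance 1; the step law and the names `STEP_LAW`, `one_dim_law` and `local_clt_error` are assumptions for illustration only. Because the coordinates are independent, $p(n, 0, x)$ factors into a product of one-dimensional $n$-step probabilities.

```python
import math

# Hypothetical one-coordinate step law (illustrative assumption): mean 0, variance 1.
STEP_LAW = {-2: 1 / 12, -1: 1 / 6, 0: 1 / 2, 1: 1 / 6, 2: 1 / 12}


def one_dim_law(n):
    """Law of the sum of n i.i.d. coordinate steps, as a dict value -> probability."""
    law = {0: 1.0}
    for _ in range(n):
        new_law = {}
        for pos, prob in law.items():
            for step, q in STEP_LAW.items():
                # Convolve the current law with one more step.
                new_law[pos + step] = new_law.get(pos + step, 0.0) + prob * q
        law = new_law
    return law


def local_clt_error(n, x):
    """E(n, x) = |p(n, 0, x) - (2 pi n)^{-1} exp(-|x|^2 / (2n))| for n >= 1."""
    law = one_dim_law(n)
    # Independent coordinates: the planar transition probability is a product.
    p = law.get(x[0], 0.0) * law.get(x[1], 0.0)
    gauss = math.exp(-(x[0] ** 2 + x[1] ** 2) / (2 * n)) / (2 * math.pi * n)
    return abs(p - gauss)


if __name__ == "__main__":
    for n in (5, 20, 80):
        print(n, local_clt_error(n, (0, 0)), local_clt_error(n, (3, 1)))
```

Printing the error for increasing $n$ gives a rough numerical feel for the decay that the lemma quantifies; it is not a substitute for the estimate itself.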

3. Stochastic calculus.
We will use the following propositions; these may be of independent interest. Propositions 3.1 and 3.2 and their proofs are the discrete time analogues of Propositions 6.1 and 6.2 of [3].
Proposition 3.1. Let A n be an adapted increasing sequence of random variables with A 0 = 0 and A ∞ = sup n A n finite. Suppose that and W is a random variable such that for all n. Then for each integer p larger than 1 there exists a constant c such that Proof. Since A n is increasing, Multiplying by A n − A n−1 , we obtain Summing over n we obtain On the other hand, applying the general summation formula Here we used the fact that Combining (3.1) and (3.2) we obtain Now suppose for the moment that Y is bounded and A n = A n 0 for all n ≥ n 0 for some n 0 . We have We write Therefore using Hölder's inequality, Our temporary assumptions on A allow us to divide both sides by (E A p ∞ ) 1− 1 p to obtain our result in this special case.
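The proof above appeals to a general summation formula whose display is not reproduced in the text. A standard candidate is Abel's summation by parts, $\sum_{n=1}^{N} a_n (b_n - b_{n-1}) = a_N b_N - a_0 b_0 - \sum_{n=1}^{N} b_{n-1} (a_n - a_{n-1})$; that this is the exact form used in the proof is an assumption. The following sketch, with hypothetical helper names `abel_lhs` and `abel_rhs`, checks the identity on explicit sequences.

```python
def abel_lhs(a, b):
    """Sum over n = 1..N of a_n * (b_n - b_{n-1}), where N = len(a) - 1."""
    return sum(a[n] * (b[n] - b[n - 1]) for n in range(1, len(a)))


def abel_rhs(a, b):
    """Boundary term a_N b_N - a_0 b_0 minus the sum of b_{n-1} * (a_n - a_{n-1})."""
    last = len(a) - 1
    boundary = a[last] * b[last] - a[0] * b[0]
    correction = sum(b[n - 1] * (a[n] - a[n - 1]) for n in range(1, last + 1))
    return boundary - correction


if __name__ == "__main__":
    a = [0, 1, 3, 6]  # an increasing sequence with a_0 = 0, as in Proposition 3.1
    b = [2, 5, 4, 7]
    print(abel_lhs(a, b), abel_rhs(a, b))
```

Both sides agree for any two sequences of equal length, which is what allows sums against increments of one sequence to be traded for sums against increments of the other.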

In general, look at
and apply the above to A ′′ n = A ′ n∧n 0 ; note that A ′′ will satisfy the hypotheses with the same W and Y . Then let K ↑ ∞ and next n 0 ↑ ∞ and use monotone convergence.
Proposition 3.2. Suppose Q 1 n and Q 2 n are two adapted nonnegative increasing sequences. Suppose Q n = Q 1 n − Q 2 n , H n + Q n is a martingale that is 0 at time 0, Then there exists c such that for p > 1 Proof. There is nothing to prove unless E W 2p < ∞. Since sup n Q n ≤ W , all the random variables that follow will satisfy the appropriate integrability conditions. Let us temporarily assume that there exists n 0 such that Note that V ∞ = 0, M m is a martingale, and Q m = M m − V m . In fact, in view of our temporary assumption, Our first observation is that since We will use This lemma will be proved shortly. We first show how Proposition 3.2 follows from this lemma. By the Burkholder-Davis-Gundy inequalities, we obtain Combining with (3.7) and the fact that Q m = M m − V m and then using Cauchy-Schwarz completes the proof of Proposition 3.2 in the special case where the Q i are constant from some n 0 on. In the general case, let Q n , let n 0 → ∞, and apply monotone convergence.
Proof of Lemma 3.3. We now prove (3.8). Algebra shows that We now take the conditional expectation with respect to F m .
Recalling that V n = M n − Q n we see that . By Cauchy-Schwarz and (3.12), We therefore conclude (3.13) and therefore Using (3.7), (3.16) and the fact that (3.8) then follows using (3.15) and Proposition 3.
is a martingale with M 0 = 0.
Proof. If

Now for any
where P j is the transition operator associated to p(j, x, y). Hence Comparing with (4.1) and using we see that as required.
The key to proving Theorem 1.1 in the k = 2 case is the following proposition.
for each integer p > 1 and x, In the proof of Proposition 4.2 we will need the following three lemmas, whose proofs are deferred until the next section.
Proof of Proposition 4.2. Converting from β's to B's, estimate (4.4) for k = 2 is equivalent to We want to apply Proposition 3.2. We fix an n. We use so that Q 1 and Q 2 are increasing and From Proposition 2.1, Lemmas 4.3 and 4.4, and the fact that |x|, |x ′ | ≤ √ n, we see and Combining (4.10), (4.11), Lemma 4.5, and the fact that 1 . This is the bound we need.
We also have by Lemma 2.3 that and it is easy to see from the support properties of f τ n (x) that On the other hand, recalling the notation By [3], (This conforms with the definition given in Section 1 above; the definition in [3] is very slightly different and would yield Combining the above, Recall f τ n (x)dx = 1. Without loss of generality we may assume ζ is small enough so that ψ n =: (If ζ were too large, then τ n would tend to 0 too quickly, and then the above estimate for ψ n might not be valid. In general one only has ψ n = 1 + O(n −1/2 τ −3 n ). ) Jensen's inequality and estimates (4.4), (4.5) imply that If we take p big enough, then By Borel-Cantelli, we conclude that A very similar argument to the above also shows that we have the analogue to estimate (4.4) is in [3]. Combining, we conclude that Remark 4.6. To see the importance of renormalization, note that if we also had the estimate (4.8) for B 2 (n, x) − B 2 (n, x ′ ), this would imply that uniformly in n which is impossible if p > 6 and n is sufficiently large.
Proof of Lemma 4.4. Let We show we are then done. But Proof of Lemma 4.5. We begin by estimating Next, using (2.1) Finally, We conclude that for any 0 < b < 2 Using the estimate of Proposition 2.2, the fact that symmetry tells us that E [(1 + |X i + x| 2 ) −1/3 ] is largest when x = 0, and the estimate (5.6) above, we obtain So by independence, usingX i ,Ē to denote an independent copy of X i and its expectation operator,

Using Proposition 3.1 with
Replacing x and y by −x and −x ′ , resp., and using the fact that n i=1 G(X i − x) is equal in law to n−1 i=0 G(X n − X i − x) yields the L p estimate that we want.
Let B 1,m (j, x) = j and for k ≥ 2 define Proof. We will show that for each k This will prove the proposition since, with the notation D k = D ∪ {k}, Setting From the definition ofŪ n we havē
Remark. The statement of Proposition 6.1 is not an exact analogue of that of Proposition 4.1. Consider the summands in the definition of $U_{k,m}(n, x)$: When $k = 2$ and $i = n$, this is nonrandom, whereas this is not the case when $k > 2$ and $i = n$. On the other hand, recalling that $B_{k-1}(i, x) = 0$ if $i < k - 2$, it is natural to define $B_{k-1}(-1, x)$ to be 0. It is also natural to define $B_{1,m}(i, x) = i$ for $i \ge 0$. Then (6.3) will be 0 if $i = 0$ for all $k \ge 2$, but the $i = 0$ term in the statement of Proposition 4.1 is not zero.
7. The case of general k.
The key to proving Theorem 1.1 for the case of general k is the following proposition.
In the proof of Proposition 7.1 we will need the following three lemmas, whose proofs are deferred until the next two sections. and Proof of Proposition 7.1. Converting from β's to B's, estimate (7.1) is equivalent to We want to apply Proposition 3.2. We fix an n. Let For i ≤ n set so that Q 1 and Q 2 are increasing and . By Proposition 6.1, Q i + H i is a martingale. Using Lemmas 7.2-7.4 and Proposition 2.1 to bound the right hand side of (3.5) in Proposition 3.2 and using the fact that 1 for x, x ′ ∈ (Z 2 ) k−1 with |x|, |x ′ | ≤ √ n, which implies (7.7). This is the bound we need.

8. Proofs of Lemmas 7.2-7.3.
These are again similar to the k = 2 case.
Proof of Lemma 7.2. Using Proposition 2.1 it suffices to prove (7.2) for all k.
Proof of Lemma 7.3. Using Proposition 2.1 it suffices to prove (7.4) for all k. Let then using We can then see that (8.5) equals which is (7.4).

9. Proof of Lemma 7.4.
This proof is substantially different from the proof of Lemma 4.5.
Proof of Lemma 7.4. We use induction on k. We already know (7.6) for k = 2. Thus assume (7.6) has been proved with k replaced by i for all i ≤ k − 1. Then as explained above in the proof of Proposition 7.1 we will have that (7.10) holds with k replaced by i for all i ≤ k − 1.
We will show that for m ≤ n. This and the inequality , where x k c is the same as (x 2 , . . . , x k−1 ), we have by (5.9) and (7.5).
After interchanging x ′ and x for convenience it remains to bound Using Proposition 2.1 and our inductive hypothesis concerning (7.10) we see that To complete the proof of (9.1) it therefore suffices to show that (9.6) By Propositions 2.1 and 2.3, G(x) is bounded above (but G(x) → −∞ as |x| → ∞). Let . . , m and let K i = J(X m − x) for i < 0. Let B be a small positive real to be chosen later and let We see that Since J is bounded in absolute value by c log m, the same is true for K i and L i for any i, i.e. sup i |K i − L i | ≤ c log m. (9.8) Note that L i and K i are independent of F h for i ≥ h + Bm, and thus Now by Proposition 2.2

By (5.6) and symmetry
Then using Hölder's inequality in the form $|E(fg)| \le \|f\|_3 \|g\|_{3/2}$ we obtain from the last two displays that Thus for $i \ge 2Bm$, summing over $j$ from $i - Bm$ to $i$ and dividing by $Bm$ shows Recalling (9.8)-(9.9) and then using Proposition 3.1 we have that for $n$ large. Combining this with (9.7), (7.5) and Cauchy-Schwarz, the left hand side of (9.7) is bounded in $L^p$ norm by $c(\log n)^{k-1} n^{1+(1/2p)} B^{1/3}$. (9.12) We use summation by parts on and we see that it is equal to Using the fact that $L_m$ is bounded by $c \log m$ and our inductive hypothesis concerning (7.10), we can bound the $L^p$ norm of the first term of (9.14) by c(log n) k−1 n 1+(k−1)/p w 8 −k+1 .

10. Other results.
A. L 2 norms. By Section 3 of [5] we see that we can choose W t and X n such that for some ζ > 0. If we then use this (in place of (1.7)), our proof shows that we obtain for some η > 0.
B. A correction. We take this opportunity to correct an error in [3]. In the statement of (8.3) in Theorem 8.1 of that paper, G ∨ := max 1≤j≤k−1 |G(x j )| should be replaced by N ∨ := max 1≤j≤k−1 |x j | −1 . The term G ∨ also needs to be replaced by N ∨ throughout the proof of (8.3). Proposition 9.2 of that paper is correct as stated. Where the proof of this proposition says to follow the lines of the proof of (8.3), it is to be understood that here one uses G ∨ throughout.
For the convenience of the interested reader we give a complete proof of that proposition in the following Appendix.
Appendix.
The proof of Proposition 9.2 in [3] is perhaps a bit confusing due to an error in the statement of (8.3) in Theorem 8.1 of that paper. This Appendix provides a complete proof of Proposition 9.2 of [3].
Write g(y) = 1 π log(1/|y|), γ 1 (x, t) = t, and for x = (x 2 , . . . , . . , k, and let g + = g ∨ (x) + g ∨ (x ′ ) + 1. There exist a k and ν k such that for k ≥ 2 Except for the restriction on the size of x, x ′ , this is Proposition 9.2 of [3] translated to the notation of this paper. Using the argument of [3] this is sufficient to prove the joint continuity of γ k (t, x) over t ∈ Note that renormalization allows us to use U k (t, x) in place of If one were to try to use U * k (t, x) in Proposition A.1, the right hand sides of (a) and (b) would have to have g + replaced by N ∨ , which is not a good enough bound for the joint continuity argument.
Proof. Since g + is infinite if any component of x or x ′ is zero, we may assume that no component of either is 0.
Let A ∈ (0, 1 2 ] be chosen later and let The proof is by induction. We start with k = 2. In preparation for general k we retain the general notation, but note that when k = 2, we have L 2 (t, x) = t, x k c is superfluous, and we have γ k−1 (dr, x k c ) = dr.
If we connect x, x ′ by a curve Γ of length c|x − x ′ | that never gets closer to 0 than |x| ∧ |x ′ |, use the fact that |∇g A | ≤ A −1 , and use inequality (8.1) of [3] (this is only needed for k > 2) By Proposition 5.2 of [3], for some constants b 1 and ν ′ and similarly for I 3 . We next turn to I 4 . Standard estimates on Brownian motion tells us that | log(1/|W t − W r − x k |)| 2p ]) 1/2 (P( sup 0≤r≤t≤1 |W t − W r | ≥ (2A) −1 )) 1/2 ≤ cA p for each p ≥ 1. Using the fact that another application of Cauchy-Schwarz shows that E |I 4 | p ≤ (g + ) ν 1 p A p .
I 5 is handled the same way.
Next we look at (b) for the k = 2 case. We write =: I 6 + I 7 + I 8 − I 9 − I 10 + I 11 .
We bound I 8 and I 9 just as we did I 2 and bound I 10 and I 11 as we did I 4 . Combining, the left hand side of (A.5) is bounded by c(g + ) ν 2 p [A −p |t − s| p/2 + A b 1 p + A p ], and (b) follows by setting A = |t − s| 1/4 ∧ (2M ) −1 .
We now turn to the case when k > 2. We suppose (a) and (b) hold for k − 1 and prove them for k. We prove (a) in two cases, when x k c = x ′ k c and when x k = x ′ k ; the general case follows by the triangle inequality. Suppose first that x k c = x ′ k c . Using the induction hypothesis, the proof is almost exactly the same as the proof of (a) in the case k = 2.
Suppose next that x k = x ′ k . Let Standard estimates on Brownian motion show that We write Since |g A | ≤ log(1/A), for some constant ν k−1 and similarly for I 13 . We bound I 14 and I 15 just as we did I 2 and bound I 16 and I 17 as we did I 4 .
We turn to $I_{18}$. Let $f(r) = g_A(W_t - W_r - x_k)$ and f A (t) = 1 A 12 Using integration by parts, we write We bound $E|I_{19}|^p \le cA^p (g^+)^{\nu'_{k-1} p}$ for some constant $\nu'_{k-1}$ and similarly for $I_{20}$. By the induction hypothesis and the fact that $|f_A|$ is bounded by $\log(1/A) \le cA^{-p}$, If we combine all the terms, we see that the left hand side of (A.6) is bounded by Setting $A = |x - x'|^{a''_{k-1}/(2b'_k)} \wedge (2M)^{-1}$ completes the proof of (a) for $k > 2$. The proof of (b) for $k > 2$ is almost identical to the $k = 2$ case.