On the size of earthworm's trail

We investigate the number of holes created by an ``earthworm'' moving on the two-dimensional integer lattice. The earthworm is modeled by a simple random walk. At the initial time, all vertices are filled with grains of soil except for the position of the earthworm. At each step, the earthworm pushes the soil in the direction of its motion. It leaves a hole (an empty vertex with no grain of soil) behind it. If there are holes in front of the earthworm (in the direction of its step), the closest hole is filled with a grain of soil. Thus the number of holes increases by 1 or remains unchanged at every step. We show that the number of holes is at least $\mathcal{O}(n^{3/4})$ after $n$ steps.


Introduction
We will investigate the number of holes created by an "earthworm" moving on the two-dimensional integer lattice. The earthworm is modeled by a simple random walk. At the initial time, all vertices of $\mathbb{Z}^2$ are filled with grains of soil except for the position of the earthworm. At each step, the earthworm pushes the soil in the direction of its motion. It leaves a hole (an empty vertex with no grain of soil) behind it. If there are holes in front of the earthworm (in the direction of its step), the closest hole is filled with a grain of soil. See Section 2 for the rigorous definition.
An earthworm model very similar to ours was investigated in [BHP13] except that the state space was a two-dimensional discrete torus. In that paper, the number of holes was constant but their distribution changed over time and converged to the stationary regime. The "dimension" of the set of holes was investigated; the definition was based on the local scaling properties of the set of holes. A similar definition of the "dimension" of the set of holes was adopted in [BBF+22] but the state space was the whole $\mathbb{Z}^2$. In both cases, no theorems were proved; only the results of simulations were reported. The dimension of the set of holes was close to 3/2 but appeared to be strictly larger than 3/2.
In this article, we will prove a theorem supporting the conjecture that the dimension of the set of holes is at least 3/2 but our interpretation of the "dimension" will be different. Namely, we will show that the number of holes is at least $O(n^{3/4})$ after $n$ steps with high probability. See Theorem 2.1 for a precise statement. Note that the exponent 3/4 in our theorem is consistent with the "dimension" 3/2, in the sense that after $n^2$ steps, the random walk will visit about $n^2$ vertices, on the same order as the number of sites in a square with side $n$, and then the number of holes will be at least $O(n^{3/2})$.
The model appears to be very hard to analyze rigorously despite its simple nature. We hope that our result, simulations, and open problems will inspire other researchers.
The model and our main result will be presented in Section 2. This will be followed by Section 3 containing the proofs. Finally, Section 4 will present some open problems and simulations.
After submitting the paper for publication, we learned from the anonymous Referee that our main result is very close to [BW03, Lem. 2]. This is because "tan points" introduced in [BW03] are essentially the same as points $B^{\mathrm{right}}_n$ defined in Section 3 below. In view of this preexisting research, the contributions of the present paper include the rigorous introduction of the earthworm model and Conjecture 4.1 which, in our opinion, is challenging but not impossible to resolve. The proof of our main result, Theorem 2.1, seems to be considerably simpler than that of [BW03, Lem. 2]. The latter was based on hard estimates from [BMS02]. Our Theorem 2.1 is slightly stronger than [BW03, Lem. 2] because $\varepsilon > 0$ in the statement of our theorem can be arbitrarily small.

Model, notation and the main result
The earthworm is represented by a simple random walk $X_n$ on $\mathbb{Z}^2$, starting at $X_0 = (0, 0)$. At every time $n$, every lattice point is in one of two states: either it is a hole or it is filled with a grain of soil. At time $n = 0$, $(0, 0)$ is a hole and every other site is filled. Let $\mathcal{H}_n$ denote the set of holes at time $n$. Then $\mathcal{H}_0 = \{(0, 0)\}$.
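The process $(X_n, \mathcal{H}_n)$ is Markov with the following dynamics. Suppose $X_n = (x_n, y_n)$ and the earthworm goes to the right at the next step, i.e., $X_{n+1} = (x_n + 1, y_n)$. We need to check if there is a hole to the right of $X_n$. Let
$$X^{\mathrm{right}}_n = \{(x_n + a, y_n) : a \geq 1\}.$$
If $X^{\mathrm{right}}_n \cap \mathcal{H}_n = \emptyset$ then we say that the earthworm created a hole at position $X_{n+1} = (x_n + 1, y_n)$, and we let
$$\mathcal{H}_{n+1} = \mathcal{H}_n \cup \{X_{n+1}\}.$$
If $X^{\mathrm{right}}_n \cap \mathcal{H}_n \neq \emptyset$ and $X_{n+1}$ is not a hole then we let the earthworm create a hole at $X_{n+1}$, push the soil in front of it, and eliminate the nearest hole to the right of $X_{n+1}$. If $X^{\mathrm{right}}_n \cap \mathcal{H}_n \neq \emptyset$ and $X_{n+1}$ is a hole then no holes are created or annihilated. More precisely, let $(x^*, y_n) \in X^{\mathrm{right}}_n \cap \mathcal{H}_n$ be the element with the smallest $x$-coordinate. Then let
$$\mathcal{H}_{n+1} = \big(\mathcal{H}_n \setminus \{(x^*, y_n)\}\big) \cup \{X_{n+1}\}.$$
In this case, we also say that the earthworm transferred the hole from $(x^*, y_n)$ to $X_{n+1}$.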
If the earthworm goes in any other direction at time $n+1$, the mechanism is the same with respect to that direction: check if there are any holes in front of the earthworm and update holes accordingly. This completes the definition of the process $(X_n, \mathcal{H}_n)$.
Let $\mathcal{F}_i$ be the $\sigma$-field generated by $X_0, X_1, \dots, X_i$. Note that $\mathcal{H}_i$ is $\mathcal{F}_i$-measurable for every $i$.
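The dynamics just defined can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the simulations reported in Section 4. The names `nearest_hole_ahead`, `run_earthworm` and `simulate` are hypothetical, and the linear scan over all holes at each step is an assumption made for brevity, so the sketch is only practical for moderate $n$.

```python
import random

# The four unit steps of simple random walk on Z^2: right, left, up, down.
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def nearest_hole_ahead(pos, holes, step):
    """Return the hole closest to `pos` on the ray from `pos` in direction `step`
    (excluding `pos` itself), or None if that ray contains no hole."""
    dx, dy = step
    best, best_dist = None, None
    for hx, hy in holes:
        # Signed distance of the hole from `pos` along the direction of the step.
        a = (hx - pos[0]) * dx + (hy - pos[1]) * dy
        on_ray = a >= 1 and (hx, hy) == (pos[0] + a * dx, pos[1] + a * dy)
        if on_ray and (best_dist is None or a < best_dist):
            best, best_dist = (hx, hy), a
    return best


def run_earthworm(moves):
    """Run the earthworm along a given list of unit steps, starting at (0, 0).

    Returns the final position, the final set of holes, and a list whose k-th
    entry is 1 if the k-th step created an extra hole and 0 otherwise."""
    pos = (0, 0)
    holes = {pos}  # at time 0 the only hole is the earthworm's position
    created = []
    for step in moves:
        target = nearest_hole_ahead(pos, holes, step)
        pos = (pos[0] + step[0], pos[1] + step[1])
        if target is None:
            created.append(1)       # no hole ahead: the number of holes grows by 1
        else:
            holes.discard(target)   # the nearest hole ahead is filled with soil
            created.append(0)
        holes.add(pos)              # the new position is always a hole
    return pos, holes, created


def simulate(n, seed=None):
    """Return one sample of the number of holes after n random steps."""
    rng = random.Random(seed)
    moves = [rng.choice(STEPS) for _ in range(n)]
    _, holes, _ = run_earthworm(moves)
    return len(holes)
```

For example, `run_earthworm([(0, 1), (1, 0), (1, 0), (0, -1), (-1, 0)])` ends with a transfer: the last step moves left from $(2, 0)$, finds the hole at $(0, 0)$ two sites ahead, and moves it to $(1, 0)$.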
The indicator random variable of an event $A$ will be denoted $I(A)$. Let $H_i$ be the indicator of the event that the earthworm created an extra hole (i.e., that the number of holes increased by 1) at time $i$ for $i \geq 1$. Set $H_0 = 1$. Let $S_n$ be the total number of holes at time $n$. Then,
$$S_n = \sum_{i=0}^{n} H_i.$$
The goal of this paper is to provide a lower bound for $S_n$. The following is our main result.

Proofs
We will first derive a lower bound for the expectation of $S_n$ using a Beurling-type estimate for random walks. Then we will give an upper bound for the variance of $S_n$. We will combine these estimates using the "second-moment method" and the associated Paley-Zygmund inequality. Finally, we will show that $S_n$ cannot fluctuate too much to fall below the order of its expectation.
Proof. We start with two observations that are crucial for the proof. First, the earthworm creates a new hole at time $n + 1$, i.e., $H_{n+1} = 1$, if and only if there is no hole in the direction of its motion at time $n$. Second, for any $(x, y) \in \mathbb{Z}^2$, if the earthworm did not visit $(x, y)$ before or at time $n$ then $(x, y)$ is not a hole at time $n$. Therefore
$$\{H_{n+1} = 1\} = \{\text{There is no hole in the direction of the step at time } n\} \supseteq \{\text{The earthworm did not visit any points in the direction of the step before or at time } n\}. \tag{3.1}$$
We will define the last event in a more precise way and estimate its probability.
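Denote $A^{\mathrm{right}}_{n+1} = \{X_{n+1} - X_n = (1, 0)\}$, i.e., $A^{\mathrm{right}}_{n+1}$ is the event that $X$ goes to the right at the $(n + 1)$st step. We will use similar notation for the other directions: $A^{\mathrm{left}}_{n+1}$, $A^{\mathrm{up}}_{n+1}$, $A^{\mathrm{down}}_{n+1}$. Let
$$B^{\mathrm{right}}_n = \{X_k \neq X_n + (a, 0) \text{ for all } 0 \leq k \leq n \text{ and } a \geq 1\},$$
i.e., $B^{\mathrm{right}}_n$ is the event that the random walk did not visit any point to the right of $X_n$ before time $n$. We will use the analogous notation: $B^{\mathrm{left}}_n$, $B^{\mathrm{up}}_n$, $B^{\mathrm{down}}_n$. Note that $B^i_n \in \mathcal{F}_n$ for $i \in \{\mathrm{right}, \mathrm{left}, \mathrm{up}, \mathrm{down}\}$. The event on the right hand side of (3.1) can be expressed as the union of four events as follows,
$$\bigcup_{i \in \{\mathrm{left}, \mathrm{right}, \mathrm{up}, \mathrm{down}\}} \big(A^i_{n+1} \cap B^i_n\big). \tag{3.2}$$
Therefore, conditioning on $\mathcal{F}_n$, applying (3.1) and (3.2) and using the symmetry of simple random walk, we obtain
$$E[H_{n+1}] \geq \sum_{i} E\big[I(B^i_n)\, P(A^i_{n+1} \mid \mathcal{F}_n)\big] = \frac{1}{4} \sum_{i} E[I(B^i_n)] = E[I(B^{\mathrm{right}}_n)]. \tag{3.3}$$
For a fixed $n$, let $Y_i = X_{n-i} - X_n$ for $i = 0, 1, \dots, n$, and note that $Y$ is a simple random walk starting from $Y_0 = (0, 0)$. We denote by $C^{\mathrm{right}}_n$ the event that $Y_0, \dots, Y_n$ did not visit the positive $x$-axis. Combining (3.3) with the estimate (3.4) for $C^{\mathrm{right}}_n$ and using the integral approximation, we obtain, for some $c > 0$ and all sufficiently large $n$,
$$E[S_n] = \sum_{i=0}^{n} E[H_i] \geq c\, n^{3/4}.$$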

□
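The integral approximation in the last step of the proof can be spelled out as follows. This is a sketch under the assumption that the per-step bound $E[H_{i+1}] \geq c_* i^{-1/4}$ holds for all $i \geq n_0$, where $n_0 \geq 1$ is a hypothetical threshold supplied by (3.4); the terms with $i < n_0$ are simply dropped because they are nonnegative.

```latex
\begin{align*}
E[S_n] \;\ge\; \sum_{i=n_0}^{n-1} E[H_{i+1}]
  \;\ge\; c_* \sum_{i=n_0}^{n-1} i^{-1/4}
  \;\ge\; c_* \int_{n_0}^{n} x^{-1/4}\, dx
  \;=\; \frac{4 c_*}{3} \left( n^{3/4} - n_0^{3/4} \right).
\end{align*}
```

The third inequality uses that $x \mapsto x^{-1/4}$ is decreasing, so each term $i^{-1/4}$ dominates the integral over $[i, i+1]$. For $n$ large compared with $n_0$, the right hand side is at least a constant multiple of $n^{3/4}$.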
To bound the variance of $S_n$, we need to bound $E[H_i H_j]$. Inequality (3.5) in the following lemma will be used to derive an upper bound. Inequality (3.6) will be used later.
Lemma 3.2. For any $0 \leq i \leq j$,
The dynamics of $\mathcal{H}'_k$, i.e., the mechanism of creating new holes, is the same as for the original model. In other words, all holes created by $X$ before time $i$ have been erased but otherwise, $X'$ follows the same trajectory as that of $X$.
Let $H'_k$ be the indicator of the event that $X'$ creates a new hole at time $k$ for $k > i$.
As a consequence of the erasure of the initial holes created by $X$, the time shift, and the Markov property, we obtain
$$E[H'_j \mid \mathcal{F}_i] = E[H_{j-i}]. \tag{3.7}$$
It requires a moment's thought but it is totally elementary to see that if $\mathcal{H}'_k \subseteq \mathcal{H}_k$ then $\mathcal{H}'_{k+1} \subseteq \mathcal{H}_{k+1}$, for $k \geq i$, since the creation of new holes is governed in both cases by identical steps of the two earthworms. Recall that $\mathcal{H}'_i \subseteq \mathcal{H}_i$. By induction, $\mathcal{H}'_k \subseteq \mathcal{H}_k$ for all $k \geq i$. Just before the $j$th step, $X$ and $X'$ are at the same location, and they will go in the same direction at step $j$. Since $\mathcal{H}'_{j-1} \subseteq \mathcal{H}_{j-1}$, if $X$ does not have a hole in the direction of the next step, $X'$ also does not have a hole in the direction of the next step. Therefore, if $H_j = 1$ then $H'_j = 1$. Hence, $H_j \leq H'_j$. Taking conditional expectation with respect to $\mathcal{F}_i$ and using (3.7), we obtain
$$E[H_j \mid \mathcal{F}_i] \leq E[H'_j \mid \mathcal{F}_i] = E[H_{j-i}].$$
This proves (3.5).
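The monotonicity used in this argument can be checked numerically on sample paths. The sketch below is an illustration under stated assumptions, not part of the proof: the helper names `run` and `coupling_holds` are hypothetical, and the restarted earthworm $X'$ is modeled by starting a fresh copy of the dynamics at the position of $X$ at time $i$, with that position as its only hole, and feeding it the remaining steps of $X$.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def run(start, moves):
    """Run the earthworm dynamics from `start`, whose only initial hole is `start`.

    Returns the final position, the list of hole sets (one snapshot per time,
    including time 0), and the list of hole-creation indicators (one per step)."""
    pos, holes = start, {start}
    snapshots, created = [frozenset(holes)], []
    for dx, dy in moves:
        on_ray = []
        for hx, hy in holes:
            a = (hx - pos[0]) * dx + (hy - pos[1]) * dy
            if a >= 1 and (hx, hy) == (pos[0] + a * dx, pos[1] + a * dy):
                on_ray.append((a, (hx, hy)))
        pos = (pos[0] + dx, pos[1] + dy)
        if on_ray:
            holes.discard(min(on_ray)[1])  # fill the nearest hole ahead
        created.append(0 if on_ray else 1)
        holes.add(pos)
        snapshots.append(frozenset(holes))
    return pos, snapshots, created


def coupling_holds(moves, i):
    """Check, on one path, that the restarted worm's holes are always contained in
    the original worm's holes, and that whenever the original worm creates a hole
    after time i, the restarted worm does too."""
    _, full, full_created = run((0, 0), moves)
    x_i = (sum(dx for dx, _ in moves[:i]), sum(dy for _, dy in moves[:i]))
    _, fresh, fresh_created = run(x_i, moves[i:])
    sets_ok = all(fresh[k] <= full[i + k] for k in range(len(fresh)))
    indicators_ok = all(
        full_created[i + k] <= fresh_created[k] for k in range(len(fresh_created))
    )
    return sets_ok and indicators_ok
```

A quick experiment is to draw a few random paths with `random.Random(seed).choice(STEPS)` and call `coupling_holds(moves, i)` for several values of `i`; the check should succeed on every path.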
To prove (3.6), we denote
The following formula holds for the same reasons as those for (3.7),
Therefore, taking conditional expectation with respect to $\mathcal{F}_i$, we get
Given (3.5), we can bound the variance of $S_n$ by expanding it into the sum of $H_i$'s. Then we will use the Paley-Zygmund inequality to derive a lower bound for $S_n$.
Lemma 3.3. For all $n \geq 0$ and $0 < \theta < 1$,
Proof. By (3.5), for any $0 \leq i \leq j$,
Taking expectation on both sides, we get
By the Paley-Zygmund inequality (see [FG97, Sect. 5.1, Cor. 5] or [Pet07, (13)]), for any $0 < \theta < 1$,
□
By combining Lemmas 3.1 and 3.3 we see that, with positive probability, $S_n$ is at least of the order of $E[S_n]$, which is at least of the order of $n^{3/4}$. We will improve this result and show that the probability can be arbitrarily close to 1 using (3.6).
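For reference, the Paley-Zygmund inequality in its standard form says that for a nonnegative random variable $Z$ with finite second moment and $0 < \theta < 1$, $P(Z > \theta E[Z]) \geq (1 - \theta)^2 E[Z]^2 / E[Z^2]$. The following minimal Python sketch evaluates both sides for the empirical distribution of a list of samples. The function name `paley_zygmund_sides` is hypothetical, and the sketch illustrates only the general inequality, not the specific constants of Lemma 3.3.

```python
def paley_zygmund_sides(samples, theta):
    """Return (frequency, lower_bound) for the empirical distribution of `samples`.

    frequency   = fraction of samples strictly above theta * mean
    lower_bound = (1 - theta)^2 * mean^2 / second_moment

    Assumes that the samples are nonnegative, not all zero, and 0 < theta < 1."""
    n = len(samples)
    mean = sum(samples) / n
    second_moment = sum(z * z for z in samples) / n
    lower_bound = (1.0 - theta) ** 2 * mean ** 2 / second_moment
    frequency = sum(1 for z in samples if z > theta * mean) / n
    return frequency, lower_bound
```

Since the empirical distribution is itself a probability distribution on nonnegative numbers, the returned frequency is at least the returned lower bound, up to floating-point rounding. Applied to simulated values of $S_n$, a lower bound that stays away from 0 as $n$ grows reflects a second moment of $S_n$ that is comparable to the square of its mean.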
Lemma 3.4. For every $\varepsilon > 0$ there exist $\delta > 0$ and $n_0$ such that for $n \geq n_0$,
$$P(S_n \geq \delta E[S_n]) \geq 1 - \varepsilon.$$
Proof. We start with a heuristic outline of the proof. We will subdivide the interval $[0, n]$ into $m$ equally long subintervals with endpoints $t_i = i(n/m)$, $0 \leq i \leq m$, where $m \geq 1$ will be determined later. In the main part of the proof, we will assume that $n$ is divisible by $m$ so that the $t_i$'s are integers. We will use (3.6) to show that $S_{t_i}$ is larger than $\delta E[S_n]$, where $\delta$ depends on $m$, with probability bounded below no matter whether the events $\{S_{t_j} \geq \delta E[S_n]\}$ occurred or not for $j < i$. This implies that the probability of $\{S_{t_m} \geq \delta E[S_n]\}$ goes to 1 exponentially fast as $m$ increases.
Fix an arbitrary $\varepsilon > 0$. To start the rigorous proof, we apply expectation to both sides of (3.5) and take $i = 1$ and $j$
This and the linearity of expectation imply that $E[S_{n/m}] \geq E[S_n]/m$ for any positive integer $m$. Taking $\delta = 1/(2m)$, we get
Hence, it will suffice to find an $m$ such that
Recall that $t_i = i(n/m)$ for $0 \leq i \leq m$, and let
]/2} so we only need to prove that $P(A_m) \leq \varepsilon$.
We have $S_0 = 1$. It is easy to see that $P(S_n = 2) > 0$ for all $n \geq 1$. Lemma 3.1 implies that for any fixed $m$ there exists $n_1$ such that for $n \geq n_1$, $E[S_{n/m}]/2 \geq 2$. This implies that $P(A_i) > 0$ for all $0 \leq i \leq m$ if $n \geq n_1$.
We pointed out earlier in the proof that
This proves the lemma with $\delta'$ in place of $\delta$. □
Proof of Theorem 2.1. Fix any $\varepsilon > 0$. By Lemma 3.4 we can find $\delta_1 > 0$ such that
Simple random walk is transient in dimensions higher than 2, so for dimensions 4 and higher,
Therefore, Theorem 2.1 can be extended to higher dimensions $d$ as follows. For every $\varepsilon > 0$ there exists $\delta > 0$ such that $\liminf$

Open problems and conjectures
The earthworm model is very simple but it seems to be rather hard to analyze. We will present some simulation results and open problems and conjectures based on the simulations.
4.1. Dimension of the set of holes. Our simulations of the set of holes created by the earthworm suggest that $S_n \sim n^{\alpha}$ with $\alpha \approx 0.79$. This is consistent with the simulation results in [BHP13, BBF+22]. Fig. 1 shows the results of simulations of $S_n$ for $n = 10^4, 3 \cdot 10^4, 10^5, 3 \cdot 10^5, 10^6, 3 \cdot 10^6, 10^7$. For each $n$, we generated 10 i.i.d. samples of $S_n$ and calculated their means $\mu_n$ (estimates of $E[S_n]$). The regression line for $(\log(\mu_n), \log(n))$ is $y = 0.79x + 0.06$. Fig. 1 shows the plot of the values of $(\log(\mu_n), \log(n))$ and the regression line.
(ii) Find the distribution of sizes of connected components of the "complement" of $\mathcal{H}_n$ in the trail of the earthworm, i.e., $\bigcup_{1 \leq k \leq n} \{X_k\} \setminus \mathcal{H}_n$.
4.3. Central Limit Theorem. The following remarks on the CLT for $S_n$ are highly speculative in view of the fact that we do not even have a good understanding of the mean of $S_n$. Nevertheless, we present our simulation results in Fig. 3.
We set $n = 10{,}000$ and generated 2,000 i.i.d. samples of $S_n$. The histogram in Fig. 3 suggests that the CLT may hold for $S_n$. The Kolmogorov-Smirnov test (see [MJ51]) yields the statistic equal to 0.0214 which gives the p-value equal to 0.315, supporting the CLT conjecture.
Email address: burdzy@uw.edu, fengshi@uw.edu
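The Kolmogorov-Smirnov comparison with the normal distribution described in Section 4.3 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for Fig. 3: the function names are hypothetical, the samples are standardized with their own sample mean and standard deviation, and the p-value uses the asymptotic Kolmogorov series, which is only approximate when the mean and variance are estimated from the same data.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_statistic_vs_normal(samples):
    """Kolmogorov-Smirnov distance between the empirical distribution of the
    standardized samples and the standard normal distribution."""
    n = len(samples)
    mean = sum(samples) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    z = sorted((s - mean) / sd for s in samples)
    d = 0.0
    for i, zi in enumerate(z, start=1):
        f = normal_cdf(zi)
        # The empirical CDF jumps from (i - 1) / n to i / n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic p-value 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 n d^2).

    Intended for sqrt(n) * d not too close to 0, where the series converges fast."""
    x2 = n * d * d
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x2) for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))
```

With the reported statistic 0.0214 and sample size 2,000, the asymptotic formula `kolmogorov_pvalue(0.0214, 2000)` gives a value of roughly 0.32, close to the reported p-value 0.315.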

$C^{\mathrm{right}}_n = \{Y_k \neq (a, 0) \text{ for all } 0 \leq k \leq n \text{ and } a \geq 1\}$, i.e., $C^{\mathrm{right}}_n$ is the event that $Y_0, \dots, Y_n$ did not visit the positive $x$-axis. Note that $B^{\mathrm{right}}_n = C^{\mathrm{right}}_n$. By [Law13, (2.35)] there exists a constant $c_* > 0$ such that
$$\liminf_{n \to \infty} n^{1/4} E[I(C^{\mathrm{right}}_n)] > c_*. \tag{3.4}$$
This, the equality of events $B^{\mathrm{right}}_n$ and $C^{\mathrm{right}}_n$, and (3.3) imply that $E[H_{n+1}] \geq c_* n^{-1/4}$ for all sufficiently large $n$.
Conjecture 4.1. $\liminf_{n \to \infty} \log(E[S_n]) / \log(n) > 3/4$.
The above conjecture is a mild version of the following apparently very hard problem.

Figure 2. Locations of holes $\mathcal{H}_n$ after $n = 10^8$ steps.
for sufficiently large $n$. Lemma 3.1 implies that there exists $c_2 > 0$ such that $E[S_n] \geq c_2 n^{3/4}$ for sufficiently large $n$. Hence for $\delta = c_2 \delta_1$,