The Littlewood-Offord Problem for Markov Chains

The celebrated Littlewood-Offord problem asks for an upper bound on the probability that the random variable $\epsilon_1 v_1 + \cdots + \epsilon_n v_n$ lies in the Euclidean unit ball, where $\epsilon_1, \ldots, \epsilon_n \in \{-1, 1\}$ are independent Rademacher random variables and $v_1, \ldots, v_n \in \mathbb{R}^d$ are fixed vectors of at least unit length. We extend many known results to the case in which the $\epsilon_i$ are obtained from a Markov chain, including the general bounds first shown by Erd\H{o}s in the scalar case and by Kleitman in the vector case, as well as the bound of S\'ark\"ozy and Szemer\'edi under the restriction that the $v_i$ are distinct integers. In all extensions, the upper bound includes an extra factor depending on the spectral gap. We also construct a pseudorandom generator for the Littlewood-Offord problem using similar techniques.


Introduction
Let $v_1, \ldots, v_n \in \mathbb{R}^d$ be fixed vectors of Euclidean length at least 1, and let $\epsilon_1, \ldots, \epsilon_n$ be independent Rademacher random variables, so that $\Pr[\epsilon_i = 1] = \Pr[\epsilon_i = -1] = 1/2$ for all $i$. The celebrated Littlewood-Offord problem [11] asks for an upper bound on the probability $\Pr[\epsilon_1 v_1 + \cdots + \epsilon_n v_n \in B]$ (1.1) for an open Euclidean ball $B$ of radius 1. This question was first investigated by Littlewood and Offord for the cases $d = 1$ and $d = 2$ [11]. For the case $d = 1$, Erdős found a tight bound of $\binom{n}{n/2}/2^n = \Theta(1/\sqrt{n})$ when $n$ is even, with the worst case attained when all the vectors are equal, using a clever combinatorial argument [1]. Such bounds can be contrasted with concentration inequalities like the Hoeffding inequality in the scalar case and the Khintchine-Kahane inequality in the vector case, both of which give an upper bound on the probability $\Pr[\|\epsilon_1 v_1 + \cdots + \epsilon_n v_n\| \geq k\sqrt{n}]$ for positive $k$. In contrast, an upper bound on Eq. (1.1) can be considered a form of anti-concentration, that is, a statement that the random sum is unlikely to land in $B$.
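In the extremal case $v_1 = \cdots = v_n = 1$ with $n$ even, the probability of landing in the open unit ball is exactly $\Pr[\epsilon_1 + \cdots + \epsilon_n = 0] = \binom{n}{n/2}/2^n$. A quick Python sketch (our own illustration, not part of the paper) of the $\Theta(1/\sqrt{n})$ rate:

```python
from math import comb, sqrt, pi

def erdos_bound(n):
    """Exact probability that a sum of n independent Rademacher signs
    equals 0, for n even: C(n, n/2) / 2^n."""
    return comb(n, n // 2) / 2 ** n

# By Stirling's approximation, C(n, n/2) / 2^n ~ sqrt(2 / (pi * n)),
# so erdos_bound(n) * sqrt(n) approaches sqrt(2/pi) ~ 0.798.
for n in (10, 100, 1000):
    print(n, erdos_bound(n), erdos_bound(n) * sqrt(n))
```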
In the case that the $v_i$ are $d$-dimensional vectors, a bound of $C/\sqrt{n}$, tight up to constant factors, was found by Kleitman [8], and was subsequently improved by a series of works [16,17,3,20].
In the scalar case, under the restriction that $v_1, \ldots, v_n$ are distinct integers, an upper bound of $Cn^{-3/2}$ for a universal constant $C$ was found by Sárközy and Szemerédi [18].
In this work, we investigate the case in which $\epsilon_1, \ldots, \epsilon_n$ are not independent, but are obtained from a stationary reversible Markov chain $\{Y_i\}_{i=1}^{\infty}$ with state space $[N]$ and transition matrix $A$, together with functions $f_1, \ldots, f_n : [N] \to \{-1, 1\}$, setting $\epsilon_i = f_i(Y_i)$.
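Concretely, the sampling model can be sketched as follows. This is a minimal illustration only; the two-state chain and the choice $f_i(y) = 1 - 2y$ below are our own hypothetical example, not taken from the paper.

```python
import random

def sample_epsilons(P, mu, fs, rng):
    """Sample eps_i = f_i(Y_i), where (Y_i) is a Markov chain with
    row-stochastic transition matrix P, started from the distribution mu.
    fs is a list of functions from states {0, ..., N-1} to {-1, +1}."""
    def draw(dist):
        r, acc = rng.random(), 0.0
        for state, p in enumerate(dist):
            acc += p
            if r < acc:
                return state
        return len(dist) - 1
    y = draw(mu)                    # Y_1 ~ mu (stationary start)
    eps = []
    for f in fs:
        eps.append(f(y))            # eps_i = f_i(Y_i)
        y = draw(P[y])              # advance the chain one step
    return eps

# Hypothetical two-state example: f_i maps state 0 -> +1 and state 1 -> -1.
P = [[0.75, 0.25], [0.25, 0.75]]
mu = [0.5, 0.5]                     # stationary for this symmetric chain
fs = [lambda y: 1 - 2 * y] * 20
eps = sample_epsilons(P, mu, fs, random.Random(0))
print(eps)
```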
Let $\mu$ be the stationary distribution of the Markov chain, and let $E_\mu$ be the associated averaging operator defined by $(E_\mu)_{ij} = \mu_j$, so that for $v \in \mathbb{R}^N$, $E_\mu v = \mathbb{E}_\mu[v]\mathbf{1}$, where $\mathbf{1}$ is the vector whose entries are all 1. Like many results on Markov chains, our generalizations will be in terms of the quantity $\lambda = \|A - E_\mu\|_{L_2(\mu) \to L_2(\mu)}$. If the $Y_i$ are independent, that is, $A = E_\mu$, then $\lambda = 0$. Often, if $\lambda$ is small, the corresponding Markov chain behaves almost as if its steps were independent. In particular, there exists a Berry-Esseen theorem for Markov chains [13] and various concentration inequalities for Markov chains [4,10,9]. In all of these cases, the bounds contain an extra factor in terms of $\lambda$ which disappears when $\lambda = 0$.
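For a chain that is reversible with respect to $\mu$, the matrix $D^{1/2} A D^{-1/2}$ with $D = \operatorname{diag}(\mu)$ is symmetric, so $\lambda$ is simply the largest absolute eigenvalue of $A$ other than the Perron eigenvalue 1. A numpy sketch (the two-state chain below is our own test case, whose nontrivial eigenvalue is $1 - a - b$):

```python
import numpy as np

def lam(A, mu):
    """lambda = ||A - E_mu||_{L2(mu)->L2(mu)} for a chain reversible
    with respect to mu: the second-largest absolute eigenvalue of A."""
    d = np.sqrt(mu)
    S = (d[:, None] * A) / d[None, :]      # symmetric matrix D^{1/2} A D^{-1/2}
    ev = np.sort(np.abs(np.linalg.eigvalsh(S)))
    return ev[-2]                          # drop the Perron eigenvalue 1

# Two-state chain: eigenvalues are 1 and 1 - a - b, so lambda = |1 - a - b|.
a, b = 0.3, 0.2
A = np.array([[1 - a, a], [b, 1 - b]])
mu = np.array([b / (a + b), a / (a + b)])  # stationary distribution
print(lam(A, mu))                          # 0.5 for these parameters
```

When $A = E_\mu$ (independent steps), the same function returns 0, matching the remark above.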
We show that the Littlewood-Offord problem can also be generalized to Markov chains with an extra dependence on $\lambda$, in all dimensions. We additionally consider the one-dimensional case in which the scalars are distinct integers. In all cases, the proof is based on a Fourier-analytic argument due to Halász [5].
The random variables in all cases are defined in the same way, which we state below.
We obtain the following theorem, which upper-bounds the probability that the random sum lands in any given unit ball. In the case that the $v_i$ are one-dimensional, the bound is tight up to a factor of $(1-\lambda)/(1+\lambda)$ in $\lambda$. Note that our bound depends on the dimension, whereas in the independent case there is no such dependence.
for some universal constant C.
In the one-dimensional case, we also consider the restriction that v 1 , . . . , v n are distinct integers.
for some universal constant C > 0.
Finally, we consider a different setting, where rather than choosing $\epsilon_1, \ldots, \epsilon_n$ independently, we choose them uniformly at random from a subset $D$ of $\{-1, 1\}^n$ that we can construct explicitly. In Theorem 1.4 we construct such a set $D$ of cardinality $2^{C_1 \sqrt{n}}$ for some universal constant $C_1 > 0$ such that the following holds: for every $v_1, \ldots, v_n \geq 1$ and $x_0 \in \mathbb{R}$, and $\epsilon$ chosen uniformly at random from $D$, the probability that $\epsilon_1 v_1 + \cdots + \epsilon_n v_n$ lies within distance 1 of $x_0$ is at most $C/\sqrt{n}$ for some universal constant $C > 0$ independent of $n$.
One interpretation of Theorem 1.4 is that one can obtain results similar to those for the Littlewood-Offord problem in one dimension using much less randomness, and in particular, using $C_1 \sqrt{n}$ bits of randomness rather than $n$.
This setting was also considered in [7], in which the authors construct an explicit set of cardinality $n 2^{n^c}$ for any constant $c < 1$, such that a random sample from the set satisfies a corresponding anti-concentration bound. Sampling from the set in Theorem 1.4 guarantees a stronger bound on the probability that the sum lands in any interval, while requiring more randomness when $c < 1/2$.

Future work
It would be interesting to remove the dependence on the dimension in Theorem 1.2, which does not appear in the tightest bounds for independent random variables.
It would also be interesting to improve Theorem 1.4 by constructing explicit sets of cardinality smaller than $2^{C_1 \sqrt{n}}$ that achieve similar properties.

Preliminaries
Given vectors $v, \mu \in \mathbb{R}^N$ (typically $\mu$ will be a distribution over $[N]$), we define the $L_p(\mu)$ norm by $\|v\|_{L_p(\mu)} = \big(\sum_{i=1}^N \mu_i |v_i|^p\big)^{1/p}$. We define the inner product of two vectors $u, v \in \mathbb{R}^N$ with respect to $\mu \in \mathbb{R}^N$ with positive entries to be $\langle u, v \rangle_{L_2(\mu)} = \sum_{i=1}^N \mu_i u_i v_i$. Finally, we will write $\|\cdot\|_p$ in place of $\|\cdot\|_{L_p(\mu)}$ when $\mu$ is the vector whose entries are all 1; note that in this case $\mu$ is not a distribution.
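These weighted norms are straightforward to implement. The sketch below is our own illustration, assuming the standard definitions $\|v\|_{L_p(\mu)} = (\sum_i \mu_i |v_i|^p)^{1/p}$ and $\langle u, v \rangle_{L_2(\mu)} = \sum_i \mu_i u_i v_i$; it checks, for instance, that $\|\mathbf{1}\|_{L_2(\mu)} = 1$ when $\mu$ is a distribution.

```python
import numpy as np

def lp_norm(v, mu, p):
    """||v||_{L_p(mu)} = (sum_i mu_i |v_i|^p)^(1/p)."""
    return (mu * np.abs(v) ** p).sum() ** (1 / p)

def inner(u, v, mu):
    """<u, v>_{L_2(mu)} = sum_i mu_i u_i conj(v_i)."""
    return (mu * u * np.conj(v)).sum()

mu = np.array([0.2, 0.3, 0.5])      # a distribution on [3]
v = np.array([1.0, -2.0, 4.0])
ones = np.ones(3)
print(lp_norm(ones, mu, 2))          # 1.0: the all-ones vector has unit norm
print(inner(v, ones, mu))            # E_mu[v] = 0.2 - 0.6 + 2.0 = 1.6
```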
Note that $E_\mu$ is also stochastic and reversible with respect to $\mu$.

Theorem 2.1. Let $v_1, \ldots, v_n \in \mathbb{R}$ be non-zero, and let $\epsilon_1, \ldots, \epsilon_n$ be independent random variables uniform over the set $\{-1, 1\}$. Then for all $x_0 \in \mathbb{R}$, $\Pr[\epsilon_1 v_1 + \cdots + \epsilon_n v_n = x_0] \leq C/\sqrt{n}$ for some constant $C$ independent of $n$.
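Erdős' bound can be verified exhaustively for small $n$: when all $|v_i| \geq 1$, no open interval of length 2 captures more than $\binom{n}{\lfloor n/2 \rfloor}$ of the $2^n$ sign patterns. A sketch (the weights below are our own arbitrary choice):

```python
from itertools import product
from math import comb

n = 10
v = [1 + i / 10 for i in range(n)]    # arbitrary test weights with |v_i| >= 1
sums = [sum(e * x for e, x in zip(eps, v)) for eps in product((-1, 1), repeat=n)]
# Largest number of sign patterns in any open interval of length 2
# centered at one of the attained sums.
worst = max(sum(1 for s in sums if abs(s - x0) < 1) for x0 in sums)
print(worst, comb(n, n // 2))          # worst count is at most C(10, 5) = 252
```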

The Littlewood-Offord problem for random variables from a Markov chain
In this section, we consider the case that ε 1 , . . . , ε n are obtained from a Markov chain.
The proof follows very closely the proof for independent random variables in Proposition 7.18 of [19], which is itself due to Halász [5]. We start by presenting the following concentration inequality due to Esséen [2], which will allow us to upper-bound the relevant probabilities. This inequality is in the spirit of Fourier inversion, but written in a way that can be more readily applied for our purposes.

Theorem 3.1 (Esséen concentration inequality). Let $X \in \mathbb{R}^d$ be a random variable taking a finite number of values, and let $R, \epsilon > 0$.

The following bound, which holds for some constant $C$, is implicit in the proof of Proposition 7.18 of [19] and will be used to further bound the quantities obtained from Theorem 3.1.
In order to handle the extra dependencies from the Markov chain, we will use the following technical lemma, which is a straightforward adaptation of Lemma 2.3 from [15]. We include a proof in Appendix A.
When applying this lemma, we will choose $T_j$ so that $A = T_j + (1 - \lambda)E_\mu$, and thus the left-hand side of Eq. (3.1) is an upper bound on the expected value of a product of random variables obtained from a Markov chain. Thus, Lemma 3.3 can be used in place of the bounds on the expected value of a product of independent random variables that appear in proofs for the Littlewood-Offord problem.
Before proving Theorem 1.2, we first prove the following claim, which will allow us to upper-bound negative moments of binomial random variables.

Claim 3.4. Let $X = B(n, p)$ be a binomial random variable with $n$ trials, each with success probability $p > 0$. Then for all positive integers $d$,

Proof: Note that because $d(i + 1) \geq i + d$ for all non-negative $i$, the right-hand side is bounded above by $d^d$ times the corresponding expectation. The claim follows by noting that $n \leq n + i$ for $1 \leq i \leq d$. □

We start by considering the case of 1-dimensional vectors, or scalars. We also allow up to half of the $v_i$ to have length less than 1; this will let us generalize to higher dimensions. We note that in the case of independent random variables, the corresponding statement follows from the usual Littlewood-Offord bound by conditioning on the $\epsilon_i$ for which $|v_i| < 1$, at the cost of only an increase in the constant factor.

Lemma 3.5. Assume the setting of 1.1. Then for every $v_1, \ldots, v_n \in \mathbb{R}$ such that $|\{i : |v_i| \geq 1\}| \geq n/2$ and every $x_0 \in \mathbb{R}$,

for some universal constant $C$.
Proof: By Theorem 3.1, the probability in question can be bounded by an integral of the characteristic function, for some constant $C_1$. Let $u_j$ be the vector defined by $u_j(y) = \exp(2\pi i \xi f_j(y) v_j)$ for $y \in [N]$, and let $U_j = \operatorname{diag}(u_j)$. Then the integrand can be bounded as follows, where the inequality follows by Lemma 3.3 and by evaluating $|\langle u, \mathbf{1} \rangle_{L_2(\mu)}|$.
Let $t'(s)$ be the set of indices $j \in t(s)$ such that $|v_j|$ is greater than 1. When $|t'(s)| = 0$, the corresponding product can be bounded above by 1. When $|t'(s)| > 0$, we can apply Claim 3.2. Thus, the right-hand side of Eq. (3.2) can be bounded above by the expression in Eq. (3.4). By the definition of $r$ and $s$, Eq. (3.4) is bounded above by Eq. (3.5). We conclude with the following argument. Let $r = B(\lceil n/4 \rceil - 1, (1 - \lambda)^2) + 1$, where $B(n, p)$ denotes a binomial random variable with $n$ trials, each with success probability $p$. It follows that $r$ is stochastically dominated by $r(s) + 1$, and thus the expectation in Eq. (3.5) can be bounded using $r$, where the second inequality follows by Jensen's inequality. Finally, by Claim 3.4, the right-hand side of Eq. (3.5) is bounded above by $C/\big((1 - \lambda)\sqrt{\lceil n/4 \rceil - 1}\big)$, as desired.

□
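As a sanity check on the type of bound Claim 3.4 provides, the first negative moment of a binomial random variable has an exact closed form exhibiting the $1/(np)$-type decay used above: $\mathbb{E}[1/(X+1)] = \big(1 - (1-p)^{n+1}\big)/\big((n+1)p\big) \leq 1/\big((n+1)p\big)$. A quick verification (the specific parameters are our own illustration):

```python
from math import comb

def neg_moment(n, p):
    """E[1/(X+1)] for X ~ B(n, p), computed by direct summation."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) / (k + 1)
               for k in range(n + 1))

# Closed form: E[1/(X+1)] = (1 - (1-p)^(n+1)) / ((n+1) p) <= 1 / ((n+1) p).
n, p = 50, 0.36      # e.g. p = (1 - lambda)^2 with lambda = 0.4
exact = neg_moment(n, p)
closed = (1 - (1 - p)**(n + 1)) / ((n + 1) * p)
print(exact, closed)
```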
Before proving Theorem 1.2, we prove the following bound on random unit vectors.

Claim 3.6. Let $v \in \mathbb{R}^d$ be a random unit vector, uniform over the $(d-1)$-dimensional sphere. Then there exists a constant $C$ such that the following holds, where $v(1)$ denotes the first coordinate of $v$.
Proof: We start by noting that the probability density function of $v(1)$ at $t$ is proportional to $(1 - t^2)^{(d-3)/2}$, which is also the probability density of a beta distribution, shifted so that its domain is $[-1, 1]$. The probability density function at all points is bounded above, for some constants $C_1$ and $C_2$, where the inequality follows from Stirling's approximation (see [6]). The claim follows by letting $C = C_2/4$. □

We now use Lemma 3.5 to prove Theorem 1.2.
Proof of Theorem 1.2: Let $M \in SO(d)$ be a random rotation, uniform over the Haar measure of the special orthogonal group. Then it is enough to consider the rotated random vectors. Additionally, the left-hand side in the statement of the theorem is bounded above by the corresponding one-dimensional probability; this is because if the absolute value of the first coordinate of the random vector is greater than $R$, then so is its Euclidean norm. By Claim 3.6, the first coordinate has magnitude at least $C/\sqrt{d}$ for at least half the $i$ in expectation. By Lemma 3.5, we obtain the desired bound in terms of $\sqrt{n}$. □

Regarding the tightness of the bound: a Markov chain of this type can be interpreted as first choosing a state at random, and then at each subsequent step either choosing a new state uniformly at random, with probability $1 - \lambda$, or switching states, with probability $\lambda$. We can associate with this walk a sequence of numbers $(X_1, X_2, \ldots)$, obtained as follows: whenever a state is chosen at random, we add a new entry to the sequence starting at 1, and we increase this entry every time the state is switched. Then, conditioned on this sequence, $f(Y_1) + f(Y_2) + \cdots + f(Y_n)$ is distributed as $\epsilon_1 + \epsilon_2 + \cdots + \epsilon_{n'}$, where $n'$ is the number of entries in the sequence of odd length and the $\epsilon_i$ are independent and uniform over $\{-1, 1\}$. Thus $n'$ can itself be considered a random variable. If we assume that $n$ is large, then each step of the walk begins a new entry with probability $1 - \lambda$, and an entry eventually has odd length with probability approximately $1/(1 + \lambda)$; thus $n'$ is approximately distributed as $B(n, (1 - \lambda)/(1 + \lambda))$.
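The resample/switch heuristic above is easy to test by simulation. A sketch (the parameter choices are our own):

```python
import random

def odd_entries(n, lam, rng):
    """Run n steps of the resample/switch walk: each step after the first
    resamples (starting a new entry of length 1) with probability 1 - lam,
    and otherwise switches (growing the current entry). Returns the number
    of entries of odd length."""
    lengths = [1]
    for _ in range(n - 1):
        if rng.random() < 1 - lam:
            lengths.append(1)      # resample: a fresh entry begins
        else:
            lengths[-1] += 1       # switch: the current entry grows
    return sum(1 for L in lengths if L % 2 == 1)

rng = random.Random(0)
lam, n, trials = 0.5, 4000, 50
avg = sum(odd_entries(n, lam, rng) for _ in range(trials)) / trials
print(avg / n)   # close to (1 - lam) / (1 + lam) = 1/3 for lam = 0.5
```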

Extension to distinct v i 's
ECP 26 (2021), paper 47.

Theorem 2.1, the bound obtained in the independent case, is tight when $v_1 = \cdots = v_n = 1$. It is reasonable to ask whether one can obtain better bounds on the probability $\Pr[\epsilon_1 v_1 + \cdots + \epsilon_n v_n \in B]$ under certain restrictions on $v_1, \ldots, v_n$. In particular, when the $v_i$ are distinct integers, Sárközy and Szemerédi [18] showed that for all $x_0$ and for some constant $C$, $\Pr[\epsilon_1 v_1 + \cdots + \epsilon_n v_n = x_0] \leq C/n^{3/2}$, which is a factor of $n$ smaller than Theorem 2.1.
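For a small exhaustive example, distinct integer weights already push the largest atom probability well below the equal-weights extremal value:

```python
from itertools import product
from math import comb
from collections import Counter

# Count, for each attainable value, how many sign patterns hit it when
# the weights are the distinct integers 1, ..., n.
n = 10
counts = Counter(sum(e * v for e, v in zip(eps, range(1, n + 1)))
                 for eps in product((-1, 1), repeat=n))
worst_distinct = max(counts.values())
worst_equal = comb(n, n // 2)      # extremal atom count when all v_i are equal
print(worst_distinct, worst_equal)
```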
Like Erdős' proof of Theorem 2.1, Sárközy and Szemerédi's proof of the above uses a clever combinatorial argument. However, Halász's Fourier-analytic argument can also be used to prove it; in particular, the techniques used in [19] for the same problem can be applied. Here, the Fourier-analytic argument is over the group $\mathbb{Z}_p$ for some large enough $p$, rather than over the integers or over the real numbers. The following claim is implicit in Corollary 7.16 of [19].

A pseudorandom generator for the Littlewood-Offord problem
In this section we prove Theorem 1.4. As stated in the introduction, this theorem can be interpreted as proving the existence of a pseudorandom generator for the Littlewood-Offord problem.
We start by describing the construction of $D$. Our construction will be based on expander graphs, which we define as follows. Given a $d$-regular graph $G = (V, E)$, let $A$ be the normalized adjacency matrix of $G$ and let $J$ be the matrix whose entries are all $1/|V|$. We say that a family of $d$-regular graphs $\mathcal{G}$ is a family of expanders if for all graphs $G$ in the family, $\|A - J\|_{L_2(\mu) \to L_2(\mu)} \leq \lambda$ for some constant $\lambda$ bounded away from 1, where $\mu$ is the vector whose entries are all $1/|V|$. Note that when $G = (V, E)$ is $d$-regular, the stationary distribution is $\mu$, and the averaging operator is $J$. Thus, $1 - \|A - J\|_{L_2(\mu) \to L_2(\mu)}$ is also the spectral gap of the Markov chain given by a simple random walk on $G$. It is well known that there exist infinite families of expander graphs of constant degree $d$ independent of the number of vertices (see, for example, [12] and [14]).
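As a concrete single-graph example (not one of the families cited above), the Petersen graph is 3-regular with adjacency eigenvalues $3, 1, -2$, so its normalized expansion parameter is $\|A - J\|_{L_2(\mu) \to L_2(\mu)} = 2/3$:

```python
import numpy as np
from itertools import combinations

# The Petersen graph: vertices are the 2-subsets of {0,...,4},
# with edges joining disjoint pairs (the Kneser graph K(5, 2)).
verts = list(combinations(range(5), 2))
A = np.array([[1.0 if set(u).isdisjoint(v) else 0.0 for v in verts]
              for u in verts]) / 3.0          # normalized adjacency matrix
J = np.full((10, 10), 1 / 10)
lam = np.abs(np.linalg.eigvalsh(A - J)).max()
print(lam)   # adjacency eigenvalues 3, 1, -2 give lam = 2/3
```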
Let $G = (\{-1, 1\}^k, E)$ be a $d$-regular graph from such a family for which $\|A - J\|_{L_2(\mu) \to L_2(\mu)} \leq \lambda$ for some constant $\lambda$ independent of $k$. We let our set $D$ be the set of concatenations of the labels of walks of length $n/k$ on $G$; thus $D$ has cardinality $2^{k + C_1 n/k}$ for some constant $C_1$ independent of $n$ and $k$.
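A toy version of the construction illustrates the counting $|D| = 2^k \cdot d^{n/k - 1}$. Here $K_4$ stands in for the graph $G$ (our own simplification; complete graphs do not form a constant-degree family as $|V|$ grows, so the real construction uses an expander family instead):

```python
from itertools import product

k = 2
verts = list(product((-1, 1), repeat=k))                  # vertex set {-1,1}^k
nbrs = {v: [u for u in verts if u != v] for v in verts}   # K_4: 3-regular
D = []
for v0 in verts:                                # walks visiting n/k = 3 vertices
    for choice in product(range(3), repeat=2):  # 2 edge labels in [d], d = 3
        walk, v = [v0], v0
        for i in choice:
            v = nbrs[v][i]
            walk.append(v)
        D.append(sum(walk, ()))                 # concatenate the vertex labels
print(len(D), len(D[0]))   # 2^k * d^(n/k - 1) = 4 * 9 = 36 strings of length 6
```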
The expression $\langle u_j, \mathbf{1} \rangle_{L_2(\mu)}$ can be rewritten by noting that it is the Fourier transform at $\xi$ of the random variable $w_{(j-1)k+1} v_{(j-1)k+1} + \cdots + w_{jk} v_{jk}$, where each coordinate of $w$ is uniformly random over the set $\{-1, 1\}$. This brings us back to the original setting of completely independent random variables, and thus the corresponding bound for independent random variables applies. By inserting this into Eq. (5.2), we obtain an upper bound on the right-hand side of Eq. (5.1) of $\frac{1}{2\pi}$ times the resulting integral, where the inequality follows from Claim 3.2. We proceed by using the same argument as in Lemma 3.5, starting from Eq. (3.4), which gives an upper bound of $C/\sqrt{k \cdot (n/k)} = C/\sqrt{n}$, as desired. Finally, we obtain a construction of the desired size by letting $k = \sqrt{n}$. □

A Proof of Lemma 3.3
We prove Lemma 3.3, which as mentioned previously, is a straightforward adaptation of Lemma 2.3 from [15]. Before getting to the proof, we first state the following two claims.
Claim A.1. For all $k \geq 1$, matrices $R_1, \ldots, R_k \in \mathbb{R}^{N \times N}$, and distributions $\mu$ over $[N]$,

Proof: Notice that for any vector $v$, $E_\mu v = \mathbb{E}_\mu[v]\mathbf{1}$. The claim follows by noting that $\mathbb{E}_\mu[v] \leq \|v\|_{L_1(\mu)}$ and by induction. □

Proof: By Jensen's inequality, the right-hand side is bounded above as shown, and the claim follows by the definition of the operator norm and the fact that $\|U_i\|_{L_2(\mu) \to L_2(\mu)} = \|u_i\|_{L_\infty(\mu)}$. □

Claim A.3. Let $\mu \in \mathbb{R}^N$ be a distribution, and let $E_\mu$ be the associated averaging operator. Then for any $u \in \mathbb{C}^N$, $E_\mu \operatorname{diag}(u) E_\mu = \langle u, \mathbf{1} \rangle_{L_2(\mu)} E_\mu$.

Proof: $(E_\mu \operatorname{diag}(u) E_\mu)_{ij} = \sum_k (E_\mu)_{ik} u_k (E_\mu)_{kj} = \sum_k \mu_k u_k \mu_j = \langle u, \mathbf{1} \rangle_{L_2(\mu)} (E_\mu)_{ij}$. □
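Claim A.3 is also easy to verify numerically (a minimal sketch with random $\mu$ and complex $u$):

```python
import numpy as np

# Numerical check of Claim A.3: E_mu diag(u) E_mu = <u, 1>_{L2(mu)} E_mu.
rng = np.random.default_rng(0)
N = 5
mu = rng.random(N)
mu /= mu.sum()                       # a random distribution on [N]
E = np.tile(mu, (N, 1))              # averaging operator: (E_mu)_{ij} = mu_j
u = rng.random(N) + 1j * rng.random(N)
lhs = E @ np.diag(u) @ E
rhs = (mu @ u) * E                   # <u, 1>_{L2(mu)} = sum_k mu_k u_k
print(np.max(np.abs(lhs - rhs)))     # ~ 0 up to floating point
```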