Sharpness of KKL on Schreier graphs

Recently, the Kahn-Kalai-Linial (KKL) Theorem on influences of functions on $\{0,1\}^n$ was extended to the setting of functions on Schreier graphs. Specifically, it was shown that for an undirected Schreier graph $\text{Sch}(G,X,U)$ with log-Sobolev constant $\rho$ and generating set $U$ closed under conjugation, if $f : X \to \{0,1\}$ then $$\mathcal{E}[f] \gtrsim \log(1/\text{MaxInf}[f]) \cdot \rho \cdot {\bf Var}[f].$$ Here $\mathcal{E}[f]$ denotes the average of $f$'s influences, and $\text{MaxInf}[f]$ denotes their maximum. In this work we investigate the extent to which this result is sharp. We show:
1. The condition that $U$ is closed under conjugation cannot in general be eliminated.
2. The log-Sobolev constant cannot be replaced by the modified log-Sobolev constant.
3. The result cannot be improved for the Cayley graph on $S_n$ with transpositions.
4. The result can be improved for the Cayley graph on $\mathbb{Z}_m^n$ with standard generators.
5. Talagrand's strengthened version of KKL also holds in the Schreier graph setting: $$\mathrm{avg}_{u \in U} \{\mathrm{Inf}_u[f]/\log(1/\mathrm{Inf}_u[f]) \} \gtrsim \rho \cdot {\bf Var}[f].$$

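Definition 1. For $f : \{0,1\}^n \to \mathbb{R}$ we define: $$\mathbf{Var}[f] = \mathbf{E}_x[f(x)^2] - \mathbf{E}_x[f(x)]^2, \qquad I_i[f] = \tfrac{1}{2}\,\mathbf{E}_x\big[(f(x) - f(x \oplus e_i))^2\big],$$ $$\mathcal{E}[f] = \mathop{\mathrm{avg}}_{i \in [n]} I_i[f], \qquad \mathrm{MaxInf}[f] = \max_{i \in [n]} I_i[f].$$ Here the random variable $x$ is always uniformly distributed on $\{0,1\}^n$, and $x \oplus e_i$ denotes $x$ with its $i$th coordinate flipped. We may now state the KKL Theorem:
KKL Theorem 1. For all $f : \{0,1\}^n \to \{0,1\}$, $$\mathrm{MaxInf}[f] \gtrsim \frac{\log n}{n} \cdot \mathbf{Var}[f].$$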

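(Throughout this paper, $A \gtrsim B$ means that $A \geq cB$ for some absolute constant $c > 0$, and $\log$ denotes the natural logarithm.) The KKL Theorem should be compared to the much easier "Poincaré Inequality" for $\{0,1\}^n$: $$\mathcal{E}[f] \geq \frac{2}{n} \cdot \mathbf{Var}[f] \quad \text{for all } f : \{0,1\}^n \to \mathbb{R}.$$ Note that the Poincaré Inequality may be sharp even for functions $f : \{0,1\}^n \to \{0,1\}$; e.g., when $f(x) = x_i$ for some $i \in [n]$.
The proof method used in [KKL88] can easily be extended to give the following sharper result, first stated by Talagrand [Tal94]:
KKL Theorem 2. For all $f : \{0,1\}^n \to \{0,1\}$, $$\mathcal{E}[f] \gtrsim \log(1/\mathrm{MaxInf}[f]) \cdot \frac{1}{n} \cdot \mathbf{Var}[f].$$
Indeed, Talagrand [Tal94] gave an even sharper version:
Talagrand Theorem. For all $f : \{0,1\}^n \to \{0,1\}$, $$\mathop{\mathrm{avg}}_{i \in [n]} \left\{ \frac{I_i[f]}{\log(1/I_i[f])} \right\} \gtrsim \frac{1}{n} \cdot \mathbf{Var}[f].$$
To see that this is strictly sharper, note that it rules out a function $f$ having $\mathbf{Var}[f] \gtrsim 1$, $I_1[f] = 1/\log n$, and $I_i[f] = (\log \log n)/n$ for $i \geq 2$, something that isn't ruled out by KKL Theorem 2.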
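The comparison between KKL Theorem 2 and Talagrand's sharper version can be checked numerically on the influence profile just described. The sketch below is only illustrative. It assumes $\mathbf{Var}[f]$ is of constant order, and it assumes the two inequalities take the normalized hypercube forms $n \cdot \mathcal{E}[f] \gtrsim \log(1/\mathrm{MaxInf}[f])$ and $n \cdot \mathrm{avg}_i \{I_i[f]/\log(1/I_i[f])\} \gtrsim 1$, i.e., the abstract's inequalities with $\rho$ of order $1/n$. The function names are hypothetical.

```python
import math

def kkl2_ratio(n):
    """n * E[f] / log(1/MaxInf[f]) for the hypothetical influence profile."""
    i1 = 1.0 / math.log(n)                  # I_1[f] = 1/log n
    rest = math.log(math.log(n)) / n        # I_i[f] = (log log n)/n for i >= 2
    energy = (i1 + (n - 1) * rest) / n      # average of the n influences
    max_inf = max(i1, rest)
    return n * energy / math.log(1.0 / max_inf)

def talagrand_ratio(n):
    """n * avg_i { I_i / log(1/I_i) } for the same profile."""
    i1 = 1.0 / math.log(n)
    rest = math.log(math.log(n)) / n
    # The factor n cancels the 1/n from the average, leaving the plain sum.
    return i1 / math.log(1.0 / i1) + (n - 1) * rest / math.log(1.0 / rest)

for n in (10**3, 10**6, 10**12):
    print(n, round(kkl2_ratio(n), 3), round(talagrand_ratio(n), 3))
```

The first ratio stays at or above 1 for every $n$ tried, so the profile is compatible with KKL Theorem 2, while the second ratio decays roughly like $(\log\log n)/\log n$, so the profile eventually violates Talagrand's inequality for any fixed constant.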

KKL on Schreier graphs
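In a recent work [OW09], the authors generalized KKL Theorem 2 to the setting of functions on Schreier graphs. Let us recall this setting. Let $G$ be a finite group acting transitively on a finite set $X$. Let $U \subseteq G$ be a generating set which is symmetric; i.e., closed under inverses. The associated Schreier graph $\mathrm{Sch}(G, X, U)$ is the undirected graph with vertex set $X$ and edges $(x, x^u)$ for each $x \in X$, $u \in U$, where $x^u$ denotes the action of $u$ on $x$. In the special case that $G$ acts on $X = G$ by $x^u = xu$, the Schreier graph is simply the Cayley graph $\mathrm{Cay}(G, U)$. This is the case in the original KKL Theorem setting, where $G = X = \mathbb{Z}_2^n$ and $U$ is the standard generating set $U = \{e_i : i \in [n]\}$.
There is a natural random walk on a Schreier graph $\mathrm{Sch}(G, X, U)$. Let $K$ denote the Markov transition matrix for this walk and write $L = \mathrm{id} - K$ for the normalized Laplacian. Since $\mathrm{Sch}(G, X, U)$ is undirected, regular, and connected, the random walk has a unique invariant probability measure: the uniform distribution on $X$. We denote this measure by $\pi$, and we write $\ell^2(\pi)$ for the inner product space of functions $f : X \to \mathbb{R}$ with inner product $\langle f, g \rangle = \mathbf{E}_{x \sim \pi}[f(x)g(x)]$. Thinking of $K$ as an operator on $\ell^2(\pi)$, we have $K = \mathrm{avg}_{u \in U} K_u$, where $K_u$ is the operator defined by $K_u f(x) = f(x^u)$. Similarly, $L = \mathrm{avg}_{u \in U} L_u$, where we define the operator $L_u = \mathrm{id} - K_u$. We have the following simple facts:
Proposition 1. For each $u \in U$, the adjoint operators satisfy $K_u^* = K_{u^{-1}}$ and $L_u^* = L_{u^{-1}}$.
Proof. This holds because for each fixed $u \in U$, the pairs $(x, x^u)$ and $(x^{u^{-1}}, x)$ have the same distribution when $x \sim \pi$.
Proposition 2. For all $f, g \in \ell^2(\pi)$ and $u \in U$, $$\mathbf{E}_{x \sim \pi}\big[(f(x) - f(x^u))(g(x) - g(x^u))\big] = \langle f, L_u g \rangle + \langle f, L_u^* g \rangle.$$
Proof. Expanding the product, the left-hand side equals $$\big(\mathbf{E}[f(x)g(x)] - \mathbf{E}[f(x)g(x^u)]\big) + \big(\mathbf{E}[f(x^u)g(x^u)] - \mathbf{E}[f(x^u)g(x)]\big).$$ The first quantity in parentheses is precisely $\langle f, L_u g \rangle$. As for the second quantity, since $x^u$ is uniformly distributed when $x$ is, we have $$\mathbf{E}[f(x^u)g(x^u)] = \langle f, g \rangle, \qquad \mathbf{E}[f(x^u)g(x)] = \langle K_u f, g \rangle = \langle f, K_u^* g \rangle.$$ Hence the second quantity in parentheses is indeed $\langle f, L_u^* g \rangle$ as required.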
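We may now define "influences" in the Schreier graph setting:
Definition 2. The influence of generator $u \in U$ on $f \in \ell^2(\pi)$ is defined to be $$I_u[f] = \langle f, L_u f \rangle = \tfrac{1}{2}\,\mathbf{E}_{x \sim \pi}\big[(f(x) - f(x^u))^2\big],$$ where the second equality is by Proposition 2. We also define $$\mathcal{E}[f] = \mathop{\mathrm{avg}}_{u \in U} I_u[f] = \langle f, L f \rangle,$$ which is sometimes called the "energy" of $f$.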
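The two operator facts used in this definition can be sanity-checked on a toy example. The sketch below is an illustration under stated assumptions, not part of the paper: it takes the hypothetical Cayley graph $G = X = \mathbb{Z}_7$ with $x^u = x + u$ and generator $u = 1$, picks two arbitrary functions, and checks exactly (with rational arithmetic) that $\langle f, K_u g \rangle = \langle K_{u^{-1}} f, g \rangle$ and that $\langle f, L_u f \rangle = \tfrac{1}{2}\mathbf{E}[(f(x) - f(x^u))^2]$.

```python
from fractions import Fraction

m = 7  # hypothetical example: G = X = Z_7, acting on itself by addition
f = [Fraction(v) for v in (3, -1, 4, 1, -5, 9, 2)]   # arbitrary real-valued f
g = [Fraction(v) for v in (2, 7, 1, 8, 2, 8, 1)]     # arbitrary real-valued g

def inner(a, b):
    """<a, b> = E_{x ~ uniform}[a(x) b(x)]."""
    return sum(x * y for x, y in zip(a, b)) / m

def K(u, a):
    """(K_u a)(x) = a(x^u), where x^u = x + u mod m."""
    return [a[(x + u) % m] for x in range(m)]

def L(u, a):
    """L_u = id - K_u."""
    shifted = K(u, a)
    return [a[x] - shifted[x] for x in range(m)]

u = 1  # a generator that is not its own inverse
adjoint_ok = inner(f, K(u, g)) == inner(K(-u, f), g)
influence_inner = inner(f, L(u, f))
influence_diff = sum((f[x] - f[(x + u) % m]) ** 2 for x in range(m)) / (2 * m)
print(adjoint_ok, influence_inner, influence_diff)
```

Choosing a generator with $u \neq u^{-1}$ matters here: $L_u$ is then not self-adjoint, yet the quadratic form $\langle f, L_u f \rangle$ still equals the symmetric expression.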
These definitions agree with those in the original KKL Theorem setting where $G = X = \mathbb{Z}_2^n$ and $U = \{e_i : i \in [n]\}$.
To state the KKL Theorem in Schreier graphs we must also recall the "log-Sobolev inequality" for Markov chains. For a nonnegative function $f \in \ell^2(\pi)$, the entropy of $f$ is defined to be $$\mathbf{Ent}[f] = \mathbf{E}_{x \sim \pi}[f(x) \log f(x)] - \mathbf{E}_{x \sim \pi}[f(x)] \cdot \log \mathbf{E}_{x \sim \pi}[f(x)],$$ with $0 \log 0$ defined to be $0$. The log-Sobolev constant for the Markov chain on $\mathrm{Sch}(G, X, U)$ is defined to be the largest constant $\rho$ such that the following inequality holds for all $f \in \ell^2(\pi)$: $$\rho \cdot \mathbf{Ent}[f^2] \leq 2\,\mathcal{E}[f].$$ This notion was introduced by Gross [Gro75] who showed the following: in the original KKL setting $\mathrm{Cay}(\mathbb{Z}_2^n, \{e_i : i \in [n]\})$, the log-Sobolev constant is $\rho = 2/n$.
We may now state the authors' generalization [OW09] of KKL Theorem 2 to Schreier graphs:
Theorem 1. Let $\mathrm{Sch}(G, X, U)$ be a Schreier graph with log-Sobolev constant $\rho$. Assume that $U$ is closed under conjugation. Then for all $f : X \to \{0,1\}$ it holds that $$\mathcal{E}[f] \gtrsim \log(1/\mathrm{MaxInf}[f]) \cdot \rho \cdot \mathbf{Var}[f].$$
The motivation for this theorem was the setting where $X$ is the set of length-$n$ binary strings of Hamming weight $k$, $G = S_n$ acts on $X$ by permuting coordinates, and $U$ is the set of all transpositions. The log-Sobolev constant for this Schreier graph is known [LY98] to be $\rho = \Theta(\frac{1}{n})$ assuming $\frac{k}{n}$ is bounded away from $0$ and $1$. Using the resulting KKL Theorem, the authors were able to give a "robust" version of the classical Kruskal-Katona theorem, as well as an optimal weak-learning algorithm for the class of monotone Boolean functions.

Our results on sharpness of the inequality
In this paper we address several natural questions one might ask regarding the sharpness of Theorem 1. The most obvious question is whether the condition that $U$ be closed under conjugation is really necessary. Although originally inclined to believe it is not, we show here the following:
Theorem 2. The assumption that $U$ is closed under conjugation cannot in general be removed from Theorem 1, even for Cayley graphs.
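There are also natural cases where Theorem 1 does not give a strong result because the log-Sobolev constant is too small. One such example is the Cayley graph on $S_n$ with generating set given by transpositions. In this case the log-Sobolev constant is known [DSC96,LY98] to be $\rho = \Theta(\frac{1}{n \log n})$, so Theorem 1 gives only $$\mathcal{E}[f] \gtrsim \frac{\log(1/\mathrm{MaxInf}[f])}{n \log n} \cdot \mathbf{Var}[f].$$ One might ask whether this inequality can nevertheless be improved. Unfortunately, the answer is no:
Theorem 3. For the Cayley graph on $S_n$ ($n > 1$) with generating set $U$ equal to all transpositions, there is a function $f : S_n \to \{0,1\}$ with $\mathbf{Var}[f] = \Theta(1)$ and $I_u[f] \leq \frac{2}{n}$ for all $u \in U$.
The proof is short enough that we can give it here. Say that $f(\sigma) = 1$ if $\sigma$ is a derangement (i.e., has no fixed point) and $f(\sigma) = 0$ otherwise. It is well known [?] that the fraction of permutations in $S_n$ which are derangements is $\sum_{i=0}^n \frac{(-1)^i}{i!}$, which is bounded away from $0$ and $1$ for $n > 1$; hence $\mathbf{Var}[f] = \Theta(1)$. By symmetry, all transpositions $u$ have the same influence; the influence of $(12)$, say, is $$I_{(12)}[f] = \tfrac{1}{2} \Pr_\sigma\big[f(\sigma) \neq f(\sigma \cdot (12))\big] = \Pr_\sigma\big[\sigma \text{ is a derangement and } \sigma \cdot (12) \text{ is not}\big],$$ where the second equality holds because $\sigma \mapsto \sigma \cdot (12)$ is a bijection of $S_n$. For the event in question to occur, $\sigma$ must have either $\sigma(1) = 2$ or $\sigma(2) = 1$. But the probability of this is at most $\frac{1}{n} + \frac{1}{n} = \frac{2}{n}$, completing the proof of Theorem 3.
This theorem also immediately implies:
Corollary 1. In general, one cannot replace the log-Sobolev constant $\rho$ in Theorem 1 with the modified log-Sobolev constant $\rho_0$.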
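The derangement example is small enough to verify by brute force for small $n$. The sketch below is illustrative only; it assumes, as in the rest of the paper, that the influence of a generator on a 0-1 valued $f$ is half the probability that applying the generator changes the value of $f$, and it uses 0-indexed points so that the transposition $(12)$ swaps points 0 and 1. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def is_derangement(sigma):
    """True if the permutation (a tuple of images) has no fixed point."""
    return all(sigma[i] != i for i in range(len(sigma)))

def derangement_stats(n):
    """Return (fraction of derangements, Pr[f(sigma) != f(sigma * (1 2))]) over S_n."""
    perms = list(permutations(range(n)))
    count = sum(is_derangement(s) for s in perms)
    changed = 0
    for s in perms:
        # sigma * (1 2) sends 1 -> sigma(2) and 2 -> sigma(1): swap the first two images.
        t = (s[1], s[0]) + s[2:]
        changed += is_derangement(s) != is_derangement(t)
    total = len(perms)
    return Fraction(count, total), Fraction(changed, total)

for n in range(2, 8):
    frac, flip = derangement_stats(n)
    # Influence is assumed to be flip / 2; compare with the bound 2/n from the proof.
    print(n, frac, flip / 2, flip / 2 <= Fraction(2, n))
```

The derangement fraction settles near $1/e$ quickly, which is why the variance stays of constant order, while the computed influence never exceeds the $\frac{1}{n} + \frac{1}{n}$ bound from the argument above.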
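The modified log-Sobolev constant $\rho_0$, which always satisfies $\rho_0 \geq \rho$, was introduced in several papers, dating back to [Wu00]; it is defined to be the largest constant such that Ent
Corollary 1 follows from Theorem 3 because it is known [GQ03,Goe04,BT06] that the modified log-Sobolev constant for the Cayley graph of $S_n$ with transpositions is $\rho_0 = \Theta(\frac{1}{n})$: replacing $\rho$ by $\rho_0$ in Theorem 1 would force $\mathcal{E}[f] \gtrsim \frac{\log n}{n}$ for the function $f$ of Theorem 3, whereas every influence of that $f$ is $O(\frac{1}{n})$.
Another natural setting for which Theorem 1 does not give a strong result is the Cayley graph on $\mathbb{Z}_m^n$ with standard generating set $U = \{\pm e_i : i \in [n]\}$. It is known that the log-Sobolev constant for this Cayley graph satisfies $\rho = \Theta(\frac{1}{m^2 n})$. In contrast to the case of $S_n$, we show that Theorem 1 can be improved for $\mathbb{Z}_m^n$.
Theorem 4. For any $f : \mathbb{Z}_m^n \to \{0,1\}$ it holds that $$\mathcal{E}[f] \gtrsim \frac{1}{mn} \cdot \log\left(\frac{1}{m \cdot \mathrm{MaxInf}[f]}\right) \cdot \mathbf{Var}[f].$$ In particular, $$\mathrm{MaxInf}[f] \gtrsim \frac{\log n}{mn} \cdot \mathbf{Var}[f].$$ Further, these inequalities can be sharp up to a constant factor.
Finally, one may ask whether Talagrand's strengthening of the KKL Theorem also holds in the Schreier graph setting. We establish this using a method of proof alluded to in Talagrand's paper [Tal94].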
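Theorem 5. Let $\mathrm{Sch}(G, X, U)$ be a Schreier graph with log-Sobolev constant $\rho$. Assume that $U$ is closed under conjugation. Then for all $f : X \to \{0,1\}$ it holds that $$\mathop{\mathrm{avg}}_{u \in U} \left\{ \frac{I_u[f]}{\log(1/I_u[f])} \right\} \gtrsim \rho \cdot \mathbf{Var}[f].$$ In particular, this proves the original KKL Theorem with a proof that has not previously appeared in the literature.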

Organization of the remainder of the paper
In Section 2 we show Theorem 2, establishing that the condition that $U$ be closed under conjugation cannot be removed, even for Cayley graphs. The example takes place on the semidirect product $\mathbb{Z}_2^n \rtimes \mathbb{Z}_n$. In Section 3 we show Theorem 4 regarding $\mathbb{Z}_m^n$; the proof follows from a simple combinatorial compression argument combined with the "BKKKL generalization" [BKK+92] of the KKL Theorem. Finally, in Section 4 we prove Theorem 5, the generalization of the Talagrand Theorem to the Schreier graph setting. This proof uses Orlicz norms and is probably the most technically interesting part of the paper.
In this section we prove Theorem 2, establishing that the condition that $U$ be closed under conjugation cannot be removed from Theorem 1. Our counterexample will take place on a Cayley graph, $\mathrm{Cay}(G, U)$. The group $G$ is the semidirect product $\mathbb{Z}_2^n \rtimes \mathbb{Z}_n$, where $\mathbb{Z}_n$ acts on $\mathbb{Z}_2^n$ by the natural cyclic shift of coordinates. I.e., for $(x, i), (y, j) \in \mathbb{Z}_2^n \times \mathbb{Z}_n$ the group multiplication is given by $$(x, i) \bullet (y, j) = (x + y^{\pi_i},\ i + j), \qquad (1)$$ where $y^{\pi_i} \in \mathbb{Z}_2^n$ is the vector given by cyclically shifting $y$'s coordinates $i$ places to the right. We take $U$ to be the following symmetric generating set of $2n$ elements: $$U = \{(e_i, 0) : i \in [n]\} \cup \{(0, j) : j \in \mathbb{Z}_n\}.$$ (If one prefers not to have the group identity in the generating set, it is not hard to alter our argument so that it works for $U \setminus \{(0, 0)\}$.) We remark that $U$ is not closed under conjugation; e.g., $$(e_1, 0) \bullet (0, 1) \bullet (e_1, 0)^{-1} = (e_1, 0) \bullet (0, 1) \bullet (e_1, 0) = (e_1, 0) \bullet (e_2, 1) = (e_1 + e_2, 1) \neq (0, 1),$$ and $(0, 1)$ is the only element of $U$ whose second coordinate is $1$.
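The conjugation computation above is easy to replay mechanically. The following sketch is an illustration for the small hypothetical instance $n = 4$; it assumes the multiplication rule $(x, i) \bullet (y, j) = (x + y^{\pi_i}, i + j)$ with $y^{\pi_i}$ the right cyclic shift by $i$ places, which is the rule consistent with the worked conjugation in the text. All function names are hypothetical.

```python
n = 4  # small illustrative instance; any n >= 2 works

def shift(y, i):
    """Cyclically shift the coordinates of y by i places to the right."""
    return tuple(y[(k - i) % n] for k in range(n))

def mul(a, b):
    """Assumed rule: (x, i) * (y, j) = (x + shift(y, i), i + j)."""
    (x, i), (y, j) = a, b
    s = shift(y, i)
    return (tuple((p + q) % 2 for p, q in zip(x, s)), (i + j) % n)

def inverse(a):
    """Inverse of (x, i) is (shift(x, -i), -i)."""
    x, i = a
    return (shift(x, -i), (-i) % n)

def e(k):
    """Standard basis vector e_k for k = 1..n, as in the text."""
    return tuple(1 if t == k - 1 else 0 for t in range(n))

zero = (0,) * n
identity = (zero, 0)
U = {(e(k), 0) for k in range(1, n + 1)} | {(zero, j) for j in range(n)}

a, b = (e(1), 0), (zero, 1)
conjugate = mul(mul(a, b), inverse(a))   # (e_1, 0) * (0, 1) * (e_1, 0)^{-1}
print(conjugate, conjugate in U)
```

The computed conjugate is $(e_1 + e_2, 1)$, which lies outside $U$, while every element of $U$ has its inverse in $U$, so the set is symmetric but not conjugation-closed.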
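To show that the conclusion of Theorem 1 is not satisfied for $\mathrm{Cay}(G, U)$, it suffices to establish the following two lemmas:
Lemma 1. There is a function $f : G \to \{0,1\}$ with $\mathbf{Var}[f] = \frac{1}{4}$ and $I_u[f] \leq \frac{1}{2n}$ for all $u \in U$.
Lemma 2. The log-Sobolev constant of $\mathrm{Cay}(G, U)$ is $\rho_G = \frac{1}{n}$.
Indeed, for the function of Lemma 1 we have $\mathcal{E}[f] \leq \frac{1}{2n}$, whereas $\log(1/\mathrm{MaxInf}[f]) \cdot \rho_G \cdot \mathbf{Var}[f] \geq \frac{\log(2n)}{4n}$.
Proof. (Lemma 1.) Take $f(x, i) = x_1$. Then $\mathbf{E}[f] = \frac{1}{2}$ and hence $\mathbf{Var}[f] = \frac{1}{4}$. The generators $(0, i) \in U$ have $0$ influence on $f$, and it is easy to calculate that the generators $(e_i, 0)$ have influence $\frac{1}{2n}$ each.
We now prove the second lemma.
Proof. (Lemma 2.) We determine $\rho_G$ by comparing it with the log-Sobolev constant $\rho_H$ of a related Cayley graph. Specifically, let $X$ be the set $\mathbb{Z}_2^n \times \mathbb{Z}_n$; we write $G = (X, \bullet)$, where $\bullet$ is the group multiplication defined in (1). Let $H$ be the abelian direct product group $\mathbb{Z}_2^n \times \mathbb{Z}_n$; we write $H = (X, +)$. We may interpret $U \subseteq X$ as both a subset of $G$ and of $H$; it is a symmetric generating set of both. We may interpret any function $f : X \to \mathbb{R}$ as being both a function on $G$ and on $H$; we distinguish the influence of $u \in U$ on $f$ within $G$ and within $H$ as $I_u^G[f]$ and $I_u^H[f]$, and similarly write $\mathcal{E}_G[f]$ and $\mathcal{E}_H[f]$. First, $I_u^G[f] = I_u^H[f]$ for any $u = (0, j)$, since $x \bullet u = x + u$ for such $u$. Second, for the generators in $U$ of the form $(e_i, 0)$ we have $$\mathop{\mathrm{avg}}_{i \in [n]} I_{(e_i, 0)}^G[f] = \mathop{\mathrm{avg}}_{i \in [n]} I_{(e_i, 0)}^H[f],$$ since for $x = (y, k)$ we have $x \bullet (e_i, 0) = x + (e_{i+k}, 0)$, with indices taken modulo $n$. Thus $\mathcal{E}_G[f] = \mathcal{E}_H[f]$ for all $f : X \to \mathbb{R}$ and since $\mathbf{Ent}[f]$ does not depend on the group structure, it follows that $\rho_G = \rho_H$. It thus remains to show that $\rho_H = \frac{1}{n}$. The random walk on $\mathrm{Cay}(H, U)$ is the product of two random walks, one the standard random walk on $\mathbb{Z}_2^n$ and one the random walk on the complete graph $K_n$ with self-loops. It follows [DSC96, Lemma 3.2] that $\rho_H = \frac{1}{n}$.
Here we prove Theorem 4.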
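Both lemmas rest on explicit influence calculations that can be replayed for a small case. The sketch below is illustrative and makes two labeled assumptions: the test function for the first lemma is taken to be $f(x, i) = x_1$, and the twisted multiplication is $(x, i) \bullet (y, j) = (x + y^{\pi_i}, i + j)$. It computes influences in $\mathrm{Cay}(G, U)$ and compares the energy of an arbitrary function under the twisted product and under the direct product on the same underlying set. All names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import product

n = 4
X = [(x, i) for x in product((0, 1), repeat=n) for i in range(n)]

def shift(y, i):
    """Cyclic right shift of y by i places."""
    return tuple(y[(k - i) % n] for k in range(n))

def add(x, y):
    return tuple((p + q) % 2 for p, q in zip(x, y))

def mul_G(a, b):
    """Twisted product (assumed rule): (x, i) * (y, j) = (x + shift(y, i), i + j)."""
    return (add(a[0], shift(b[0], a[1])), (a[1] + b[1]) % n)

def mul_H(a, b):
    """Direct product: (x, i) + (y, j) = (x + y, i + j)."""
    return (add(a[0], b[0]), (a[1] + b[1]) % n)

def e(k):
    return tuple(1 if t == k - 1 else 0 for t in range(n))

zero = (0,) * n
flips = [(e(k), 0) for k in range(1, n + 1)]
rotations = [(zero, j) for j in range(n)]
U = flips + rotations

def influence(f, u, mul):
    """(1/2) E_x[(f(x) - f(x u))^2] with x uniform on X."""
    return Fraction(sum((f[a] - f[mul(a, u)]) ** 2 for a in X), 2 * len(X))

def energy(f, mul):
    """Average of the influences over all generators in U."""
    return sum(influence(f, u, mul) for u in U) / len(U)

f = {a: a[0][0] for a in X}                  # assumed test function f(x, i) = x_1
random.seed(0)
h = {a: random.randint(-3, 3) for a in X}    # arbitrary function for the energy comparison
print(energy(f, mul_G), energy(h, mul_G) == energy(h, mul_H))
```

The energies under the two group structures agree exactly even though individual influences of the generators $(e_i, 0)$ can differ, which is the point of averaging over $i$ in the comparison argument.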
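Theorem 4 restated. For any $f : \mathbb{Z}_m^n \to \{0,1\}$ it holds that $$\mathcal{E}[f] \gtrsim \frac{1}{mn} \cdot \log\left(\frac{1}{m \cdot \mathrm{MaxInf}[f]}\right) \cdot \mathbf{Var}[f]. \qquad (2)$$ In particular, $$\mathrm{MaxInf}[f] \gtrsim \frac{\log n}{mn} \cdot \mathbf{Var}[f].$$ Further, these inequalities can be sharp up to a constant factor.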
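Proof. Let $g : \mathbb{Z}_m^n \to \{0,1\}$. For $x \in \mathbb{Z}_m^n$ and $i \in [n]$, we write $g_{i,x} : \mathbb{Z}_m \to \{0,1\}$ for the function $g_{i,x}(a) = g(x_1, \dots, x_{i-1}, a, x_{i+1}, \dots, x_n)$; we also identify $g_{i,x}$ with a subset of $\mathbb{Z}_m$. It follows from the definitions that $$I_{e_i}[g] = I_{-e_i}[g] = \frac{1}{2m}\,\mathbf{E}_x\big[|\partial g_{i,x}|\big],$$ where we write $\partial g_{i,x} = \{a \in \mathbb{Z}_m : g_{i,x}(a) \neq g_{i,x}(a + 1)\}$. We will write simply $I_i[g]$ for this common value.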
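A quick numerical look at the boundary sets just defined may help. The sketch below is illustrative, with hypothetical helper names: for every subset $A$ of $\mathbb{Z}_6$ it counts the cyclic boundary $|\{a : A(a) \neq A(a+1)\}|$ and compares it with the boundary of the initial segment $\{0, \dots, |A| - 1\}$ of the same size, the shape that the compression argument in this proof works with.

```python
from itertools import product

def boundary_size(A, m):
    """|{a in Z_m : A(a) != A(a+1)}| for a 0/1 tuple A of length m."""
    return sum(A[a] != A[(a + 1) % m] for a in range(m))

def compress(A):
    """Replace A by the initial segment {0, ..., |A|-1} of the same size."""
    k = sum(A)
    return tuple(1 if a < k else 0 for a in range(len(A)))

m = 6
checks = []
for A in product((0, 1), repeat=m):
    C = compress(A)
    # The initial segment never has a larger boundary, and its boundary is 0 or 2.
    checks.append(boundary_size(C, m) <= boundary_size(A, m)
                  and boundary_size(C, m) in (0, 2))

min_nonconstant = min(boundary_size(A, m)
                      for A in product((0, 1), repeat=m) if 0 < sum(A) < m)
print(all(checks), min_nonconstant)
```

On the cycle every nonconstant set has at least two boundary points, and initial segments attain exactly two; on a grid (a path) the minimum would be one, which is the only place the torus differs.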
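Consider the $j$th compression operator $\sigma_j$ for $j \in [n]$; one may define $\sigma_j g : \mathbb{Z}_m^n \to \{0,1\}$ by $(\sigma_j g)_{j,x} = \{0, 1, \dots, |g_{j,x}| - 1\}$ for each $x \in \mathbb{Z}_m^n$. It is a familiar fact in the study of influences (see, e.g., [BOL90]) that compressing a function does not increase any of its influences. In our particular context of $\mathbb{Z}_m^n$ the proof is straightforward and essentially appears in [BL91]. (That paper studies 'grids' rather than our 'discrete torus'; the only difference this makes is for the claim that $I_j[\sigma_j g] \leq I_j[g]$.) Writing $\tilde{f} = \sigma_1 \sigma_2 \cdots \sigma_n f$, we therefore have: (i) $\mathbf{Var}[\tilde{f}] = \mathbf{Var}[f]$; (ii) $I_i[\tilde{f}] \leq I_i[f]$ for all $i \in [n]$; (iii) for each $i \in [n]$ and $x \in \mathbb{Z}_m^n$, the set $\tilde{f}_{i,x}$ is of the form $\{0, 1, \dots, a - 1\}$.
Given any $f : \mathbb{Z}_m^n \to \{0,1\}$, the first two facts above show that to prove (2) for $f$, it suffices to prove it for $\tilde{f}$. Thus without loss of generality we may assume $f$ satisfies the third condition above: for each $i \in [n]$ and $x \in \mathbb{Z}_m^n$ it holds that $f_{i,x} = \{0, 1, \dots, a - 1\}$ for some $0 \leq a \leq m$. In this case, note that $|\partial f_{i,x}|$ is always either $0$ or $2$, depending on whether or not $f_{i,x}$ is a constant. It follows that $$I_i[f] = \frac{1}{m} \cdot \tilde{I}_i[f],$$ where $\tilde{I}_i[f] = \Pr_x[f_{i,x} \text{ is not constant}]$ denotes the influence of the $i$th coordinate on $f$ in the sense of Bourgain-Kahn-Kalai-Katznelson-Linial [BKK+92]. Hence proving (2) for $f$ is equivalent to proving $$\mathop{\mathrm{avg}}_{i \in [n]} \tilde{I}_i[f] \gtrsim \frac{1}{n} \cdot \log\left(\frac{1}{\max_{i} \tilde{I}_i[f]}\right) \cdot \mathbf{Var}[f]. \qquad (3)$$ But the inequality (3) was proved by Friedgut and Kalai [FK96] (building on [BKK+92]) for any $f : \Omega^n \to \{0,1\}$, where $\Omega^n$ is a product probability space.
Finally, to show that (2) may be sharp up to a constant, consider functions $f : \mathbb{Z}_m^n \to \{0,1\}$ of the form $f(x) = h(\lfloor 2x_1/m \rfloor, \dots, \lfloor 2x_n/m \rfloor)$, where $h : \{0,1\}^n \to \{0,1\}$. Then inequality (2) is sharp up to a constant for $f$ if and only if inequality (3) is sharp up to a constant for $h$ with respect to $\Omega$, a $p$-biased probability space on $\{0,1\}^n$. Here $p = 1/2$ if $m$ is even and $p = 1/2 - 1/(2m) \in [1/3, 1/2)$ if $m$ is odd. In either case, it is well known [FK96] that there are function families $h$ (namely "Tribes") which are sharp for (3) on $\Omega$ up to a universal constant.
Recall Talagrand's Theorem, which generalizes the KKL Theorem: for all $f : \{0,1\}^n \to \{0,1\}$, $$\mathop{\mathrm{avg}}_{i \in [n]} \left\{ \frac{I_i[f]}{\log(1/I_i[f])} \right\} \gtrsim \frac{1}{n} \cdot \mathbf{Var}[f]. \qquad (4)$$ In fact, in [Tal94] Talagrand also proved a version of this result for $\{0,1\}^n$ equipped with the $p$-biased measure, $p \neq 1/2$. Talagrand straightforwardly deduced (4) from the following Fourier-theoretic inequality:
Talagrand's Inequality. For $g : \{0,1\}^n \to \mathbb{R}$, $$\sum_{S \neq \emptyset} \frac{\hat{g}(S)^2}{|S|} \lesssim \|g\|_M^2. \qquad (5)$$ Here $\|\cdot\|_M$ denotes a certain Orlicz-type norm with $M(t) \sim t^2/\log t$.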
Talagrand's proof of (a $p$-biased version of) (5) is slightly lengthy. It relies in part on a hypercontractive inequality for $\{0,1\}^n$ under the $p$-biased distribution, which Talagrand proves by reduction to the standard ([Bon70]) $p = 1/2$ case. After the proof, Talagrand remarks that one can obtain (5) in the $p = 1/2$ case "by duality from an inequality of L. Gross [the log-Sobolev inequality], that itself follows from [the hypercontractive inequality]". However, it would be two years before the sharp log-Sobolev and hypercontractive constants for the $p$-biased distribution were obtained [DSC96]; as Talagrand wrote, this "creates complications in using this [log-Sobolev and duality] approach when $p \neq 1/2$". In this section we work out the approach Talagrand presumably had in mind, and show that it can be extended to the setting of Schreier graphs.

Talagrand's key inequality for Markov chains
We now show how to generalize Talagrand's key inequality (5) to the setting of Markov chains. These Markov chains need not be random walks on Schreier graphs; for this subsection, we merely assume that we have an irreducible Markov chain on a finite state space $X$ with a transition matrix $K$ which is not necessarily reversible. We write $\pi$ for the (unique) probability distribution on $X$ which is invariant for $K$, and $\ell^2(\pi)$ for the inner product space of functions $f : X \to \mathbb{R}$ with inner product $\langle f, g \rangle = \mathbf{E}_{x \sim \pi}[f(x)g(x)]$. We also let $L = \mathrm{id} - K$ be the (normalized) Laplacian of the Markov chain. The entropy and energy of functions $f \in \ell^2(\pi)$ are defined as in the Schreier graph case, as is the log-Sobolev constant of the chain. We write $\ell_0^2(\pi)$ for the subspace of functions $f$ with $\mathbf{E}_{x \sim \pi}[f(x)] = 0$. By the assumption that the chain is irreducible, $L$ is invertible when restricted to $\ell_0^2(\pi)$; we write $L^{-1}$ for its inverse on this subspace. For example, in the setting of the natural random walk on $\{0,1\}^n$ we have $$Lf = \sum_{S \subseteq [n]} \frac{2|S|}{n}\,\hat{f}(S)\,\chi_S, \qquad L^{-1}f = \sum_{S \neq \emptyset} \frac{n}{2|S|}\,\hat{f}(S)\,\chi_S$$ for all $f \in \ell^2(\pi)$, $\ell_0^2(\pi)$ respectively.
The generalization of Talagrand's inequality (5) involves certain gauge norms on $\ell^2(\pi)$. Let follows immediately from the log-Sobolev inequality because $\mathcal{E}[f + c] = \mathcal{E}[f]$ for all constants $c$.
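For the hypercube example mentioned in this subsection, the action of $L$ on the Fourier basis can be verified directly. The sketch below is an illustration for $n = 3$, assuming the natural walk flips a uniformly random coordinate; it checks that each character $\chi_S$ is an eigenfunction of $L$ with eigenvalue $2|S|/n$, so that $L^{-1}$ multiplies $\chi_S$ by $n/(2|S|)$ for $S \neq \emptyset$. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

n = 3
cube = list(product((0, 1), repeat=n))

def chi(S):
    """Character chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return {x: (-1) ** sum(x[i] for i in S) for x in cube}

def flip(x, i):
    """Return x with coordinate i flipped."""
    return x[:i] + (1 - x[i],) + x[i + 1:]

def laplacian(f):
    """(L f)(x) = f(x) - avg_i f(x with coordinate i flipped)."""
    return {x: f[x] - Fraction(sum(f[flip(x, i)] for i in range(n)), n) for x in cube}

subsets = [S for r in range(n + 1) for S in combinations(range(n), r)]
eigen_ok = all(
    laplacian(chi(S)) == {x: Fraction(2 * len(S), n) * v for x, v in chi(S).items()}
    for S in subsets
)
print(eigen_ok)
```

The only character with eigenvalue $0$ is the constant $\chi_\emptyset$, which is why $L$ is invertible exactly on the mean-zero subspace.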
We now state and prove the generalization of Talagrand's inequality (5) to our setting of Markov chains:
Theorem 6. For all $g \in \ell_0^2(\pi)$, $$\langle g, L^{-1} g \rangle \lesssim \rho^{-1} \cdot \|g\|_M^2.$$
as required, where the first inequality uses generalized Hölder, the second inequality uses M ∼ M and N ∼ M , the third inequality uses Proposition 3, and the fourth inequality is the log-Sobolev inequality (or rather, (6)).

The Talagrand Theorem for Schreier graphs
We now return to our setting of Schreier graphs and prove Theorem 5, the generalization of the Talagrand Theorem. We begin with a simple calculation (cf. [KR61, (9.23)]):
With this calculation in hand, we are able to deduce Theorem 5 from Theorem 6. The deduction is not quite as straightforward as in Talagrand's case, since our operators $L_u$ are not self-adjoint.
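Theorem 5 restated. Let $\mathrm{Sch}(G, X, U)$ be a Schreier graph with log-Sobolev constant $\rho$. Assume that $U$ is closed under conjugation. Then for all $f : X \to \{0,1\}$ it holds that $$\mathop{\mathrm{avg}}_{u \in U} \left\{ \frac{I_u[f]}{\log(1/I_u[f])} \right\} \gtrsim \rho \cdot \mathbf{Var}[f].$$
Proof. Given $u \in U$, let $g = L_u f$. Since $x^u$ is uniformly distributed when $x$ is, it follows that $\mathbf{E}_{x \sim \pi}[g(x)] = 0$. Hence we may apply Theorem 6, obtaining $$\langle g, L^{-1} g \rangle \lesssim \rho^{-1} \cdot \|g\|_M^2. \qquad (7)$$ Since $|g|$ is the 0-1 indicator of a set of measure $2I_u[f]$, we conclude from Fact 7 that $$\|g\|_M^2 \lesssim \frac{I_u[f]}{\log(1/I_u[f])}$$ (using $0 \leq I_u[f] \leq 1/2$). Thus from (7) we deduce $$\langle L_u f, L^{-1} L_u f \rangle \lesssim \rho^{-1} \cdot \frac{I_u[f]}{\log(1/I_u[f])}.$$ Since $U$ is closed under conjugation, it follows that $L_u$ commutes with $L$ (see [OW09]). Hence $L_u$ commutes with $L^{-1}$ on $\ell_0^2(\pi)$ and, writing $\bar{f} = f - \mathbf{E}[f] \in \ell_0^2(\pi)$ so that $g = L_u \bar{f}$, we obtain $$\langle L_u f, L^{-1} L_u f \rangle = \langle L_u f, L_u L^{-1} \bar{f} \rangle = \langle f, L_u L^{-1} \bar{f} \rangle + \langle f, L_{u^{-1}} L^{-1} \bar{f} \rangle$$ using Propositions 2 and 1. Thus $$\langle f, (L_u + L_{u^{-1}}) L^{-1} \bar{f} \rangle \lesssim \rho^{-1} \cdot \frac{I_u[f]}{\log(1/I_u[f])}.$$ Averaging over $u \in U$ and noting that $\mathrm{avg}_{u \in U}\{L_u\} = L = \mathrm{avg}_{u \in U}\{L_{u^{-1}}\}$ (because $U$ is closed under inverses), we get $$2\,\mathbf{Var}[f] = 2 \langle f, L L^{-1} \bar{f} \rangle \lesssim \rho^{-1} \cdot \mathop{\mathrm{avg}}_{u \in U} \left\{ \frac{I_u[f]}{\log(1/I_u[f])} \right\},$$ completing the proof.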