On the central limit theorem for the two-sided descent statistics in Coxeter groups

In 2018, Kahle and Stump raised the following problem: identify sequences of finite Coxeter groups $W_n$ for which the two-sided descent statistics on a uniform random element of $W_n$ is asymptotically normal. Recently, Br\"uck and R\"ottger provided an almost-complete answer, assuming some regularity condition on the sequence $W_n$. In this note, we provide a shorter proof of their result, which does not require any regularity condition. The main new proof ingredient is the use of the second Wasserstein distance on probability distributions, based on the work of Mallows (1972).

We recall that a sequence of random variables $(X_n)_{n \ge 0}$ is said to be asymptotically normal if $\frac{X_n - E[X_n]}{\sqrt{\mathrm{Var}(X_n)}}$ converges in distribution to a standard normal random variable $Z \sim N(0,1)$. Asymptotic normality of permutation statistics is a vast topic in discrete probability, dating back at least to Goncharov [Gon44] and Hoeffding [Hoe51]; we refer also to [Vat96, Ful04, CD17, Özd19] for more recent works on the descent and two-sided descent statistics. Recently, there has been some interest in generalizing such asymptotic normality results to statistics of Coxeter group elements$^1$. In particular, Kahle and Stump [KS19] have given necessary and sufficient conditions on a sequence $W_n$ of finite Coxeter groups so that the number of inversions (resp. of descents) of a uniform random element in $W_n$ is asymptotically normal. They then asked for a similar characterization for the two-sided descent statistics $t$ defined as follows: for an element $w$ of a Coxeter group $W$, we set $t(w) = \mathrm{des}(w) + \mathrm{des}(w^{-1})$, where $\mathrm{des}(w)$ is the number of descents of $w$. Unlike for inversions and descents, the two-sided descent statistics on a uniform random element does not decompose as a sum of independent Bernoulli variables, making the problem more difficult. For further background on the topic, we refer to [KS19] and [BR19].
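To make the statistic $t$ concrete in type $A$, where group elements are permutations, the following minimal sketch enumerates the symmetric group on $n$ letters (type $A_{n-1}$) and computes the exact mean and variance of $t$. It assumes the usual convention that a position $i$ is a descent of $w$ when $w(i) > w(i+1)$; all function names are illustrative and are not taken from [KS19] or [BR19]. The enumeration is brute force, so it is only meant for small $n$.

```python
from fractions import Fraction
from itertools import permutations


def des(w):
    """Number of descents of a permutation w given in one-line notation."""
    return sum(1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def inverse(w):
    """Inverse of a permutation of {0, ..., n-1} given in one-line notation."""
    inv = [0] * len(w)
    for position, value in enumerate(w):
        inv[value] = position
    return tuple(inv)


def two_sided_descents(w):
    """The statistic t(w) = des(w) + des(w^{-1})."""
    return des(w) + des(inverse(w))


def mean_and_variance_of_t(n):
    """Exact mean and variance of t over the symmetric group on n letters."""
    values = [two_sided_descents(w) for w in permutations(range(n))]
    count = len(values)
    mean = Fraction(sum(values), count)
    variance = Fraction(sum(v * v for v in values), count) - mean ** 2
    return mean, variance


if __name__ == "__main__":
    # Brute-force enumeration: n! permutations, so keep n small.
    for n in range(2, 7):
        mean, variance = mean_and_variance_of_t(n)
        print(n, mean, variance)
```

Exact rational arithmetic is used so that the printed variances can be compared directly with closed formulas from the literature.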
The main result of this note is a complete answer to the Kahle-Stump question.
Theorem 1. Let $(W_n)_{n \ge 1}$ be a sequence of finite Coxeter groups. For each $n$, we let $w_n$ be a uniform random element in $W_n$. Then the following assertions are equivalent:
a) the sequence $t(w_n)$ is asymptotically normal;
b) $\mathrm{Var}\, t(w_n)$ tends to $+\infty$.

A version of this result under a regularity condition on the sequence $(W_n)_{n \ge 1}$ was obtained by Brück and Röttger ([...] 1 of [BR19]). In addition to not requiring any regularity assumption, the proof that we provide here is shorter. In particular, we do not need any fourth moment estimates.

$^1$ For the reader's convenience, we provide an appendix with the necessary definitions regarding Coxeter groups, in particular the notion of descent.
As in [BR19], we will take for granted that asymptotic normality holds when $(W_n)_{n \ge 1}$ is one of the infinite families $A_n$, $B_n$ and $D_n$; this was proved previously in [Vat96, CD17, Röt18, Özd19]. In addition to the fact that $t(w)$ is bounded by $2\,\mathrm{rk}(W)$, this is the only specific information we will need on the two-sided descent statistics. All other arguments are of probabilistic nature. In particular, we shall use characteristic function analysis and Lindeberg-type arguments to prove the asymptotic normality (as in [BR19]). We also introduce a new proof ingredient: the second Wasserstein metric for probability distributions.
We first recall the definition of this Wasserstein metric, and some useful properties of it, and then proceed to the proof of the main theorem.
We say that a probability distribution on $\mathbb{R}$ is square integrable if a random variable with that probability distribution is.
Definition 2 (see Lemma 2 in [Mal72]). Let $\mu$ and $\nu$ be square integrable probability distributions on $\mathbb{R}$. Then we define
$$d_2(\mu, \nu) = \inf_{(X,Y)} \|X - Y\|_2 = \inf_{(X,Y)} \sqrt{E\big[(X - Y)^2\big]},$$
where the infimum is taken over all pairs $(X, Y)$ of random variables defined on the same probability space and with distributions $\mu$ and $\nu$, respectively.
As usual in probability theory, we sometimes identify a random variable and its distribution: namely for random variables $Z$ and $T$ (not necessarily on the same probability space), we write $d_2(Z, T) = d_2(P_Z, P_T)$, where $P_Z$ and $P_T$ are the distributions of $Z$ and $T$.
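On the real line, the second Wasserstein distance between two empirical distributions with the same number of atoms can be computed by pairing the sorted values, since the monotone coupling is optimal for the quadratic cost. The sketch below illustrates this special case only; the function name `empirical_d2` is hypothetical, and the equal-size restriction is an assumption made to keep the example short.

```python
import math


def empirical_d2(xs, ys):
    """d_2 between the empirical distributions of two equal-size samples.

    Assumes both samples are non-empty and have the same size, so that
    the optimal coupling simply pairs the i-th smallest values.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    if not xs:
        raise ValueError("samples must be non-empty")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))


if __name__ == "__main__":
    # The order of the sample points does not matter, only the distributions.
    print(empirical_d2([0, 1], [1, 0]))
    # Translating a distribution by c moves it at d_2-distance |c|.
    print(empirical_d2([0, 2], [1, 3]))
```

In particular, the distance depends only on the two distributions and not on how the samples are ordered, in line with the identification of a random variable with its distribution.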
The introduction of the Wasserstein metric (using the $L^1$ norm instead of the $L^2$ norm, and for general metric spaces) is usually attributed to Wasserstein (sometimes also spelled Vasershtein), though it seems that it appeared in several earlier works [EOM11]. The $L^2$ case and its relation with asymptotic normality were studied by Mallows [Mal72]. In particular, he established the following lemmas (Lemmas 1 and 3 in [Mal72]):

Lemma 3. Let $X_n$ and $X$ be square integrable random variables. Then $d_2(X_n, X)$ tends to 0 if and only if $X_n \to X$ in distribution and $E[X_n^2] \to E[X^2]$.

Lemma 4. Let $k > 0$ be an integer and $Z$ be a standard normal random variable. If $X_1, \dots, X_k$ are independent, centered, square integrable random variables and $(a_j)_{j \le k}$ are real coefficients with $\sum_{j \le k} a_j^2 = 1$, then
$$d_2\Big(\sum_{j \le k} a_j X_j,\, Z\Big)^2 \le \sum_{j \le k} a_j^2\, d_2(X_j, Z)^2.$$

We can now prove the main result of this note.
Proof of Theorem 1. The implication a) ⇒ b) is immediate: since $t(w_n)$ is integer valued, it cannot tend to a continuous distribution without a renormalization factor tending to $+\infty$; see [KS19, Proposition 6.15] for details. We focus on b) ⇒ a) and assume that $\mathrm{Var}\, t(w_n)$ tends to $+\infty$.
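For each $n \ge 1$, we can decompose the group $W_n$ as a direct product of irreducible factors $W_n = \prod_{j \le r_n} W_{n,j}$. For each $j \le r_n$, we denote by $w_{n,j}$ a uniform random element in $W_{n,j}$ and by $t_{n,j} = t(w_{n,j})$ the associated two-sided descent statistics. Setting $t_n = t(w_n)$, we have the following decomposition:
$$t_n = \sum_{j \le r_n} t_{n,j}, \tag{1}$$
where the $t_{n,j}$ in the right-hand side are independent; see [BR19, Lemma 2.2]. We denote $s_{n,j}^2 = \mathrm{Var}\, t(w_{n,j})$ and $s_n^2 = \sum_{j \le r_n} s_{n,j}^2 = \mathrm{Var}\, t(w_n)$. Introducing the renormalized random variables
$$\tilde t_n = \frac{t_n - E[t_n]}{s_n}, \qquad \tilde t_{n,j} = \frac{t_{n,j} - E[t_{n,j}]}{s_{n,j}},$$
the decomposition (1) can be written as
$$\tilde t_n = \sum_{j \le r_n} \frac{s_{n,j}}{s_n}\, \tilde t_{n,j}. \tag{2}$$
Here and in the following, all tilde variables are centered with variance 1.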
We recall that irreducible finite Coxeter groups are of the following types: $A_p$ ($p \ge 1$), $B_p$ ($p \ge 2$), $D_p$ ($p \ge 4$), $I_2(m)$ ($m \ge 3$) or one of the exceptional types ($H_3$, $H_4$, $F_4$, $E_6$, $E_7$, $E_8$) [Cox35]. We write $a_p$, $b_p$ and $d_p$ for uniform random elements in $A_p$, $B_p$ and $D_p$ respectively, and $\tilde a_p$, $\tilde b_p$ and $\tilde d_p$ for the corresponding renormalized two-sided descent statistics, e.g. $\tilde d_p = \frac{t(d_p) - E[t(d_p)]}{\sqrt{\mathrm{Var}(t(d_p))}}$. As mentioned above, from previous results [Vat96, CD17, Röt18, Özd19], we know that the three sequences $(\tilde a_p)$, $(\tilde b_p)$ and $(\tilde d_p)$ converge in distribution to a standard normal random variable $Z$. In addition, their second moment is equal to 1 for all $p$, so we also have convergence of second moments. From Lemma 3, the distributions of $\tilde a_p$, $\tilde b_p$ and $\tilde d_p$ converge to that of $Z$ for the $d_2$ metric.
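Fix $\varepsilon > 0$ (everything below, including the definitions of large and small components, depends on $\varepsilon$). We can find $p_0 = p_0(\varepsilon)$ such that for $p \ge p_0$, the renormalized two-sided descent statistics $\tilde a_p$, $\tilde b_p$ and $\tilde d_p$ in types $A_p$, $B_p$ and $D_p$ satisfy
$$d_2(\tilde a_p, Z) \le \varepsilon, \qquad d_2(\tilde b_p, Z) \le \varepsilon, \qquad d_2(\tilde d_p, Z) \le \varepsilon.$$
We now split the irreducible components $(W_{n,j})_{j \le r_n}$ into two groups: those of type $A_p$, $B_p$ or $D_p$ for some $p \ge p_0$, which we call large, and those of other types, which we call small.
Up to reordering, we can assume that there is an index $q_n = q_n(\varepsilon)$ such that large components are exactly those with $j \le q_n$.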
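We further write $s_{n,+}^2 = \sum_{j=1}^{q_n} s_{n,j}^2$ and $s_{n,-}^2 = \sum_{j=q_n+1}^{r_n} s_{n,j}^2$. We also introduce
$$\tilde t_{n,+} = \sum_{j=1}^{q_n} \frac{s_{n,j}}{s_{n,+}}\, \tilde t_{n,j}, \qquad \tilde t_{n,-} = \sum_{j=q_n+1}^{r_n} \frac{s_{n,j}}{s_{n,-}}\, \tilde t_{n,j}.$$

Estimates for the large component part. For $j \le q_n$, the component $W_{n,j}$ is of type $A_p$, $B_p$ or $D_p$ for some $p \ge p_0$, so that $d_2(\tilde t_{n,j}, Z) \le \varepsilon$ by the choice of $p_0$. Since the squares of the coefficients $s_{n,j}/s_{n,+}$ sum to 1, Lemma 4 gives
$$d_2(\tilde t_{n,+}, Z)^2 \le \sum_{j=1}^{q_n} \frac{s_{n,j}^2}{s_{n,+}^2}\, d_2(\tilde t_{n,j}, Z)^2 \le \varepsilon^2.$$
The infimum in Definition 2 is attained, so we may assume that $\tilde t_{n,+}$ and $Z$ are defined on the same probability space with $\|\tilde t_{n,+} - Z\|_2 \le \varepsilon$. Using that $u \mapsto \exp(iu)$ is a 1-Lipschitz function on $\mathbb{R}$, we have, for $\zeta$ in $\mathbb{R}$:
$$\left| E \exp\Big(i\zeta\, \frac{s_{n,+}}{s_n}\, \tilde t_{n,+}\Big) - \exp\Big(-\frac{\zeta^2 s_{n,+}^2}{2 s_n^2}\Big) \right| \le \frac{s_{n,+}}{s_n}\, |\zeta|\, E\big|\tilde t_{n,+} - Z\big| \le |\zeta|\, \big\|\tilde t_{n,+} - Z\big\|_2 \le |\zeta|\, \varepsilon, \tag{3}$$
where the second to last inequality uses $\frac{s_{n,+}}{s_n} \le 1$ and the Cauchy-Schwarz inequality.

Estimates for the small component part. Here, we will use classical characteristic function estimates, as used in the Lindeberg central limit theorem (see, e.g., [Bil86, Theorem 27.2]). By definition, small components are of some exceptional type, of type $I_2(m)$ or of type $A_p$, $B_p$ or $D_p$ for $p < p_0$. Their rank is therefore at most $\max(8, p_0)$ ($I_2(m)$ has rank 2, the largest exceptional group $E_8$ has rank 8 and $A_p$, $B_p$ or $D_p$ have rank $p$). But the two-sided descent statistics on any Coxeter group $W$ cannot exceed $2\,\mathrm{rk}(W)$. We conclude that there is a uniform bound $K = K(\varepsilon) = 2 \max(8, p_0)$ on all the $t_{n,j}$ corresponding to small components ($j > q_n$). In particular, for $j > q_n$, we have $s_{n,j} \le K$.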
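Fix $\zeta$ in $\mathbb{R}$. Using the definition of $\tilde t_{n,-}$ and the independence of the $t_{n,j}$, we have
$$E \exp\Big(i\zeta\, \frac{s_{n,-}}{s_n}\, \tilde t_{n,-}\Big) = \prod_{j=q_n+1}^{r_n} E \exp\Big(i\, \frac{\zeta}{s_n}\, \big(t_{n,j} - E[t_{n,j}]\big)\Big). \tag{4}$$
We assumed $\lim s_n = +\infty$ and argued above that $s_{n,j}$ is uniformly bounded for $j > q_n$. Thus, for $n$ sufficiently large and $j > q_n$, we have $\frac{\zeta^2}{s_n^2}\, s_{n,j}^2 \le 1$. This implies (see [Bil86, eqs. (27.11) and (27.15)]) that, for $j > q_n$, we have
$$\left| E \exp\Big(i\, \frac{\zeta}{s_n}\, \big(t_{n,j} - E[t_{n,j}]\big)\Big) - \exp\Big(-\frac{\zeta^2 s_{n,j}^2}{2 s_n^2}\Big) \right| \le \frac{|\zeta|^3}{s_n^3}\, E\big|t_{n,j} - E[t_{n,j}]\big|^3 + \frac{\zeta^4}{s_n^4}\, s_{n,j}^4. \tag{5}$$
Since $t_{n,j}$ is nonnegative and bounded by $K$, we have $E\big|t_{n,j} - E[t_{n,j}]\big|^3 \le K\, E\big|t_{n,j} - E[t_{n,j}]\big|^2 = K s_{n,j}^2$.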
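Using also $s_{n,j}^4 \le K^2 s_{n,j}^2$ and taking $n$ large enough so that $|\zeta| \le s_n$, we can simplify the upper bound in (5) to
$$\left| E \exp\Big(i\, \frac{\zeta}{s_n}\, \big(t_{n,j} - E[t_{n,j}]\big)\Big) - \exp\Big(-\frac{\zeta^2 s_{n,j}^2}{2 s_n^2}\Big) \right| \le (K + K^2)\, \frac{|\zeta|^3}{s_n^3}\, s_{n,j}^2. \tag{6}$$
We now use the following basic inequality: if $(a_i)_{i \le t}$ and $(b_i)_{i \le t}$ are collections of complex numbers of absolute values at most one, then
$$\Big| \prod_{i \le t} a_i - \prod_{i \le t} b_i \Big| \le \sum_{i \le t} |a_i - b_i|.$$
Applying it to the two terms compared in (6) for $j > q_n$, and using $\sum_{j > q_n} s_{n,j}^2 = s_{n,-}^2 \le s_n^2$, we get
$$\left| \prod_{j=q_n+1}^{r_n} E \exp\Big(i\, \frac{\zeta}{s_n}\, \big(t_{n,j} - E[t_{n,j}]\big)\Big) - \exp\Big(-\frac{\zeta^2 s_{n,-}^2}{2 s_n^2}\Big) \right| \le (K + K^2)\, \frac{|\zeta|^3 s_{n,-}^2}{s_n^3} \le (K + K^2)\, \frac{|\zeta|^3}{s_n}.$$
The first term in the left-hand side is exactly $E \exp\big(i\zeta\, \frac{s_{n,-}}{s_n}\, \tilde t_{n,-}\big)$; see (4). Since $s_n$ tends to $+\infty$, the upper bound in the last display tends to 0. Therefore for $n$ large enough, we have
$$\left| E \exp\Big(i\zeta\, \frac{s_{n,-}}{s_n}\, \tilde t_{n,-}\Big) - \exp\Big(-\frac{\zeta^2 s_{n,-}^2}{2 s_n^2}\Big) \right| \le \varepsilon. \tag{7}$$

Conclusion of the proof. We recall that $s_n^2 = s_{n,+}^2 + s_{n,-}^2$ and $\tilde t_n = \frac{s_{n,+}}{s_n}\, \tilde t_{n,+} + \frac{s_{n,-}}{s_n}\, \tilde t_{n,-}$, where $\tilde t_{n,+}$ and $\tilde t_{n,-}$ are independent, so that the characteristic function of $\tilde t_n$ is the product of those of the two summands. Using again that $|a_1 a_2 - b_1 b_2| \le |a_1 - b_1| + |a_2 - b_2|$ for numbers of absolute values at most 1, Eqs. (3) and (7) imply that, for $n$ large enough,
$$\left| E \exp\big(i\zeta\, \tilde t_n\big) - \exp\Big(-\frac{\zeta^2}{2}\Big) \right| \le (|\zeta| + 1)\, \varepsilon.$$
Since this holds for any $\varepsilon$ and any $\zeta$ in $\mathbb{R}$ (with a threshold value for $n$ depending on $\varepsilon$ and $\zeta$), we have proved that the characteristic function of $\tilde t_n$ converges pointwise towards $\exp\big(-\frac{\zeta^2}{2}\big)$, which is the characteristic function of a standard Gaussian random variable. By Lévy's continuity theorem, this concludes our proof.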
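For completeness, the basic product inequality used in the proof (the difference of two products of numbers of absolute value at most one is bounded by the sum of the termwise differences) follows from a telescoping identity. The derivation below is a standard argument and is not quoted from [Bil86]; the index $k$ is introduced here for illustration.

```latex
\prod_{i \le t} a_i - \prod_{i \le t} b_i
  = \sum_{k=1}^{t} \Big( \prod_{i < k} a_i \Big) (a_k - b_k) \Big( \prod_{i > k} b_i \Big),
\qquad \text{hence} \qquad
\Big| \prod_{i \le t} a_i - \prod_{i \le t} b_i \Big|
  \le \sum_{k=1}^{t} |a_k - b_k| ,
```

since each of the two partial products in the $k$-th summand has absolute value at most one.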
Technical comment: naive characteristic function estimates for the large component part would lead to an upper bound in (3) depending on the number $q_n$ of large components. Since we have no control on this number, we would not have been able to conclude. Using the second Wasserstein distance avoids this problem.

APPENDIX: COXETER GROUPS

Coxeter groups were introduced by Coxeter in the 1930s [Cox34, Cox35]. They are now standard objects in combinatorial geometry; we give here a short introduction to the topic to make this note self-contained. Classical references are [Hum92, Bou02, BB05].
A Coxeter matrix $M = (m_{s,t})_{s,t \in S}$ indexed by some set $S$ is a symmetric matrix with entries in $\{1, 2, 3, \dots\} \cup \{+\infty\}$ such that $m_{s,t} = 1$ if and only if $s = t$. A group $W$ is a Coxeter group if one can find a set $S$ of generators and a Coxeter matrix $M$ indexed by $S$ such that $W$ admits the presentation
$$W = \big\langle S \;\big|\; (st)^{m_{s,t}} = 1 \text{ for all } s, t \in S \text{ with } m_{s,t} < +\infty \big\rangle.$$
The pair $(W, S)$ is then called a Coxeter system. When we consider a Coxeter group $W$, we often also consider a fixed set $S$ such that $(W, S)$ is a Coxeter system. The rank of a Coxeter group (or rather of a Coxeter system) is the size of $S$. Apart from this combinatorial definition, finite Coxeter groups can also be characterized geometrically: they are finite subgroups of general linear groups generated by reflections.
The direct product of two Coxeter groups is a Coxeter group. A Coxeter group (or rather a Coxeter system) is irreducible if it cannot be written as a direct product of two smaller Coxeter groups. Trivially, any finite Coxeter group is a direct product of irreducible factors. Finite irreducible Coxeter groups were classified by Coxeter in 1935:
• There are three infinite families of increasing rank, commonly denoted $A_n$, $B_n$ and $D_n$: the group $A_n$ is the symmetric group on $n + 1$ elements, $B_n$ is the group of permutations of $n$ elements with 2 colors, and $D_n$ is an index 2 subgroup of $B_n$.
• There is one infinite family $I_2(m)$ of groups all of rank 2, called dihedral groups. These are the groups of symmetry of regular polygons.
• Finally, there are 6 exceptional groups, commonly denoted $E_6$, $E_7$, $E_8$, $F_4$, $H_3$ and $H_4$ (the index is always the rank of the group).
This classification and previous results for the infinite families are crucial in this note.
We end this appendix by defining the notion of descent in a Coxeter group studied in this note. This generalizes the notion of descents in permutations, corresponding to Coxeter groups of type $A_n$. For an element $w$ in a Coxeter group $W$, we write $\ell(w)$ for the minimal number of factors needed to write $w$ as a product of elements of $S$. Then, by definition, a generator $s$ in $S$ is a descent of $w$ if $\ell(ws) < \ell(w)$.
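As a sanity check of this definition in type $A$, the sketch below compares descents defined through the length function with the usual positional descents of permutations. It relies on the standard fact that, for the symmetric group generated by adjacent transpositions, $\ell(w)$ is the number of inversions of $w$; the function names and the 0-indexed conventions are assumptions made for this illustration.

```python
from itertools import permutations


def length(w):
    """Coxeter length in type A: the number of inversions of w."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def right_multiply(w, i):
    """One-line notation of w s_i, where s_i swaps i and i+1 (0-indexed).

    Right multiplication by s_i exchanges the entries in positions i and i+1.
    """
    v = list(w)
    v[i], v[i + 1] = v[i + 1], v[i]
    return tuple(v)


def descents_by_length(w):
    """Indices i such that l(w s_i) < l(w), following the general definition."""
    base = length(w)
    return [i for i in range(len(w) - 1) if length(right_multiply(w, i)) < base]


def descents_by_position(w):
    """Indices i such that w(i) > w(i+1), the usual permutation descents."""
    return [i for i in range(len(w) - 1) if w[i] > w[i + 1]]


if __name__ == "__main__":
    # Both notions of descent coincide on every permutation of 4 letters.
    agree = all(
        descents_by_length(w) == descents_by_position(w)
        for w in permutations(range(4))
    )
    print(agree)
```

Only right multiplication is checked here; descents of $w^{-1}$, which enter the two-sided statistic $t$, correspond to left multiplication.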