ELECTRONIC COMMUNICATIONS in PROBABILITY

Motivated by a generalization of Sturm-Lott-Villani theory to discrete spaces and by a conjecture stated by Shepp and Olkin about the entropy of sums of Bernoulli random variables, we prove the concavity in $t$ of the entropy of the convolution of a probability measure $a$, which has the law of a sum of independent Bernoulli variables, by the binomial measure of parameters $n\geq 1$ and $t$.


Introduction
Throughout this article, a probability measure $a$ on $\mathbb{Z}$ will be considered as a family of non-negative real numbers $(a(k))_{k\in\mathbb{Z}}$ satisfying $\sum_{k\in\mathbb{Z}} a(k) = 1$. The entropy of a probability measure $a$ is defined by the formula $H(a) := \sum_{k\in\mathbb{Z}} U(a(k))$, where the function $U : \mathbb{R}_+ \to \mathbb{R}$ is defined by $U(0) := 0$ and $U(x) := -x\log(x)$ for $x > 0$.
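As a small illustration of this definition, the following Python sketch computes $H(a)$ for a finitely supported measure. The dictionary representation and the names `U` and `entropy` are choices made for this example only.

```python
import math

def U(x: float) -> float:
    """U(0) = 0 and U(x) = -x log(x) for x > 0."""
    return 0.0 if x == 0 else -x * math.log(x)

def entropy(a: dict) -> float:
    """H(a) = sum over k of U(a(k)); `a` maps integers k to masses a(k)."""
    return sum(U(mass) for mass in a.values())

fair_coin = {0: 0.5, 1: 0.5}   # example measure, not taken from the paper
dirac = {3: 1.0}               # Dirac mass at 3

print(entropy(fair_coin))      # equals log(2), about 0.693
print(entropy(dirac))          # a Dirac mass has zero entropy
```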
Given an integer $n \geq 1$ and a real parameter $t \in [0,1]$, the binomial measure $(b_{n,t}(k))_{k\in\mathbb{Z}}$ is defined by $b_{n,t}(k) := \binom{n}{k} t^k (1-t)^{n-k}$ for every $k \in \{0,\dots,n\}$, and by $b_{n,t}(k) := 0$ otherwise.
The "natural derivation" operator $\nabla_{[0,n]}$ is defined on the set of functions $f : \{0,\dots,n\} \to \mathbb{R}$ by
Some properties of this operator can be found in the (forthcoming) article [2].
Equation (1.1) can be seen as a discrete version of the continuous transport equation (where $n$ can be any real number), whose solutions are determined by an initial measure $a_0$ and by the formula $a_t := a_0 * \delta_{nt}$. In other words, the family of measures $(a_t)_{t\in[0,1]}$ describes the translation of the measure $a_0$ by a distance $n$ to the right. Consequently, binomial convolutions can be seen as a canonical way to smoothly translate a probability measure on the discrete line by a distance $n$ to the right.
A curve $(a_t)_{t\in[0,1]}$ in the space of probability measures on $\mathbb{R}$ is a translation if and only if it is a Wasserstein geodesic between two measures $a_0$ and $a_1$ such that $a_1 = a_0 * \delta_n$ for some $n \in \mathbb{R}$.
Since the works of Sturm ([7], [8]) and of Lott and Villani, geometric properties of a measured length space $(X, d, \nu)$ have been linked to concavity properties of the entropy functional (with respect to $\nu$) along the geodesics of the Wasserstein space associated to $(X, d)$. In particular, these authors have proven the concavity of the entropy functional along Wasserstein geodesics on the real line. Moreover, the worst cases in this concavity theorem correspond to translations of measures, along which the entropy is constant.
This shows that a proof of the concavity of entropy along binomial convolutions could be an important first step in the description of the "non-negatively curved" behavior of the discrete line.
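The concavity in $t$ announced in the abstract can be explored numerically. The sketch below assumes that the binomial convolution is $a * b_{n,t}$, takes a hypothetical measure $a$ (the law of a sum of two independent Bernoulli variables with parameters 0.3 and 0.6), and checks that the second differences of $t \mapsto H(a * b_{n,t})$ on a grid are non-positive. It is an illustration only, not part of the proof.

```python
import math

def binomial(n, t):
    """Binomial measure b_{n,t} as a list indexed by k = 0..n."""
    return [math.comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Convolution of two finitely supported measures given as lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def entropy(a):
    """H(a) with the convention U(0) = 0."""
    return -sum(x * math.log(x) for x in a if x > 0)

def soib(ps):
    """Law of a sum of independent Bernoulli variables with parameters ps."""
    law = [1.0]
    for p in ps:
        law = convolve(law, [1 - p, p])
    return law

a = soib([0.3, 0.6])                 # hypothetical example parameters
n = 3
ts = [i / 20 for i in range(21)]     # grid on [0, 1]
H = [entropy(convolve(a, binomial(n, t))) for t in ts]

# Discrete second derivative of t -> H(a * b_{n,t}) at interior grid points.
second_diffs = [H[i - 1] - 2 * H[i] + H[i + 1] for i in range(1, 20)]
print(max(second_diffs))             # non-positive up to rounding if H is concave
```

At $t = 0$ and $t = 1$ the convolution is $a$ and its translate by $n$, so the two endpoint entropies coincide.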
The paper is organized as follows. We begin by stating some non-rigorous arguments, based on a classical Gaussian approximation of binomial laws, which suggest that Theorem 1.1 should be true. The next section is technical and presents the two main tools used in the proof of Theorem 1.1: a restatement of the transport equation, and the "modulus of log-concavity", whose non-negativity will be used to control some Taylor expansions. The last section is devoted to the proof of Theorem 1.1 itself.

Heuristics
In this section, we provide some non-rigorous arguments that explain why we should expect Theorem 1.1 to be true. More precisely, we prove two results about concavity of entropy: the first for solutions of the transport equation, and the second for convolutions of functions by a well-chosen Gaussian kernel.
Let $f(x,t)$ be a positive function satisfying the transport equation (2.1). We want to prove that its entropy function $H(t) := \int_{\mathbb{R}} U(f(x,t))\,dx$ is constant (and thus concave). Using the change of variable $y = x + nt$ gives $H(t) = H(0)$, so $H$ is constant.
The major drawback of this argument is that it does not seem possible to adapt it directly to the discrete case. It is however possible to give another proof of this fact that avoids the change of variable, by writing
Remark: Here we used an integration by parts, and this method has a discrete counterpart, namely the use of telescopic series. We will use this fact in the proof of Theorem 1.1.
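The discrete counterpart of integration by parts mentioned in this remark can be sketched as follows. For a finitely supported $f$ and any function $g$ on $\mathbb{Z}$, reindexing the telescopic sum gives $\sum_k (f(k) - f(k-1))\,g(k) = -\sum_k f(k)\,(g(k+1) - g(k))$. The functions `f` and `g` below are arbitrary test data chosen for this example.

```python
def left_diff(f, k):
    """Left difference f(k) - f(k-1) of a finitely supported f (stored as a dict)."""
    return f.get(k, 0.0) - f.get(k - 1, 0.0)

def g(k):
    """Arbitrary test function on the integers."""
    return k * k + 1.0

f = {0: 0.2, 1: 0.5, 2: 0.3}   # finitely supported, hypothetical values

ks = range(-2, 6)              # covers the support of f and of its shift
lhs = sum(left_diff(f, k) * g(k) for k in ks)
rhs = -sum(f.get(k, 0.0) * (g(k + 1) - g(k)) for k in ks)
print(lhs, rhs)                # the two sums agree up to rounding
```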
So far, the heuristic discussion suggests that the entropy of a binomial convolution is almost constant. In the rest of this section, we will show that a better continuous approximation suggests that this entropy function is actually concave.
In order to construct this "better approximation", we notice that, although constructed to be a discrete version of $\delta_{nt}$, the binomial measure $\mathrm{bin}(n,t)$ is known to be better approximated by the continuous Gaussian measure $\gamma(nt, nt(1-t))$, whose density $\rho(x,t)$ is the fundamental solution of a "modified transport equation"
Now, let $f(x,t)$ denote the convolution of a smooth probability density $f(x,0)$ by $\rho(x,t)$. Then $f(x,t)$ is a family of smooth probability densities satisfying the equation (2.2).
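The quality of the Gaussian approximation of $\mathrm{bin}(n,t)$ by $\gamma(nt, nt(1-t))$ can be inspected numerically. The sketch below compares the binomial masses with the Gaussian density at the integer points, for hypothetical values $n = 100$ and $t = 0.3$; the function names are chosen for this example only.

```python
import math

def binomial_pmf(n, t, k):
    """Mass of bin(n, t) at the integer k in {0, ..., n}."""
    return math.comb(n, k) * t**k * (1 - t)**(n - k)

def gaussian_density(x, mean, var):
    """Density at x of the Gaussian measure with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

n, t = 100, 0.3   # hypothetical example values

# Largest pointwise gap between the binomial masses and the Gaussian density.
max_gap = max(
    abs(binomial_pmf(n, t, k) - gaussian_density(k, n * t, n * t * (1 - t)))
    for k in range(n + 1)
)
print(max_gap)    # small compared with the peak mass binomial_pmf(n, t, 30)
```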
The next proposition establishes the concavity of the entropy H(t and is solution to the PDE
Until the end of the proof, we simplify the notation by denoting by $g'$, $g''$, ... the successive partial derivatives of $g$ with respect to $x$.
4) log(g) + (g ) 2 g dx.
To prove the concavity of $H$, it thus suffices to show that the last integral is non-negative, and this can be done by tricky integrations by parts:
This proves the concavity of $H(t)$ and gives an explicit value for its second derivative:
Remark: Actually, the proof of Theorem 1.1 will provide an upper bound on $\frac{\partial^2}{\partial t^2} H(t)$ which is very similar to the exact value of $\frac{\partial^2}{\partial t^2} H(t)$ found in this continuous approximation (equation (4.7)).

Technical tools
In this section, we isolate some technical lemmas in order to make the proof of Theorem 1.1 more readable. We first study some "differential" properties of binomial convolutions. Then we introduce the modulus of log-concavity and see how it can be recovered in various formulas involving binomial convolutions.
From now on, we will often omit the dependence on $t$ of the measures $(a_{n,t}(k))_{k\in\mathbb{Z}}$.

The transport equation revisited
The discrete transport equation (1.1) satisfied by binomial measures is in fact not expressed in a way that can be easily used for our purposes. Moreover, there is no nice way to generalize this equation to binomial convolutions. It is more convenient to use instead the following well-known formula.
Definition 3.1. We denote by $\nabla_1$ (resp. $\nabla_2$) the left derivation (resp. the "twice left" derivation) operator, which satisfies, for every function $f : \mathbb{Z} \to \mathbb{R}$,
Let us fix an integer $n \geq 2$ and a probability measure $a = (a(k))_{k\in\mathbb{Z}}$. The next proposition can be seen as another way to express the transport equation, which will be easier to use in the proof of Theorem 1.1:
Proposition 3.2. For any binomial convolution $a_n$:
Proof: By elementary properties of convolution products, it suffices to prove the first point of Proposition 3.2 in the special case where $a_{n,t}(k) = b_{n,t}(k)$, and this point follows from a direct calculation and the elementary formulas
The second point of the proposition follows from a double application of the first point.
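A direct calculation on binomial coefficients gives $\frac{\partial}{\partial t} b_{n,t}(k) = -n\,(b_{n-1,t}(k) - b_{n-1,t}(k-1))$, and the same relation passes to convolutions $a * b_{n,t}$. The sketch below checks this numerically with a central finite difference. Reading the left derivation as $\nabla_1 f(k) = f(k) - f(k-1)$ and the binomial convolution as $a_n = a * b_{n,t}$ are assumptions of this example, as are the measure `a` and the values of `n`, `t` and `h`.

```python
import math

def b(n, t, k):
    """Binomial measure b_{n,t}(k), equal to zero outside {0, ..., n}."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * t**k * (1 - t)**(n - k)

def conv(a, n, t, k):
    """(a * b_{n,t})(k) for a finitely supported measure a stored as a dict."""
    return sum(mass * b(n, t, k - j) for j, mass in a.items())

a = {0: 0.28, 1: 0.54, 2: 0.18}   # hypothetical: Bernoulli(0.3) + Bernoulli(0.6)
n, t, h = 4, 0.35, 1e-5

worst = 0.0
for k in range(-1, 8):
    # Central finite difference approximating the derivative in t.
    numeric = (conv(a, n, t + h, k) - conv(a, n, t - h, k)) / (2 * h)
    # Candidate closed form: -n times the left difference of a * b_{n-1,t}.
    closed_form = -n * (conv(a, n - 1, t, k) - conv(a, n - 1, t, k - 1))
    worst = max(worst, abs(numeric - closed_form))
print(worst)                      # small, limited by the finite-difference error
```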
Remark: It is easy to link the measures $a_n$, $a_{n-1}$ and $a_{n-2}$ by the formulas
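Since $b_{n,t} = b_{n-1,t} * b_{1,t}$, one natural way to link these measures is $a_n(k) = (1-t)\,a_{n-1}(k) + t\,a_{n-1}(k-1)$ and $a_n(k) = (1-t)^2 a_{n-2}(k) + 2t(1-t)\,a_{n-2}(k-1) + t^2 a_{n-2}(k-2)$. These two relations are a plausible reading stated as an assumption here, together with $a_n = a * b_{n,t}$; the sketch below checks them numerically on a hypothetical measure `a`.

```python
import math

def binomial(n, t):
    """Binomial measure b_{n,t} as a list indexed by k = 0..n."""
    return [math.comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Convolution of two finitely supported measures given as lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def at(measure, k):
    """Value at k of a measure stored as a list supported on 0..len-1."""
    return measure[k] if 0 <= k < len(measure) else 0.0

a = [0.28, 0.54, 0.18]            # hypothetical: Bernoulli(0.3) + Bernoulli(0.6)
n, t = 5, 0.4
a_n, a_n1, a_n2 = (convolve(a, binomial(m, t)) for m in (n, n - 1, n - 2))

# One-step relation between a_n and a_{n-1}.
err1 = max(
    abs(at(a_n, k) - ((1 - t) * at(a_n1, k) + t * at(a_n1, k - 1)))
    for k in range(len(a_n))
)
# Two-step relation between a_n and a_{n-2}.
err2 = max(
    abs(at(a_n, k) - ((1 - t) ** 2 * at(a_n2, k)
                      + 2 * t * (1 - t) * at(a_n2, k - 1)
                      + t ** 2 * at(a_n2, k - 2)))
    for k in range(len(a_n))
)
print(err1, err2)                 # both vanish up to rounding
```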

The modulus of log-concavity
A measure $(a(k))_{k\in\mathbb{Z}}$ is said to be log-concave if it satisfies, for every $k \in \mathbb{Z}$, $a(k)^2 \geq a(k-1)a(k+1)$. A well-known property states that any s.o.i.B., that is, any law of a sum of independent Bernoulli variables (and thus any binomial convolution of a s.o.i.B.), is a log-concave measure (see for instance the introduction of [3]). This suggests introducing the modulus of log-concavity:
Definition 3.3. For any $n \geq 2$ and any binomial convolution $(a_n(k))_{k\in\mathbb{Z}}$, the modulus of log-concavity is the function $(v_n(k))_{k\in\mathbb{Z}}$ defined by
Remark: The choice of defining $v_n(k)$ in terms of the $(a_{n-2}(k))$ instead of the $(a_n(k))$ will be justified by the simpler forms the next equations will take.
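The log-concavity condition $a(k)^2 \geq a(k-1)a(k+1)$ is easy to test on examples. The sketch below builds the law of a sum of independent Bernoulli variables with hypothetical parameters and checks the inequality at every point; the helper names and the small tolerance `tol` are choices made for this example.

```python
def soib(ps):
    """Law of a sum of independent Bernoulli variables with parameters ps."""
    law = [1.0]
    for p in ps:
        nxt = [0.0] * (len(law) + 1)
        for k, mass in enumerate(law):
            nxt[k] += (1 - p) * mass      # the new variable equals 0
            nxt[k + 1] += p * mass        # the new variable equals 1
        law = nxt
    return law

def is_log_concave(a, tol=1e-15):
    """Check a(k)^2 >= a(k-1) a(k+1) for a list supported on 0..len-1."""
    padded = [0.0] + list(a) + [0.0]      # zero masses outside the support
    return all(
        padded[k] ** 2 + tol >= padded[k - 1] * padded[k + 1]
        for k in range(1, len(padded) - 1)
    )

print(is_log_concave(soib([0.2, 0.5, 0.9])))   # a sum of independent Bernoulli variables
print(is_log_concave([0.5, 0.0, 0.5]))         # a measure with a gap in its support
```

The second measure fails the test because its middle mass is zero while both neighbours are positive.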
The main interest in introducing the modulus of log-concavity is the fact that, when $(a(k))_{k\in\mathbb{Z}}$ is a s.o.i.B., the log-concavity of the binomial convolution $a_{n-2}$ implies that $v_n(k)$ is always non-negative. A closer look at this fact allows us to be more precise.
The formula
This intuition is made rigorous by the following:
Proposition 3.4. For any binomial convolution $(a_n(k))_{k\in\mathbb{Z}}$ with $n \geq 2$, we have:
Proof: The idea of the proof consists in writing $(\nabla_1 a_{n-1}(k))^2 - a_n(k)\nabla_2 a_{n-2}(k)$ only in terms of the $(a_{n-2}(k))_{k\in\mathbb{Z}}$, using formulas (3.1) and (3.2). The equality is reached after a few lines of calculations.
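The quantity $(\nabla_1 a_{n-1}(k))^2 - a_n(k)\nabla_2 a_{n-2}(k)$ appearing in this proof can be evaluated numerically. The sketch below assumes $\nabla_1 f(k) = f(k) - f(k-1)$, $\nabla_2 f(k) = f(k) - 2f(k-1) + f(k-2)$ and $a_n = a * b_{n,t}$, and compares the quantity with $a_{n-2}(k-1)^2 - a_{n-2}(k)\,a_{n-2}(k-2)$. This last expression is only a candidate for the modulus of log-concavity, introduced here as an assumption and not as a quotation of Definition 3.3.

```python
import math

def binomial(n, t):
    """Binomial measure b_{n,t} as a list indexed by k = 0..n."""
    return [math.comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Convolution of two finitely supported measures given as lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def at(measure, k):
    """Value at k of a measure stored as a list supported on 0..len-1."""
    return measure[k] if 0 <= k < len(measure) else 0.0

a = [0.28, 0.54, 0.18]            # hypothetical: Bernoulli(0.3) + Bernoulli(0.6)
n, t = 4, 0.35
a_n, a_n1, a_n2 = (convolve(a, binomial(m, t)) for m in (n, n - 1, n - 2))

def quantity(k):
    d1 = at(a_n1, k) - at(a_n1, k - 1)                        # assumed nabla_1
    d2 = at(a_n2, k) - 2 * at(a_n2, k - 1) + at(a_n2, k - 2)  # assumed nabla_2
    return d1 ** 2 - at(a_n, k) * d2

def candidate_modulus(k):
    """Assumed form: a_{n-2}(k-1)^2 - a_{n-2}(k) a_{n-2}(k-2)."""
    return at(a_n2, k - 1) ** 2 - at(a_n2, k) * at(a_n2, k - 2)

ks = range(-1, len(a_n) + 1)
gap = max(abs(quantity(k) - candidate_modulus(k)) for k in ks)
smallest = min(candidate_modulus(k) for k in ks)
print(gap, smallest)              # gap vanishes up to rounding; smallest is non-negative
```

In this example the candidate is non-negative at every point, in line with the log-concavity of $a_{n-2}$.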
The same method can be used to find other formulas, which are more difficult to interpret but will be useful in the proof of Theorem 1.1:
Proposition 3.5. For any binomial convolution $a_n(t)$,

Proof of Theorem 1.1
The proof of Theorem 1.1 is inspired by the heuristic proof given at the beginning of the second section. This proof was based on two ingredients: the transport equation (2.1) and the use of integrations by parts, which lead to the identity
As in the heuristic proof, Theorem 1.1 will be proven by a tricky use of integrations by parts, which in the discrete case take the form of sums of telescopic series. Equation (4.1) suggests the use of the two following telescopic sums (where by convention $0 \cdot \log(0) = 0$):
ECP 17 (2012), paper 4.
The next proposition shows that a tricky use of these two sums gives a very useful inequality:
Proposition 4.1. For any $k \in \mathbb{Z}$ such that $a_n(k) \neq 0$ we have the inequality:
Proof: We begin the proof by writing:
The next step of the proof consists in studying (and bounding from below) each of the terms (4.4), (4.5) and (4.6) of the right-hand side of the above equation separately. In each case, the method consists in using Proposition 3.5 to make the modulus of log-concavity appear, and then using some elementary inequalities involving the function $U$.
For the first term, we write:
We used here the non-negativity of $v_n(k)$ and the elementary inequality
The same method leads to similar bounds for the term (4.6):
The term (4.5) is bounded by using the inequality
Combining these three inequalities gives
$\nabla_2 a_{n-2}(k) \log(a_n(k))$
The rest of the proof of Theorem 1.1 is now straightforward.

Proof of Theorem 1.1:
Let $(p_1,\dots,p_r) \in [0,1]^r$ be such that $a$ is the s.o.i.B. of parameters $p_1,\dots,p_r$. We can suppose that each $p_i$ is different from 0 and 1. In this case, for every $t \in\, ]0,1[$, the support of $a_{n,t}$ is exactly $\{0,\dots,N\}$, where $N := n + r$.
Moreover, the telescopic series (4.2) and (4.3) can be rewritten (still with the convention that $0 \cdot \log(0) = 0$)
The inequality comes from the fact that $-U''(a_n(k)) = \frac{1}{a_n(k)} \geq 0$. Theorem 1.1 will thus be proven if the last sum is non-negative.

Remark: We recover, in a somewhat twisted way, the modified transport equation (see Equation (2.2)) in the first point of Proposition 3.2 by noticing that