(2005): A matrix representation of the Bercovici-Pata bijection

Let $\mu$ be an infinitely divisible law on the real line and $\Lambda(\mu)$ its freely infinitely divisible image under the Bercovici-Pata bijection. The purpose of this article is to produce a new kind of random matrices whose distribution is $\mu$ in dimension 1, and whose empirical spectral law converges to $\Lambda(\mu)$ as the dimension tends to infinity. This constitutes a generalisation of Wigner's result for the Gaussian Unitary Ensemble.


I Introduction
Wigner and Arnold have established (cf [2], [25]) that the empirical spectral law of matrices of the Gaussian Unitary Ensemble converges almost surely to the semi-circular law. To be more precise, let us consider $W_n$, an $n \times n$ Hermitian Gaussian matrix such that $\mathbb{E}[\exp(i\,\mathrm{tr}(AW_n))] = \exp(-\frac{1}{n}\mathrm{tr}(A^2))$ for every Hermitian matrix $A$, where $\mathrm{tr}$ denotes the ordinary trace on matrices; let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $W_n$. Then the empirical spectral law $\frac{1}{n}\sum_{i=1}^n \delta_{\lambda_i}$ converges almost surely to $SC(0,1)$, where $SC(0,1)$ is the standard centered semi-circular distribution. This distribution corresponds to the Gaussian law in the framework of free probability.
For instance, it arises as the asymptotic law in the free central limit theorem. Moreover, Wigner's result was reinterpreted by Voiculescu in the early nineties, using the concept of asymptotic freeness (cf [24]).
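Wigner's convergence is easy to observe numerically. The following sketch (an illustration of ours, not the paper's construction: it uses the standard normalisation $\mathbb{E}|W_{ij}|^2 = 1/n$, for which the limiting semi-circle is supported on $[-2,2]$ and has even moments given by the Catalan numbers; all function names are ours) samples one such matrix and checks the first even moments of its spectrum:

```python
import numpy as np

rng = np.random.default_rng(0)

def gue(n, rng):
    # Hermitian Gaussian matrix with E|W_ij|^2 = 1/n: its empirical
    # spectral law converges to the semicircle distribution on [-2, 2].
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / (2 * np.sqrt(n))

n = 400
eigs = np.linalg.eigvalsh(gue(n, rng))
m2 = float(np.mean(eigs ** 2))   # second moment of SC on [-2,2] is 1
m4 = float(np.mean(eigs ** 4))   # fourth moment is 2 (Catalan number C_2)
```

At $n = 400$ the moments already sit within a few percent of their limits, and the whole spectrum lies close to $[-2, 2]$.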
A few years ago, this correspondence between Gaussian and semi-circular laws was extended to infinitely divisible laws on $\mathbb{R}$ by Bercovici and Pata (cf [7]), using the Lévy-Hinçin formulas. The so-called Bercovici-Pata bijection maps any classically infinitely divisible law $\mu$ to the probability measure $\Lambda(\mu)$, characterized by Barndorff-Nielsen and Thorbjørnsen in [4] by: $\forall \zeta < 0$, $i\zeta\,R_{\Lambda(\mu)}(i\zeta) = \mathcal{C}^*(\mu)(\zeta)$. In this formula, $\mathcal{C}^*(\mu)(\zeta) = \log F_\mu(\zeta)$ denotes the classical cumulant transform of $\mu$, where $F_\mu$ is the Fourier transform of $\mu$, and $R_{\Lambda(\mu)}$ denotes Voiculescu's R-transform of $\Lambda(\mu)$, defined by $R_{\Lambda(\mu)}(z) = G_{\Lambda(\mu)}^{-1}(z) - \frac{1}{z}$, where $G_{\Lambda(\mu)}$ is the Cauchy transform of $\Lambda(\mu)$. It turns out that $\Lambda(\mu)$ is freely infinitely divisible: for each $k$ there exists a probability measure $\nu_k$ such that $\Lambda(\mu) = \nu_k^{\boxplus k}$, where $\boxplus$ denotes the free convolution. The Bercovici-Pata bijection has many interesting features, among them the fact that $\Lambda(\mu)$ is freely stable if $\mu$ is stable, and $\Lambda(\mu) = SC(0,1)$ if $\mu = N(0,1)$. The reader who is not very familiar with free convolution or the Bercovici-Pata bijection will find in [3] a most informative exposition of the subject.
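For measures with moments of all orders, the bijection identifies the classical cumulants of $\mu$ with the free cumulants of $\Lambda(\mu)$, so the moments of $\Lambda(\mu)$ can be computed from the cumulants of $\mu$. The sketch below (helper names are ours) uses the standard free moment-cumulant recursion $m_n = \sum_{s=1}^n c_s \sum_{i_1+\cdots+i_s = n-s} m_{i_1}\cdots m_{i_s}$ to check that a standard Poisson $\mu$, whose classical cumulants all equal 1, yields the Catalan moments of the free Poisson (Marçenko-Pastur) law:

```python
from math import comb, prod

def compositions(total, k):
    # weak compositions of `total` into k non-negative parts
    if k == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for tail in compositions(total - first, k - 1):
            yield (first,) + tail

def free_moments(free_cumulant, order):
    # m_n = sum_{s=1}^n c_s * sum_{i_1+...+i_s = n-s} m_{i_1} ... m_{i_s}
    m = [1]  # m_0 = 1
    for n in range(1, order + 1):
        total = 0
        for s in range(1, n + 1):
            rest = sum(prod(m[i] for i in parts)
                       for parts in compositions(n - s, s))
            total += free_cumulant(s) * rest
        m.append(total)
    return m[1:]

# Free Poisson (Marchenko-Pastur, rate 1): all free cumulants equal 1,
# so the moments are the Catalan numbers 1, 2, 5, 14, ...
mp_moments = free_moments(lambda s: 1, 6)
catalan = [comb(2 * n, n) // (n + 1) for n in range(1, 7)]
```

The same routine with only $c_2 = 1$ non-zero recovers the semicircle moments $0, 1, 0, 2, \ldots$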
The aim of this paper is to propose a new kind of matrix ensembles; between classically and freely infinitely divisible laws connected through the Bercovici-Pata bijection, these ensembles play a role similar to that of the Gaussian Unitary Ensemble between the Gaussian and semi-circular laws. More specifically, we will show that, for each integer $n$ and each infinitely divisible law $\mu$, an Hermitian random matrix $X^{(\mu)}_n$ of size $n$ can be produced such that its empirical spectral law $\hat{\mu}_n$ satisfies
$\mathbb{E}\,\hat{\mu}_1 = \mu$ and $\lim_{n\to+\infty} \hat{\mu}_n = \Lambda(\mu)$ a.s. (I.1)
This main result of our paper is stated in theorem III.2. We hope to have achieved this goal in a rather canonical way, even if $X^{(N(0,1))}_n$ is not equal to $W_n$, but to a slight modification of it. Here are two facts which may justify this opinion:
1. For infinitely divisible measures with moments of all orders, it is easy to describe the Bercovici-Pata bijection by noting that the classical cumulants of $\mu$ are equal to the free cumulants of $\Lambda(\mu)$. We will define some kind of matrix cumulants, directly inspired by Lehner's recent work ([18]), and with the property that the matrices $X^{(\mu)}_n$ have the same cumulants as $\mu$.
2. If $\mu$ is the Cauchy law, then the classical convolution $\mu * \cdot$ and the free one $\mu \boxplus \cdot$ coincide. We will prove a simple but somehow surprising result, namely that for each $n$ they also coincide with the convolution with respect to $X^{(\mu)}_n$.
Using these Lévy matrix ensembles as a link between the classical and free frameworks, it is natural to expect to derive free properties from classical ones. For instance, there is an intimate connection between the moments of a classically infinitely divisible law and those of its Lévy measure. We shall present here how this yields an analogous result in the free framework.
The rest of the paper is organised in four sections. Section 2 is devoted to the definition of our matricial laws and their elementary properties. Section 3 states and proves the main theorem: the almost sure convergence of the empirical spectral laws. The proof is based on stochastic ingredients; it first establishes a concentration result and then determines the asymptotic behaviour of the Cauchy transform of the empirical spectral laws. This approach differs from the moment method used by Benaych-Georges for studying the same matrices (cf [5]). Section 4 presents further interesting features of these matrix ensembles, concerning the cumulants and the Cauchy convolution. And section 5 explains the application to moments of the Lévy measure.
Acknowledgements. The author would like to thank Jacques Azéma, Florent Benaych-Georges and Bernard Ycart for many interesting discussions and suggestions.

Notations
• The set of Hermitian matrices of size $n$ will be denoted by $H_n$, and the subset of positive ones by $H^+_n$.
• The set of classically infinitely divisible laws on $\mathbb{R}$ (resp. on $H_n$) will be denoted by $ID_*(\mathbb{R})$ (resp. by $ID_*(H_n)$). The set of freely infinitely divisible laws on $\mathbb{R}$ will be denoted by $ID_\boxplus(\mathbb{R})$.
• The normalised trace on square matrices will be denoted by $\mathrm{tr}_n$ (with $\mathrm{tr}_n(\mathrm{Id}) = 1$).
By the Lévy-Hinçin formula, for every $\mu \in ID_*(\mathbb{R})$ there exist a unique finite measure $\sigma \equiv \sigma(\mu)$ and a unique real $\gamma$ such that the Lévy exponent of $\mu$ writes
$\varphi^{(\gamma,\sigma)}(\lambda) = i\gamma\lambda + \int_{\mathbb{R}} \left(e^{i\lambda x} - 1 - \frac{i\lambda x}{1+x^2}\right)\frac{1+x^2}{x^2}\,\sigma(dx),$
with the convention that the integrand equals $-\lambda^2/2$ at $x = 0$. Conversely, $\mu_*(\gamma, \sigma)$ denotes the classically infinitely divisible law determined by $\gamma$ and $\sigma$, and for the sake of simplicity we shall write $\varphi^{(\gamma,\sigma)}$ instead of $\varphi^{(\mu_*(\gamma,\sigma))}$. Let $V_n$ be a random vector Haar distributed on the unit sphere $S^{2n-1} \subset \mathbb{C}^n$. Let us define, for every Hermitian matrix $A$,
$\varphi_n(A) = n\,\mathbb{E}\left[\varphi^{(\gamma,\sigma)}(\langle V_n, A V_n\rangle)\right].$
Due to the Lévy-Hinçin formula, $\varphi_n$ is the Lévy exponent of a classically infinitely divisible measure on $H_n$, which will be denoted by $\Lambda_n(\mu)$. Among the elementary properties (properties II.1) of these laws, note in particular that $\Lambda_n(\mu)$ is invariant under conjugation by a unitary matrix.

II.B Explicit realisation
The following proposition describes how to construct a random matrix with distribution $\Lambda_n(\mu)$ using $\mu$-distributed random variables:
Proposition II.2. Let us consider a probability measure $\mu$ in $ID_*(\mathbb{R})$, a positive real $s$, $n$ independent random variables $\lambda_1(s), \ldots, \lambda_n(s)$ with distribution $\mu^{*s}$, and the diagonal matrix $L(s)$ with diagonal entries $(\lambda_1(s), \ldots, \lambda_n(s))$. Let $U$ be an independent random matrix of size $n$, Haar distributed on the unitary group $U(n)$, and let $M_s$ be the Hermitian matrix $UL(s)U^*$. Then $\Lambda_n(\mu)$ is the weak limit of the law of $\sum_{k=1}^p M^{(k)}_{1/p}$ when $p$ tends to infinity, where the $M^{(k)}_{1/p}$, $k = 1, \ldots, p$, are independent copies of $M_{1/p}$.
We would like to explain how this realisation suggests our main result (cf (I.1) and theorem III.2), just by letting the dimension $n$ tend to infinity before letting the number $p$ of replicas tend to infinity. Because of the law of large numbers, it is obvious that the empirical spectral law of each matrix $M^{(k)}_{1/p}$ is close to $\mu^{*1/p}$ for large $n$; since independent unitarily invariant matrices are asymptotically free, the empirical spectral law of the sum is then close to $(\mu^{*1/p})^{\boxplus p}$, which tends to $\Lambda(\mu)$ as $p$ tends to infinity: this is precisely what is expected to be the limit of the empirical spectral law associated to $\Lambda_n(\mu)$ in theorem III.2.
Proof of proposition II.2. We only need to prove the convergence of the characteristic functions. Noting that $\varphi^{(\mu)}$ is locally bounded for an infinitely divisible law $\mu$, the computation yields the announced limit.
Remark. This explicit realisation has been used by Benaych-Georges in [5] to establish convergence results and to deal with the non-Hermitian case.
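The realisation of proposition II.2 is straightforward to simulate. The sketch below (our own helper names; a finite number $p$ of replicas only approximates $\Lambda_n(\mu)$) takes $\mu = N(0,1)$, for which $\mu^{*1/p} = N(0, 1/p)$, and checks that the second spectral moment is close to 1, its exact expected value for every $p$:

```python
import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(n, rng):
    # Haar unitary via QR of a complex Ginibre matrix, with phases fixed.
    z = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def levy_matrix_gaussian(n, p, rng):
    # Approximate realisation of Lambda_n(N(0,1)) as the sum of p
    # independent replicas U L(1/p) U*, the diagonal of L(1/p) being
    # N(0, 1/p)-distributed (proposition II.2, Gaussian case).
    x = np.zeros((n, n), dtype=complex)
    for _ in range(p):
        u = haar_unitary(n, rng)
        lam = rng.normal(scale=np.sqrt(1.0 / p), size=n)
        x += (u * lam) @ u.conj().T   # U diag(lam) U*
    return x

n, p = 150, 40
eigs = np.linalg.eigvalsh(levy_matrix_gaussian(n, p, rng))
m2 = float(np.mean(eigs ** 2))   # expected value 1 for every p
```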

II.C Examples
• If $\mu = N(0,1)$, then $\varphi^{(\mu)}(\lambda) = -\lambda^2/2$ and we get:
$\varphi_n(A) = -\frac{n}{2}\,\mathbb{E}\left[\left(\sum_{i=1}^n a_i |v_i|^2\right)^2\right],$
where $a_1, \ldots, a_n$ are the eigenvalues of $A$, and $v_1, \ldots, v_n$ the entries of the random vector $V_n$ Haar distributed on the unit sphere. Now, using for instance [14] proposition 4.2.3, we know that:
$\mathbb{E}\left[|v_i|^2 |v_j|^2\right] = \frac{1 + \delta_{ij}}{n(n+1)}.$
Therefore, we obtain:
$\varphi_n(A) = -\frac{(\mathrm{tr}\,A)^2 + \mathrm{tr}(A^2)}{2(n+1)}.$
This implies that $\Lambda_n(\mu)$ is the distribution of
$\sqrt{\frac{n}{2(n+1)}}\,W_n + \frac{g}{\sqrt{n+1}}\,\mathrm{Id}_n,$
where $W_n$ is the Gaussian random matrix described in the introduction, and $g$ is a centered reduced Gaussian variable, $g$ and $W_n$ being independent.
• If $\mu$ is the standard Cauchy distribution $\frac{1}{\pi(1+x^2)}\,dx$, then $\varphi^{(\mu)}(\lambda) = -|\lambda|$ and
$\varphi_n(A) = -n\,\mathbb{E}\left[\left|\sum_{i=1}^n a_i |v_i|^2\right|\right],$
the notations being the same as in the previous example. This can be evaluated using the general method proposed by Shatashvili (cf [22]), where $a_1, \ldots, a_n$ are the eigenvalues of $A$. There is a remarkable consequence of this formula: if we denote by $X^{(\mu)}_n$ a random matrix of law $\Lambda_n(\mu)$, then convolution by $X^{(\mu)}_n$ reproduces the scalar Cauchy convolution kernel (see theorem IV.2 below). Such a property usually appears with a normalisation coefficient; but here there is none, as if $\mu$ were a Dirac mass.
• If $\mu$ is the standard Poisson distribution, then $\varphi^{(\mu)}(\lambda) = e^{i\lambda} - 1$ and
$\varphi_n(A) = n\,\mathbb{E}\left[e^{i\langle V_n, A V_n\rangle} - 1\right].$
Due to the Lévy-Ito decomposition, there is a simple explicit representation of $\Lambda_n(\mu)$. Let $(X(s), s \geq 0)$ be a standard Poisson process, and let $(V_n(p), p \in \mathbb{N})$ be a family of independent random vectors Haar distributed on the unit sphere, $(X(s), s \geq 0)$ and $(V_n(p), p \in \mathbb{N})$ being independent. Then $\Lambda_n(\mu)$ is the law of
$\sum_{p=1}^{X(n)} V_n(p) V_n(p)^*.$
In [19], Marçenko and Pastur studied a very similar set of random matrices, precisely of the form $\sum_{p=1}^{q} V_n(p) V_n(p)^*$ with $q/n$ converging when $n$ tends to infinity. It turns out that both families of random matrices have the same asymptotic empirical spectral distribution, the Marçenko-Pastur law or free Poisson law.
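This compound-Poisson representation is easy to sample. The sketch below (our own names; it assumes the representation $\sum_{p=1}^{X(n)} V_n(p)V_n(p)^*$ recalled above) builds the matrix from $N \sim \mathrm{Poisson}(n)$ rank-one projections; since each jump $vv^*$ has unit trace, the mean eigenvalue equals $N/n$ exactly, and the spectrum is non-negative:

```python
import numpy as np

rng = np.random.default_rng(2)

def unit_vector(n, rng):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

n = 200
N = int(rng.poisson(n))            # number of jumps of X(s) up to time n
x = np.zeros((n, n), dtype=complex)
for _ in range(N):
    v = unit_vector(n, rng)
    x += np.outer(v, v.conj())     # rank-one jump V V*, trace 1
eigs = np.linalg.eigvalsh(x)       # ESD close to Marcenko-Pastur on [0, 4]
```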

III.A The Bercovici-Pata bijection
Let us consider a real $\gamma$ and a finite measure $\sigma$ on $\mathbb{R}$, and let $\mu \equiv \mu_*(\gamma, \sigma)$ be the associated classically infinitely divisible law with Lévy exponent $\varphi^{(\gamma,\sigma)}$. Then its image through the Bercovici-Pata bijection is the probability measure $\Lambda(\mu)$ on the real line such that (cf [7])
$R_{\Lambda(\mu)}(z) = \gamma + \int_{\mathbb{R}} \frac{z + x}{1 - xz}\,\sigma(dx),$
where $R$ is Voiculescu's R-transform. The mapping $\Lambda$ is a bijection between classically and freely infinitely divisible laws (cf [7]). Moreover, it has the following properties, which we shall recall for comparison with properties II.1:
1. If $\mu$ and $\nu$ are classically infinitely divisible measures, then $\Lambda(\mu * \nu) = \Lambda(\mu) \boxplus \Lambda(\nu)$.
4. If $\mu$ is Gaussian (resp. classically stable, classically self-decomposable, classically infinitely divisible) then $\Lambda(\mu)$ is semi-circular (resp. freely stable, freely self-decomposable, freely infinitely divisible).
For instance, if $\mu = N(0,1)$, then $\Lambda(\mu) = SC(0,1)$, as has already been noticed. If $\mu$ is the Cauchy distribution, then $\Lambda(\mu) = \mu$. If $\mu$ is the Poisson distribution with parameter $\theta$, then $\Lambda(\mu)$ is the Marçenko-Pastur distribution with parameter $\theta$. We can now state the main theorem of the article:
Theorem III.2. Let $\mu$ be a probability measure in $ID_*(\mathbb{R})$, and for each $n$ let $X^{(\mu)}_n$ be a random matrix defined on a fixed probability space $(\Omega, \mathcal{F}, P)$ and with distribution $\Lambda_n(\mu)$. Then its empirical spectral law $\hat{\mu}_n$ converges weakly almost surely to $\Lambda(\mu)$ as $n$ tends to infinity.
The proof of this theorem is based on semigroup tools. A Lévy process can be associated to each matricial Lévy law. Its semigroup is characterised by an infinitesimal generator with a rather simple explicit form. This will be a key ingredient.

III.B The infinitesimal generator
Let $\gamma$ be a real, let $\sigma$ be a finite measure on $\mathbb{R}$, and let $\mu$ be the probability measure in $ID_*(\mathbb{R})$ such that $\mu = \mu_*(\gamma, \sigma)$. Let $\Lambda_n(\mu)$ be the associated infinitely divisible law on $H_n$. Then there exists a Lévy process $(X^{(\mu)}_n(s), s \geq 0)$ such that $X^{(\mu)}_n(1)$ has distribution $\Lambda_n(\mu)$ (see e.g. corollary 11.6 in [21]). The corresponding semigroup will be denoted by $(P^{\Lambda_n(\mu)}_s, s \geq 0)$. The following formula for its infinitesimal generator $\mathcal{A}^{(\mu)}_n$ is a classical result of the theory (see e.g. theorem 31.5 in [21]): the generator is defined on $C^2_0(H_n)$ -- the set of twice continuously differentiable functions vanishing at infinity.
Remark. If a function $f \in \mathcal{L}(\mathbb{R}, \mathbb{R})$ is differentiable, and if we extend its domain to $H_n$ using spectral calculus, then $df(A)[I_n] = f'(A)$.

III.C A concentration result
The first step of the proof of theorem III.2 is to establish a concentration result for the empirical spectral law, which is of some interest in itself.
Theorem III.4. Let $\mu$ be a probability measure in $ID_*(\mathbb{R})$, and let $(X_n(s), s \geq 0)$ be the Lévy process associated to the distribution $\Lambda_n(\mu)$. Let us consider a Lipschitz function $f \in \mathcal{L}(\mathbb{R}, \mathbb{R})$ with finite total variation, and let us define $f^{(n)}(A) = \frac{1}{n}\,\mathrm{tr}\,f(A)$ for $A \in H_n$. Then the deviations of $f^{(n)}(X_n(\tau))$ from its mean satisfy an exponential concentration bound with rate $\delta_\varepsilon(\tau)$. Moreover, $\delta_\varepsilon(\tau)$ is non-increasing in $\tau$.
We need a preliminary lemma:
Lemma III.5. Let $f \in \mathcal{L}(\mathbb{R}, \mathbb{R})$ be a Lipschitz function with finite total variation. Then for all $A, B \in H_n$, the difference $|\mathrm{tr}\,f(A+B) - \mathrm{tr}\,f(A)|$ is controlled by the total variation of $f$ together with the rank and the norm of $B$. If we replace $\|B\|_1$ by $\mathrm{rg}(B)\,\|B\|_2$, then the second inequality reduces to the one already established in a similar context by Guionnet and Zeitouni in lemma 1.2 of [13].
Proof. We only need to prove the lemma for $B$ non-negative of rank one; the general result then follows by an immediate induction. Let $\lambda_1 \leq \cdots \leq \lambda_n$ and $\lambda'_1 \leq \cdots \leq \lambda'_n$ be the eigenvalues of $A$ and $A + B$ respectively. Following Weyl's well-known inequalities (see e.g. section III.2 of [8]), we have that $\lambda_i \leq \lambda'_i \leq \lambda_{i+1}$, and the conclusion follows.
Proof of theorem III.4. We will use semigroup tools to establish the concentration property. This method is well-known; see e.g. [16].
We can assume that $\tau = 1$ and $\mathbb{E}[f^{(n)}(X_n(1))] = 0$. Since the semigroup $(P^{\Lambda_n(\mu)}_s, s \geq 0)$ and its generator $\mathcal{A}^{(\mu)}_n$ commute, we obtain a differential equation for the function $\varphi_\lambda(s)$ introduced in the course of the proof. Now, due to lemma III.5, and since $e^u - 1 - u \leq \frac{u^2}{2}\,e^u$ for all $u \geq 0$, we can infer that $\varphi'_\lambda(s) \leq C(\lambda)\,\varphi_\lambda(s)$ for some function $C$. By integrating $\varphi'_\lambda(s)/\varphi_\lambda(s)$, the preceding inequality results in an exponential bound, from which we can deduce that, for all $\varepsilon > 0$, the concentration inequality of the theorem holds with rate $\delta_\varepsilon(1)$.
To conclude the proof, let us notice that $\delta_\varepsilon(\tau)$ is obviously non-increasing in $\tau$. □
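The rank-one case in the proof of lemma III.5 rests on Weyl's interlacing inequalities: adding a positive rank-one matrix moves each eigenvalue to the right, but no further than the next eigenvalue, so the increments of $f$ telescope against its total variation. A quick numerical check of the interlacing (a sketch with real symmetric matrices standing in for Hermitian ones; all names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)

n = 8
a = rng.normal(size=(n, n))
A = (a + a.T) / 2                  # symmetric test matrix
v = rng.normal(size=n)
B = np.outer(v, v)                 # rank-one, positive semi-definite

la = np.linalg.eigvalsh(A)         # ascending eigenvalues of A
lb = np.linalg.eigvalsh(A + B)     # ascending eigenvalues of A + B
# Weyl interlacing: la[i] <= lb[i] <= la[i+1]
```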

III.D Proof of theorem III.2
We need two preliminary lemmas:
Lemma III.6. Let $\mu$ be a probability measure in $ID_*(\mathbb{R})$, let $X_n(s)$ be a random matrix with distribution $\Lambda_n(\mu^{*s})$, let $\lambda_i(s)$, $i = 1, \ldots, n$, be its eigenvalues, and let $(v_i, i = 1, \ldots, n)$ be an independent random vector Haar distributed on the complex unit sphere. Then, for any Lipschitz function $f$ with bounded variation, the quantity $\sum_{i=1}^n |v_i|^2 f(\lambda_i(s))$ concentrates around $\frac{1}{n}\sum_{i=1}^n f(\lambda_i(s))$. This is a consequence of the concentration result of section III.C, and of the concentration property of the law of $(|v_i|^2, i = 1, \ldots, n)$.
Proof. If $\psi \in \mathcal{L}(\mathbb{C}^n, \mathbb{R})$ is a Lipschitz function with Lipschitz constant $\|\psi\|_{\mathrm{Lip}}$, then a Gaussian-type concentration inequality holds for $\psi(v_1, \ldots, v_n)$ around a median $m_\psi$ of $\psi(v_1, \ldots, v_n)$ (cf e.g. the introduction of chapter 1 in [17]). From this we can deduce a deviation inequality for some constant $C$. Now, for any $u = (u_1, \ldots, u_n)$ and $u' = (u'_1, \ldots, u'_n)$ on the complex unit sphere, $(\lambda_1(s), \ldots, \lambda_n(s))$ being fixed, the function $u \mapsto \sum_i |u_i|^2 f(\lambda_i(s))$ is Lipschitz. Notice that its mean is $f^{(n)}(X_n(s))$, where $f^{(n)}$ is defined as in theorem III.4. From this theorem, there exists $\delta_\varepsilon(1) > 0$ such that the corresponding deviation bound holds for all $s \in [0, 1]$. To conclude the proof, it suffices to combine these two inequalities; the result of the lemma can then be deduced directly.
Proof of lemma III.7. First notice that we can suppose $A$ to be a diagonal matrix. Then, due to the resolvent equation, we obtain the announced identity for $t$ small enough; the equality can then be extended to any value of $t$.
Proof of theorem III.2. Using the notations and definitions of the two previous lemmas, we are going to prove that the empirical spectral law $\hat{\mu}_n$ of $X^{(\mu)}_n$ converges almost surely to $\Lambda(\mu)$.
Step 1. For any $z = a + ib$ with $b > 0$, let us define $f_z(x) = (z - x)^{-1}$. We shall denote by $f^{(n)}_z$ the functional $\frac{1}{n}\mathrm{tr}\,f_z$ defined on $H_n$, and by $\lambda_1(s), \ldots, \lambda_n(s)$ the eigenvalues of $X^{(\mu)}_n(s)$. Due to lemma III.7, it is easy to check that the derivative in $s$ of the expected value of $f^{(n)}_z(X^{(\mu)}_n(s))$ can be expressed through two quantities $A(s)$ and $B(s)$, where $(v_1, \ldots, v_n)$ denotes a random vector Haar distributed on the unit sphere of $\mathbb{C}^n$; this is equation (III.1). Let $\psi^{(n)}(z, s)$ denote the Cauchy transform of the empirical spectral law of $X^{(\mu)}_n(s)$. Before going any further, it is time to explain the proof intuitively. The Cauchy transform $\psi(z, s) = \Lambda(\mu^{*s})(f_z)$ of $\Lambda(\mu^{*s})$ is characterised by $\psi(z, 0) = z^{-1}$ and by a first-order partial differential equation, equation (III.2) (cf [6], or lemma 3.3.9 in [14]). Let us denote by $\mu^{(\infty)}_s$ the (expected) limit law of $\frac{1}{n}\sum_{i=1}^n \delta_{\lambda_i(s)}$ as $n$ tends to infinity, and by $\psi^{(\infty)}(z, s) = \mu^{(\infty)}_s(f_z)$ its Cauchy transform. Then, following lemma III.6, $A(s)$ converges almost surely to $\psi^{(\infty)}(z, s)$ and $B(s)$ to $-\partial_z \psi^{(\infty)}(z, s)$. Therefore, if we replace $A(s)$ and $B(s)$ in equation (III.1) by their limits, we remark that $\psi^{(\infty)}$ is expected to satisfy the same equation (III.2) as $\psi$. Hence $\psi^{(\infty)}$ ought to be equal to $\psi$, and $\mu^{(\infty)}_s$ equal to $\Lambda(\mu^{*s})$.
Step 2. Following the previous heuristic hint, we would like to evaluate $\partial_s\psi(z, s) - \partial_s\psi^{(n)}(z, s)$ using equations (III.1) and (III.2). Let us consider $z_0 = a_0 + ib_0$ with $b_0 > 0$, and let us introduce $(\zeta(s), s \geq 0)$ as the solution of the characteristic differential equation (III.3) started at $\zeta(0) = z_0$. The corresponding mapping is locally Lipschitz on $\mathbb{C} \setminus \mathbb{R}$, as a composition of locally Lipschitz functions (here $\Im(z) = (z - \bar{z})/(2i)$ denotes the imaginary part). Therefore, there exists a unique maximal solution to the preceding differential equation (III.3), since we have supposed $\Im(\zeta(0)) = b_0 > 0$. Now let us remark that, following equation (III.2), $\psi$ is constant along the characteristics; hence we can control $\Im(\zeta(s))$ for any $s \geq 0$ such that $\zeta(s)$ is defined. In particular, the relevant integral vanishes when $b_0$ tends to $+\infty$, uniformly for $a_0$ in a compact set. Thus, for any $b_1 > 0$ and any compact set $F \subset \mathbb{R}$, there exists a threshold $S$ such that for any $a_0$ in $F$, any $b_0 > S$ and any $s$ in $[0, 1]$, one has $\Im(\zeta(s)) > b_1$. Moreover, we can infer that for any pair $(b_1, b_2)$ with $0 < b_2 < b_1$, there exists a compact set $K \subset \mathbb{C}$ whose interior is non-empty, such that the same lower bound holds for any $z_0 = a_0 + ib_0$ in $K$ and any $s$ in $[0, 1]$. For the rest of the proof, we shall suppose this property to be satisfied.
Step 3. We now focus on the evaluation of the difference $\psi^{(n)}(\zeta(s), s) - \psi(\zeta(s), s)$, for which we can establish upper bounds, the last inequality coming from step 2. It follows from the preceding bounds that this difference satisfies an integral inequality with constants $C, D, E$ depending neither on $n$ nor on $s$. From Gronwall's lemma, those inequalities provide the required bound.
Step 4. We now have to find a lower bound. We can infer from inequality (III.4) and lemma III.6 that $\psi^{(n)}$ remains close to $\psi$. Hence, there exists $n_0$ such that for any $n \geq n_0$ the corresponding bound holds for all $h \geq 0$. It is easy to see that the relevant quantity is uniformly bounded in $n$ and $u$ by some constant $c$, since $\Im(\zeta(u)) > b_1$. This implies the desired estimate.
Step 5. The set of non-negative measures with total mass at most 1 is compact for the topology associated to the set $C_0(\mathbb{R})$ of continuous functions vanishing at infinity. Let us consider a convergent subsequence of $(\mathbb{E}\,\hat{\mu}_n, n \geq 0)$, with limit $\nu$; then the Cauchy transform $\nu(f_{\zeta(1)})$ is equal to the Cauchy transform $\psi(\zeta(1), 1)$ of $\Lambda(\mu)$. Let us remark now that $\zeta(1)$ runs over a set whose interior is non-empty. Due to the maximum principle, this implies that $\nu(f_{z'}) = \psi(z', 1)$ for all $z' \in \mathbb{C}$ with $\Im(z') > 0$. Hence $\nu$ is equal to $\Lambda(\mu)$. This gives that there is only one accumulation point for the sequence $(\mathbb{E}\,\hat{\mu}_n, n \geq 1)$, hence this sequence is convergent w.r.t. the topology associated to $C_0(\mathbb{R})$. Since this topology coincides here with the weak topology, we can conclude that $\mathbb{E}\,\hat{\mu}_n$ converges weakly to $\Lambda(\mu)$. Using the concentration result in theorem III.4 and the Borel-Cantelli lemma, it is now easy to check that $\hat{\mu}_n(f_z)$ converges a.s. to $\Lambda(\mu)(f_z)$ for all $z$ in a fixed dense denumerable subset of the upper half-plane. Following the same arguments as before, this implies that $(\hat{\mu}_n, n \geq 1)$ converges weakly a.s. to $\Lambda(\mu)$. □
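The resolvent equation used above (in lemma III.7 and step 1) is the identity $(z-A)^{-1} - (w-A)^{-1} = (w-z)(z-A)^{-1}(w-A)^{-1}$; it is what produces the derivative terms of the Cauchy transform. A quick numerical sanity check of the identity (our own notation):

```python
import numpy as np

rng = np.random.default_rng(4)

n = 6
a = rng.normal(size=(n, n))
A = (a + a.T) / 2                  # symmetric test matrix
z, w = 1.0 + 2.0j, -0.5 + 1.0j     # two points off the real axis
I = np.eye(n)
Rz = np.linalg.inv(z * I - A)      # resolvent at z
Rw = np.linalg.inv(w * I - A)      # resolvent at w
# resolvent equation: Rz - Rw = (w - z) Rz Rw
err = float(np.max(np.abs((Rz - Rw) - (w - z) * Rz @ Rw)))
```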

IV.A Matrix cumulants
If a probability measure $\mu$ in $ID_*(\mathbb{R})$ has moments of all orders, then so has $\Lambda(\mu)$. Moreover, there is a simple characterisation of the relation between $\mu$ and $\Lambda(\mu)$, first remarked by Anshelevich in [1]. If we denote by $(c^*_p, p \geq 1)$ and $(c_p, p \geq 1)$ the classical and the free cumulants respectively, then $c_p(\Lambda(\mu)) = c^*_p(\mu)$ for every $p \geq 1$. Our aim in this section is to show that this property can be extended to $\Lambda_n(\mu)$. We first need to define matrix cumulants, and in so doing we shall be inspired by a recent paper by Lehner. In 1975, Good remarked that
$c^*_p(\mu) = \frac{1}{p}\,\mathbb{E}\left[\left(\sum_{k=1}^p \omega^k X^{(k)}\right)^p\right], \qquad \omega = e^{2i\pi/p},$
where the $X^{(k)}$, $k = 1, \ldots, p$, are i.i.d. random variables with distribution $\mu$ (cf [11]). Lehner has established the corresponding formula for the free case (cf [18]):
$c_p(\mu) = \frac{1}{p}\,\mathbb{E}\left[\left(\sum_{k=1}^p \omega^k X^{(k)}\right)^p\right],$
where the $X^{(k)}$, $k = 1, \ldots, p$, are free random variables with distribution $\mu$. This suggests the following definition of a matrix cumulant of order $p$ for a distribution $\mu$ on $H_n$ which is invariant by unitary conjugation:
$c^{(n)}_p(\mu) = \frac{1}{p}\,\mathbb{E}\left[\mathrm{tr}_n\left(\left(\sum_{k=1}^p \omega^k X^{(k)}\right)^p\right)\right],$
where the $X^{(k)}$, $k = 1, \ldots, p$, are i.i.d. random matrices with distribution $\mu$.
Remark. Unlike the classical and free cases, these matrix cumulants characterise neither $\mu$ nor its empirical spectral law.
Proposition IV.1. Let $\mu$ be a probability measure in $ID_*(\mathbb{R})$ with moments of all orders. Then $c^{(n)}_p(\Lambda_n(\mu)) = c^*_p(\mu)$ for every $p \geq 1$.
Proof of proposition IV.1. The explicit realization of the law $\Lambda_n(\mu)$ (cf section II.B) could have been used with some combinatorics, but it is easier to adapt the simple proof of Good's result due to Groeneveld and van Kampen (cf [12]). Let us consider $p$ random matrices $X^{(k)}_n$, $k = 1, \ldots, p$, i.i.d. with distribution $\Lambda_n(\mu)$. We can then express the matrix cumulant through the characteristic function, where $V_n$ is a random vector uniformly distributed on the unit sphere of $\mathbb{C}^n$. Now $\sum_{k=1}^p e^{2i\pi kq/p} = p$ if $q$ is a multiple of $p$, and it vanishes otherwise. Therefore, denoting by $a_{i,j}$ the entries of the matrix $A$, the computation reduces to the classical cumulants of $\mu$, which establishes the proposition.
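Good's formula can be checked exactly on a law with finite support, by enumerating the joint values of the $p$ independent copies. The sketch below (our own names; it assumes the normalisation $c^*_p = \frac{1}{p}\mathbb{E}[(\sum_{k=1}^p \omega^k X^{(k)})^p]$ with $\omega = e^{2i\pi/p}$, as recalled above) recovers the first cumulants of a Bernoulli$(1/2)$ variable:

```python
import cmath
from itertools import product

def good_cumulant(p, support, probs):
    # Good's formula: kappa_p = (1/p) E[(sum_k w^k X^(k))^p], w = exp(2 i pi/p),
    # X^(k) i.i.d.; the expectation is computed exactly over the finite support.
    w = cmath.exp(2j * cmath.pi / p)
    total = 0.0
    for idxs in product(range(len(support)), repeat=p):
        prob, s = 1.0, 0.0
        for k, idx in enumerate(idxs, start=1):
            prob *= probs[idx]
            s += (w ** k) * support[idx]
        total += prob * s ** p
    return (total / p).real

# Bernoulli(1/2): kappa_1 = 1/2, kappa_2 = 1/4, kappa_3 = 0
support, probs = [0.0, 1.0], [0.5, 0.5]
k1 = good_cumulant(1, support, probs)
k2 = good_cumulant(2, support, probs)
k3 = good_cumulant(3, support, probs)
```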

IV.B The Cauchy matrix
The Cauchy distribution is a meeting point between classical and free probabilities. It is 1-stable in both cases, and it has the same convolution kernel in both. Let us develop the latter point. Let $C$ be a real even Cauchy variable with parameter $\varepsilon$, and let $U$ be another random variable. Then for each bounded Borelian function $f$, we have $\mathbb{E}[f(U + C)\,|\,U] = P_\varepsilon f(U)$ whether $C$ and $U$ are independent or free, if $\mathbb{E}[\cdot\,|\,U]$ denotes the orthogonal projector onto $L^2(U)$ and $P_\varepsilon$ the convolution kernel such that
$P_\varepsilon f(x) = \frac{1}{\pi}\int_{\mathbb{R}} f(y)\,\frac{\varepsilon}{\varepsilon^2 + (x - y)^2}\,dy.$
Due to this property, the Cauchy distribution is very useful for regularising in free probability, since ordinary free convolution kernels are much more complicated (cf [9]). The question arose whether there was a family of "Cauchy" matrices with the same property, providing a link between the classical case (in dimension one) and the free case (in infinite dimension). This is the original motivation for our introduction of the new matrix ensembles $\Lambda_n(\mu)$.
Theorem IV.2. Let $\mu_\varepsilon$ be the even Cauchy distribution with parameter $\varepsilon$, and let $C_n$ be a random Hermitian matrix of law $\Lambda_n(\mu_\varepsilon)$. Then, for each Hermitian matrix $A$ and each real bounded Borelian function $f$, we have
$\mathbb{E}\left[f(A + C_n)\right] = P_\varepsilon f(A).$
Lemma IV.3. Let $M = (M_{i,j})_{i,j=1}^n$ be a square matrix of size $n$ such that $M = A + iB$ for some $A \in H_n$ and $B \in H^+_n$. Then $\Im\left(\det(M)/\det(M^{(1,1)})\right) > 0$, where $M^{(1,1)}$ denotes $(M_{i,j})_{i,j=2}^n$.
Proof. The eigenvalues of $M$ have a positive imaginary part. Hence $M$ is invertible, and the eigenvalues of $M^{-1}$ have a negative imaginary part. Therefore $(M^{-1})_{1,1} = \det(M^{(1,1)})/\det(M)$ has a negative imaginary part, which proves the lemma. □
Lemma IV.4. Let $\lambda_1, \ldots, \lambda_n$ be $n$ real independent even Cauchy random variables with parameter 1, let $\Lambda$ be the diagonal matrix with $\lambda_1, \ldots, \lambda_n$ as diagonal entries, and let $U$ be an independent random matrix Haar distributed on $U(n)$.
for all $A \in H_n$, $B \in H^+_n$, and $c \in \mathbb{R}$.
Proof. This is an elementary consequence of the following result: if $\Im(z) > 0$, then the Cauchy expectation of $(z - \lambda_1)^{-1}$ is obtained by evaluating the integrand at a shifted argument. More generally, the same holds if $f(X)$ is a rational function with numerator's degree at most the denominator's degree, and whose singularities all lie in the lower half-plane. For symmetry reasons, we can assume $c > 0$. Define $M = U^*AU + iU^*BU - c\Lambda^{\vee 1,1}$, where $\Lambda^{\vee 1,1}$ is the diagonal matrix $\Lambda$ whose upper left entry has been cancelled. Then the entries of $(A + iB - cU\Lambda U^*)^{-1}$ are rational functions in $\lambda_1$, with $\det(M) - c\lambda_1 \det(M^{(1,1)})$ as denominator, and with pole $\det(M)/(c\det(M^{(1,1)}))$, whose imaginary part is positive, following lemma IV.3. We can then infer that the integration over $\lambda_1$ amounts to a shift of the argument. Repeating this argument for $\lambda_2, \ldots, \lambda_n$, we can deduce the lemma. □
Using the same tricks, we can establish the following generalisation of the previous lemma:
Lemma IV.5. Let $\Lambda_1, \ldots, \Lambda_m$ be independent replicas of the preceding matrix $\Lambda$, let $U_1, \ldots, U_m$ be independent random matrices Haar distributed on $U(n)$, let $A_1, \ldots, A_q$ be Hermitian matrices, let $B_1, \ldots, B_q$ be positive Hermitian matrices, and let $\alpha_{j,k}$, $j = 1, \ldots, m$, $k = 1, \ldots, q$, be reals. Then the analogous identity holds.
Proof of theorem IV.2. Due to linearity and density arguments, we only need to check the desired property for the functions $f_z(x) = \frac{1}{z - x}$. We can suppose $\Im(z) > 0$. Now, due to lemma IV.4, with its notations, we have the corresponding identity for every Hermitian matrix $A$. Therefore, since $C_n$ is the limit in law of $\varepsilon\,\frac{1}{m}\sum_{j=1}^m U_j\Lambda_j U_j^*$ when $m \to +\infty$ (cf section II.B), we can deduce the theorem.

V Application: moments of the Lévy laws
Using matrix ensembles, it is tempting to try to transport classical properties to the free framework. We shall present here a simple example, dealing with the relation between the generalised moments of a probability measure $\mu$ in $ID_*(\mathbb{R})$ and those of its associated Lévy measure. Let us first recall Kruglov's standard result:
Definition. A function $g: \mathbb{R}_+ \to \mathbb{R}_+$ is said to be sub-multiplicative if there exists a non-negative real $K$ such that
$\forall x, y \in \mathbb{R}, \quad g(|x + y|) \leq K\,g(|x|)\,g(|y|).$
Theorem V.1 (cf [15] or section 25 of [21]). Let $\mu$ be a probability measure in $ID_*(\mathbb{R})$, and let $g$ be a sub-multiplicative locally bounded function. Then $\mu(g(|\cdot|))$ is finite if and only if the Lévy measure of $\mu$ integrates $g(|\cdot|)$ outside a neighbourhood of the origin. Our aim is to adapt the proof given in [21] and, using the Lévy matrices, to establish a similar, although partial, result for the freely infinitely divisible laws. But we shall first introduce a new definition. Remark that a function $g$ is sub-multiplicative if and only if there exists $K$ in $\mathbb{R}_+$ such that for any probability measures $\mu$ and $\nu$ we have
$(\mu * \nu)(g(|\cdot|)) \leq K\,\mu(g(|\cdot|))\,\nu(g(|\cdot|)).$
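One direction of this equivalence is just integration: integrating the pointwise inequality $g(|x+y|) \leq K\,g(|x|)\,g(|y|)$ against $\mu \otimes \nu$ gives the convolution bound. A numerical sketch with discrete measures and $g(x) = 1 + x^2$, which is sub-multiplicative with $K = 2$ (the constant $K = 2$ and all names are ours):

```python
import numpy as np
from itertools import product

def g(x):
    # even, so g(|x|) = g(x); sub-multiplicative: g(x+y) <= 2 g(x) g(y)
    return 1.0 + x * x

rng = np.random.default_rng(5)
atoms_mu, w_mu = rng.normal(size=4), np.full(4, 0.25)      # discrete mu
atoms_nu, w_nu = rng.normal(size=3), np.full(3, 1.0 / 3.0)  # discrete nu

# integral of g against the classical convolution mu * nu ...
conv = sum(p * q * g(x + y)
           for (x, p), (y, q) in product(zip(atoms_mu, w_mu),
                                         zip(atoms_nu, w_nu)))
# ... is bounded by K * mu(g) * nu(g) with K = 2
bound = 2.0 * sum(p * g(x) for x, p in zip(atoms_mu, w_mu)) \
            * sum(q * g(y) for y, q in zip(atoms_nu, w_nu))
```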

This may suggest the following definition:
Definition. A function $g: \mathbb{R}_+ \to \mathbb{R}_+$ is said to be freely sub-multiplicative if there exists a non-negative real $K$ such that for any probability measures $\mu$ and $\nu$ we have
$(\mu \boxplus \nu)(g(|\cdot|)) \leq K\,\mu(g(|\cdot|))\,\nu(g(|\cdot|)).$
Of course, if $g$ is freely sub-multiplicative, then it is sub-multiplicative (take $\mu = \delta_x$, $\nu = \delta_y$).
Proposition V.2. The functions $1 + x^\beta$ with $\beta > 0$, $\exp x$ and $\ln(e + x)$ are freely sub-multiplicative. From which we can conclude that the function $\ln(e + x)$ is freely sub-multiplicative.

□
Remark. Despite these examples, we believe that free sub-multiplicativity is strictly stronger than (ordinary) sub-multiplicativity.
We shall now introduce the main theorem of this section.
Remark. Hiai and Petz have already established that $\nu$ has compact support if and only if $\sigma(\nu)$ does (cf theorem 3.3.6 in [14]).
Lemma V.7. Let $\nu$ be a probability measure in $ID_\boxplus(\mathbb{R})$ such that the support of $\sigma(\nu)$ is bounded. Then, for all $c > 0$, $\nu(\exp(c|\cdot|))$ is finite.

□
Lemma III.7. Let us consider a complex $\zeta \in \mathbb{C} \setminus \mathbb{R}$, the function $f_\zeta(x) = (\zeta - x)^{-1}$, an Hermitian matrix $A$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, a unit vector $V = (v_i, i = 1, \ldots, n) \in \mathbb{C}^n$, and a real $t$.