Fluctuations of the extreme eigenvalues of finite rank deformations of random matrices

Consider a deterministic self-adjoint matrix X_n with spectral measure converging to a compactly supported probability measure, the largest and smallest eigenvalues converging to the edges of the limiting measure. We perturb this matrix by adding a random finite rank matrix with delocalized eigenvectors and study the extreme eigenvalues of the deformed model. We give necessary conditions on the deterministic matrix X_n so that the eigenvalues converging out of the bulk exhibit Gaussian fluctuations, whereas the eigenvalues sticking to the edges are very close to the eigenvalues of the non-perturbed model and fluctuate in the same scale. We generalize these results to the case when X_n is random and get similar behavior when we deform some classical models such as Wigner or Wishart matrices with rather general entries or the so-called matrix models.


Introduction
Most of the spectrum of a large matrix is not much altered if one adds a finite rank perturbation to the matrix, simply because of Weyl's interlacing inequalities for the eigenvalues. But the extreme eigenvalues, depending on the strength of the perturbation, can either stick to the extreme eigenvalues of the non-perturbed matrix or deviate to some larger values. This phenomenon was made precise in [9], where a sharp phase transition, known as the BBP transition [34,27,38,29], was exhibited for finite rank perturbations of a complex Gaussian Wishart matrix. In this case, it was shown that if the strength of the perturbation is above a threshold, the largest eigenvalue of the perturbed matrix deviates away from the bulk and then has Gaussian fluctuations; otherwise it sticks to the bulk and fluctuates according to the Tracy-Widom law. The fluctuations of the extreme eigenvalues which deviate from the bulk were also studied when the non-perturbed matrix is a Wishart (or Wigner) matrix with non-Gaussian entries; they were shown to be Gaussian if the perturbation is chosen randomly with i.i.d. entries in [7], or with completely delocalised eigenvectors [18,19], whereas in [12], a non-Gaussian behaviour was exhibited when the perturbation has localised eigenvectors. The influence of the localisation of the eigenvectors of the perturbation was studied more precisely in [13].
In this paper, we also focus on the behaviour of the extreme eigenvalues of a finite rank perturbation of a large matrix, this time in the framework where the large matrix is deterministic whereas the perturbation has delocalised random eigenvectors. We show that the eigenvalues which deviate away from the bulk have Gaussian fluctuations, whereas those which stick to the bulk are extremely close to the extreme eigenvalues of the non-perturbed matrix. In a one-dimensional perturbation situation, we can also study the fluctuations of the next eigenvalues, for instance showing that if the first eigenvalue deviates from the bulk, the second eigenvalue will stick to the first eigenvalue of the non-perturbed matrix, whereas if the first eigenvalue sticks to the bulk, the second eigenvalue will be very close to the second eigenvalue of the non-perturbed matrix. Hence, for a one-dimensional perturbation, the eigenvalues which stick to the bulk will fluctuate as the eigenvalues of the non-perturbed matrix. We can also extend these results beyond the case when the non-perturbed matrix is deterministic. In particular, if the non-perturbed matrix is a Wishart (or Wigner) matrix with rather general entries, or a matrix model, we can use the universality of the fluctuations of the extreme eigenvalues of these random matrices to show that the pth extreme eigenvalue which sticks to the bulk fluctuates according to the pth dimensional Tracy-Widom law. This proves the universality of the BBP transition at the fluctuation level, provided the perturbation is delocalised and random. The reader should notice however that we do not deal with the asymptotics of eigenvalues corresponding to critical deformations. This probably requires a case-by-case analysis and may depend on the model under consideration.
Let us now describe more precisely the models we will be dealing with. We consider a deterministic self-adjoint matrix $X_n$ with eigenvalues $\lambda_1^n \le \cdots \le \lambda_n^n$ satisfying the following hypothesis.
Hypothesis 1.1. The spectral measure $\mu_n := n^{-1}\sum_{l=1}^n \delta_{\lambda_l^n}$ of $X_n$ converges towards a deterministic probability measure $\mu_X$ with compact support. Moreover, the smallest and largest eigenvalues of $X_n$ converge respectively to $a$ and $b$, the lower and upper bounds of the support of $\mu_X$.
We study the eigenvalues $\widetilde\lambda_1^n \le \cdots \le \widetilde\lambda_n^n$ of a perturbation $\widetilde X_n := X_n + R_n$ obtained from $X_n$ by adding a finite rank matrix $R_n = \sum_{i=1}^r \theta_i u_i^n (u_i^n)^*$. We shall assume $r$ and the $\theta_i$'s to be deterministic and independent of $n$, but the column vectors $(u_i^n)_{1\le i\le r}$ chosen randomly as follows. Let $\nu$ be a probability measure on $\mathbb{R}$ or $\mathbb{C}$ satisfying the following assumption.
Assumption 1.2. The probability measure $\nu$ satisfies a logarithmic Sobolev inequality, is centred and has variance one. If $\nu$ is not supported on $\mathbb{R}$, we assume moreover that its real part and its imaginary part are independent and identically distributed.
We consider now a random vector $v^n = \frac{1}{\sqrt n}(x_1,\dots,x_n)^T$ with $(x_i)_{1\le i\le n}$ i.i.d. real or complex random variables with law $\nu$. Then
(1) either the $u_i^n$'s ($i = 1,\dots,r$) are independent copies of $v^n$,
(2) or $(u_i^n)_{1\le i\le r}$ are obtained by the Gram-Schmidt orthonormalisation of $r$ independent copies of the vector $v^n$.
We shall refer to the model (1) as the i.i.d. model and to the model (2) as the orthonormalised model.
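The two models can be sampled numerically. The following is a minimal NumPy sketch, assuming that $\nu$ is the standard Gaussian law (one admissible choice; any law satisfying Assumption 1.2 could be substituted). The function names `iid_model`, `orthonormalised_model` and `perturbation` are illustrative and do not come from the paper; the Gram-Schmidt step is carried out through a QR factorisation whose column phases are fixed so that the result agrees with the classical procedure.

```python
import numpy as np

def iid_model(n, r, rng, complex_entries=False):
    """Return an n x r array whose columns are independent copies of
    v^n = (x_1, ..., x_n)^T / sqrt(n), with x_i standard Gaussian (assumption)."""
    if complex_entries:
        x = (rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r))) / np.sqrt(2)
    else:
        x = rng.standard_normal((n, r))
    return x / np.sqrt(n)

def orthonormalised_model(n, r, rng, complex_entries=False):
    """Gram-Schmidt orthonormalisation of r independent copies of v^n."""
    v = iid_model(n, r, rng, complex_entries)
    q, upper = np.linalg.qr(v)          # reduced QR: q is n x r, upper is r x r
    d = np.diag(upper)
    # Rescale each column by the phase of the diagonal of the triangular factor,
    # so that the factorisation has a positive diagonal, as in Gram-Schmidt.
    return q * (d / np.abs(d))

def perturbation(u, thetas):
    """Finite rank matrix R_n = sum_i theta_i u_i u_i^*."""
    return (u * np.asarray(thetas)) @ u.conj().T
```

In the orthonormalised model the non-zero eigenvalues of `perturbation(u, thetas)` are exactly the $\theta_i$'s, whereas in the i.i.d. model this only holds approximately for large $n$.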
Before giving a rough statement of our results, let us make a few remarks. We first recall that a probability measure $\nu$ is said to satisfy a logarithmic Sobolev inequality with constant $c$ if, for any differentiable function $f$ in $L^2(\nu)$,
$$\int f^2 \log\frac{f^2}{\int f^2\,d\nu}\,d\nu \le 2c\int |f'|^2\,d\nu.$$
It is well known that a logarithmic Sobolev inequality implies sub-gaussian tails and concentration estimates. The concentration properties of the measure $\nu$ that will be useful in the proofs are detailed in Section 6.2 of the Appendix.
In the orthonormalised model, if $\nu$ is the standard real (resp. complex) Gaussian law, $(u_i^n)_{1\le i\le r}$ follows the uniform law on the set of orthogonal random vectors on the unit sphere of $\mathbb{R}^n$ (resp. $\mathbb{C}^n$) and by invariance by conjugation, the model coincides with the one studied in [10].
For a general probability measure $\nu$, the $r$ i.i.d. random vectors obtained are not necessarily linearly independent almost surely, so that the orthonormal vectors described in (2) are not always almost surely well defined. However, as the dimension goes to infinity, they are well defined with overwhelming probability when $\nu$ satisfies Assumption 1.2. This means the following: we shall say that a sequence of events $(E_n)_{n\ge 1}$ occurs with overwhelming probability if there exist two constants $C, \eta > 0$ independent of $n$ such that
$$\mathbb{P}(E_n) \ge 1 - Ce^{-n^{\eta}}.$$
Consequently, in the sequel, we shall restrict ourselves to the event when the model (2) is well defined without mentioning it explicitly.
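In this work, we study the asymptotics of the eigenvalues of $\widetilde X_n$ outside the spectrum of $X_n$.
It has already been observed in similar situations, see [9], that these eigenvalues converge to the boundary of the support of $X_n$ if the $\theta_i$'s are small enough, whereas for sufficiently large values of the $\theta_i$'s, they stay away from the bulk of $X_n$. More precisely, if we let $G_{\mu_X}$ be the Cauchy-Stieltjes transform of $\mu_X$, defined, for $z < a$ or $z > b$, by the formula
$$G_{\mu_X}(z) = \int \frac{1}{z-x}\,d\mu_X(x),$$
then the eigenvalues of $\widetilde X_n$ outside the bulk converge to the solutions of $G_{\mu_X}(z) = \theta_i^{-1}$ (if any). Indeed, if we let
$$\underline\theta := \frac{1}{\lim_{z\uparrow a} G_{\mu_X}(z)} \le 0, \qquad \overline\theta := \frac{1}{\lim_{z\downarrow b} G_{\mu_X}(z)} \ge 0$$
and
$$\rho_\theta := \begin{cases} G_{\mu_X}^{-1}(1/\theta) & \text{if } \theta\in(-\infty,\underline\theta)\cup(\overline\theta,+\infty),\\ a & \text{if } \theta\in[\underline\theta,0),\\ b & \text{if } \theta\in(0,\overline\theta],\end{cases}$$
then we have the following theorem.
Theorem 1.3. Let $r_0\in\{0,\dots,r\}$ be such that $\theta_1\le\cdots\le\theta_{r_0}<0<\theta_{r_0+1}\le\cdots\le\theta_r$. Then for all $i\in\{1,\dots,r_0\}$, $\widetilde\lambda_i^n \xrightarrow{\text{a.s.}} \rho_{\theta_i}$, and for all $i\in\{r_0+1,\dots,r\}$, $\widetilde\lambda_{n-r+i}^n \xrightarrow{\text{a.s.}} \rho_{\theta_i}$. Moreover, for all $i > r_0$ (resp. for all $i\ge r-r_0$) independent of $n$, $\widetilde\lambda_i^n \xrightarrow{\text{a.s.}} a$ (resp. $\widetilde\lambda_{n-i}^n \xrightarrow{\text{a.s.}} b$).
The uniform case was proved in [10, Theorem 2.1] and we will follow a similar strategy to prove Theorem 1.3 under our assumptions in Section 2.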
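As an illustration of the thresholds and of the limits $\rho_\theta$, the following sketch works out the case where $\mu_X$ is the semicircle law on $[-2, 2]$. This choice is an assumption made only for the example (it is the limiting law of the GUE matrix used in Figure 1); the names `g_semicircle`, `THETA_UPPER`, `THETA_LOWER` and `rho` are illustrative. It relies on the classical closed form of the Cauchy-Stieltjes transform of the semicircle law outside its support.

```python
import math

def g_semicircle(z):
    """Cauchy-Stieltjes transform of the semicircle law on [-2, 2], for real |z| >= 2."""
    if abs(z) < 2.0:
        raise ValueError("z must lie outside the open interval (-2, 2)")
    return (z - math.copysign(math.sqrt(z * z - 4.0), z)) / 2.0

# Critical values: inverses of the limits of G at the edges b = 2 and a = -2.
THETA_UPPER = 1.0 / g_semicircle(2.0)    # equals 1
THETA_LOWER = 1.0 / g_semicircle(-2.0)   # equals -1

def rho(theta):
    """Limit of the extreme eigenvalue attached to a non-zero theta."""
    if theta == 0:
        raise ValueError("theta must be non-zero")
    if theta > THETA_UPPER or theta < THETA_LOWER:
        # Solution of G(z) = 1/theta outside [-2, 2].
        return theta + 1.0 / theta
    return 2.0 if theta > 0 else -2.0
```

For instance `rho(1.5)` is about 2.167, an eigenvalue leaving the bulk, whereas `rho(0.5)` returns the edge 2.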
The main object of this paper is to study the fluctuations of the extreme eigenvalues of $\widetilde X_n$. Precise statements will be given in Theorems 3.2, 3.4, 4.3, 4.4 and 4.5. For any $x$ such that $x\le a$ or $x\ge b$, we denote by $I_x$ the set of indices $i$ such that $\rho_{\theta_i} = x$. Roughly, the results read as follows.
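Theorem 1.4. Under additional hypotheses,
(1) Let $\alpha_1<\cdots<\alpha_q$ be the different values of the $\theta_i$'s such that $\rho_{\theta_i}\notin\{a,b\}$ and denote, for each $j$, $k_j = |I_{\rho_{\alpha_j}}|$ and $q_0$ the largest index so that $\alpha_{q_0}<0$. Then, the law of the random vector
$$\Big(\sqrt n\big(\widetilde\lambda^n_{k_1+\cdots+k_{j-1}+i}-\rho_{\alpha_j}\big),\ 1\le i\le k_j\Big)_{1\le j\le q_0}\ \cup\ \Big(\sqrt n\big(\widetilde\lambda^n_{n-(k_j+\cdots+k_q)+i}-\rho_{\alpha_j}\big),\ 1\le i\le k_j\Big)_{q_0<j\le q}$$
converges to the law of the eigenvalues of $(c_{\alpha_j}M_j)_{1\le j\le q}$ with the $M_j$'s being independent matrices following the law of a $k_j\times k_j$ matrix from the GUE or the GOE, depending on whether $\nu$ is supported on the complex plane or the real line. The constant $c_{\alpha_j}$ is explicitly defined in Equation (6).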
(2) If none of the $\theta_i$'s are critical (i.e. equal to $\underline\theta$ or $\overline\theta$), with overwhelming probability, the extreme eigenvalues of $\widetilde X_n$ converging to $a$ or $b$ are at distance at most $n^{-1+\epsilon}$ of the extreme eigenvalues of $X_n$ for some $\epsilon>0$.
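(3) If $r = 1$ and $\theta_1 = \theta > 0$, we have the following more precise picture about the extreme eigenvalues:
• If $\rho_\theta > b$, then $n^{1-\epsilon}(\widetilde\lambda^n_{n-i}-\lambda^n_{n-i+1})$ vanishes in probability as $n$ goes to infinity for any fixed $i\ge 1$ and some $\epsilon>0$.
• If $\rho_\theta = b$ and $\theta\neq\overline\theta$, then $n^{1-\epsilon}(\widetilde\lambda^n_{n-i+1}-\lambda^n_{n-i+1})$ vanishes in probability as $n$ goes to infinity for any fixed $i\ge 1$ and some $\epsilon>0$.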
• For any fixed $j\ge 1$, $n^{1-\epsilon}(\widetilde\lambda_j^n-\lambda_j^n)$ vanishes in probability as $n$ goes to infinity for some $\epsilon>0$.
These different behaviours are illustrated in Figure 1 below.
The first part of this theorem will be proved in Section 3, whereas Section 4 will be devoted to the study of the eigenvalues sticking to the bulk, i.e. to the proof of the second and third parts of the theorem. Moreover, our results can be easily generalised to non-deterministic self-adjoint matrices $X_n$ that satisfy our hypotheses with probability tending to one. This will allow us to study in Section 5 the deformations of various classical models. This will include the study of the Gaussian fluctuations away from the bulk for rather general Wigner and Wishart matrices, hence providing a new proof of the first part of [18, Theorem 1.1] and of [5, Theorem 3.1] but also a new generalisation to non-white ensembles. The study of the eigenvalues that stick to the bulk requires a finer control on the eigenvalues of $X_n$ in the vicinity of the edges of the bulk, which we prove for random matrices such as Wigner and Wishart matrices with entries having a sub-exponential tail. This result complements [18, Theorem 1.1], where the fluctuations of the largest eigenvalue of a non-Gaussian Wishart matrix perturbed by a delocalised but deterministic rank one perturbation were studied. One should remark that our result depends very little on the law $\nu$ (only through its fourth moment in fact).

Figure 1. Comparison between the largest eigenvalues of a GUE matrix and those of the same matrix perturbed (panel (b): case where $\theta = 1.5$): the abscissas of the vertical segments correspond to the largest eigenvalues of $X$, a GUE matrix with size $2\cdot 10^3$ (under the dotted line), or to those of $\widetilde X_n = X + \operatorname{diag}(\theta, 0, \dots, 0)$ (above the dotted line). In the left picture, $\theta = 0.5 < \overline\theta = 1$ and, as predicted, $\widetilde\lambda_1\approx b = 2$, whereas in the right one, $\theta = 1.5 > \overline\theta$, which indeed implies that $\widetilde\lambda_1\approx\rho_\theta = \theta+\frac{1}{\theta}\approx 2.17$ and $\widetilde\lambda_2\approx b$. Moreover, in the left picture, we have, for all $i$, $\widetilde\lambda_i\approx\lambda_i$, with deviations much smaller than the deviation of $\lambda_i$ from its limit 2. In the same way, in the right picture, for all $i$, $\widetilde\lambda_{i+1}\approx\lambda_i$, again with small deviations. At last, in the right picture, we have $\widetilde\lambda_1\approx 2.167$, which gives a rescaled deviation $\approx 0.040$, a reasonable value for a standard Gaussian variable.
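The two regimes shown in Figure 1 can be reproduced with a short simulation. The sketch below is only an illustration: the helper names `gue` and `top_eigenvalues`, the seed and the matrix size are choices made here, not taken from the paper. It builds a GUE matrix normalised so that its spectrum is close to $[-2, 2]$, adds $\operatorname{diag}(\theta, 0, \dots, 0)$, and returns the largest eigenvalues of both matrices in decreasing order.

```python
import numpy as np

def gue(n, rng):
    """GUE matrix normalised so that the spectrum concentrates on [-2, 2]."""
    a = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (a + a.conj().T) / np.sqrt(2 * n)

def top_eigenvalues(n, theta, k=3, seed=0):
    """Largest k eigenvalues (decreasing) of X and of X + diag(theta, 0, ..., 0)."""
    rng = np.random.default_rng(seed)
    x = gue(n, rng)
    x_tilde = x.copy()
    x_tilde[0, 0] += theta                      # rank one perturbation
    lam = np.linalg.eigvalsh(x)[::-1][:k]
    lam_tilde = np.linalg.eigvalsh(x_tilde)[::-1][:k]
    return lam, lam_tilde

if __name__ == "__main__":
    for theta in (0.5, 1.5):
        lam, lam_tilde = top_eigenvalues(400, theta)
        print("theta =", theta)
        print("  unperturbed:", np.round(lam, 4))
        print("  perturbed:  ", np.round(lam_tilde, 4))
```

With $\theta = 0.5$ the perturbed eigenvalues should sit just above the unperturbed ones, while with $\theta = 1.5$ the largest perturbed eigenvalue should be close to $\theta + 1/\theta$ and the following ones should be shifted by one index.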
Our approach is based upon a determinant computation (see Lemma 6.1), which shows that the eigenvalues of $\widetilde X_n$ we are interested in are the solutions of the equation
$$\det\Big(\big[G^n_{s,t}(z)\big]_{s,t=1}^r - \operatorname{diag}(\theta_1^{-1},\dots,\theta_r^{-1})\Big) = 0, \tag{3}$$
with
$$G^n_{s,t}(z) := \langle u_s^n, (z-X_n)^{-1}u_t^n\rangle, \tag{4}$$
where $\langle\cdot,\cdot\rangle$ denotes the usual scalar product in $\mathbb{C}^n$. By the law of large numbers for i.i.d. vectors, by [10, Proposition 9.3] for uniformly distributed vectors or by applying Theorem 6.4 (with $A_n = (z-X_n)^{-1}$), it is easy to see that for any $z$ outside the bulk,
$$G^n_{s,t}(z) \xrightarrow[n\to\infty]{} 1_{s=t}\,G_{\mu_X}(z), \tag{5}$$
and hence it is clear that one should expect the eigenvalues of $\widetilde X_n$ outside of the bulk to converge to the solutions of $G_{\mu_X}(z) = \theta_i^{-1}$ if they exist. Studying the fluctuations of these eigenvalues amounts to analysing the behaviour of the solutions of (3) around their limit. Such an approach was already developed in several papers (see e.g. [7] or [12]). However, to our knowledge, the model we consider, with a fixed deterministic matrix $X_n$, was not yet studied and the study of the fluctuations of the eigenvalues which stick to the bulk of $X_n$ was never achieved in such generality.
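The determinant characterisation can be checked numerically on a small example. In the sketch below, the matrix sizes, the seed, the values of the $\theta_i$'s and the name `m_n` are illustrative assumptions; the unperturbed matrix is a GOE-like matrix with spectrum close to $[-2, 2]$ and the vectors follow the i.i.d. model with Gaussian entries. The $r\times r$ matrix $[G^n_{s,t}(z)] - \operatorname{diag}(\theta_s^{-1})$ is evaluated at the two outlying eigenvalues of the perturbed matrix, where its determinant should vanish up to rounding errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 200, 2
thetas = np.array([3.0, -2.5])            # both beyond the critical values

a = rng.standard_normal((n, n))
x = (a + a.T) / np.sqrt(2 * n)            # symmetric, spectrum close to [-2, 2]
u = rng.standard_normal((n, r)) / np.sqrt(n)
x_tilde = x + (u * thetas) @ u.T          # X_n + sum_i theta_i u_i u_i^T

def m_n(z):
    """r x r matrix [<u_s, (z - X_n)^{-1} u_t>] - diag(1 / theta_s)."""
    g = u.T @ np.linalg.solve(z * np.eye(n) - x, u)
    return g - np.diag(1.0 / thetas)

lam_tilde = np.linalg.eigvalsh(x_tilde)
outliers = [lam_tilde[0], lam_tilde[-1]]  # one below and one above the bulk
dets = [np.linalg.det(m_n(z)) for z in outliers]
```

Evaluating `np.linalg.det(m_n(z))` at a point `z` which is not an eigenvalue of the perturbed matrix, for example `z = 5.0`, gives a value bounded away from zero.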
For the sake of clarity, throughout the paper, we will call "hypothesis" any hypothesis we need to make on the deterministic part of the model $X_n$ and "assumption" any hypothesis we need to make on the deformation $R_n$. Moreover, because of concentration considerations that are developed in the Appendix of the paper, the proofs will be quite similar in the i.i.d. and orthonormalised models. Therefore, we will detail each proof in the i.i.d. model, which is simpler, and then check that the argument is the same in the orthonormalised model or detail the slight changes to make in the proofs.
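Notations. For the sake of clarity, we recall here the main notations of the paper:
• $\lambda_1^n\le\cdots\le\lambda_n^n$ are the eigenvalues of the deterministic matrix $X_n$,
• $\widetilde\lambda_1^n\le\cdots\le\widetilde\lambda_n^n$ are the eigenvalues of the perturbed matrix $\widetilde X_n = X_n + \sum_{i=1}^r\theta_i u_i^n (u_i^n)^*$, where $r$ and the $\theta_i$'s are independent of $n$ and deterministic and the column vectors $u_i^n$ are random and defined above,
• for $z$ out of the support of $\mu_X$, $G_{\mu_X}(z) = \int\frac{1}{z-x}\,d\mu_X(x)$,
• for any non null $\theta$, $\rho_\theta = G_{\mu_X}^{-1}(1/\theta)$ if $\theta\in(-\infty,\underline\theta)\cup(\overline\theta,+\infty)$, $\rho_\theta = a$ if $\theta\in[\underline\theta,0)$ and $\rho_\theta = b$ if $\theta\in(0,\overline\theta]$, with $\underline\theta := 1/\lim_{z\uparrow a}G_{\mu_X}(z)\le 0$ and $\overline\theta := 1/\lim_{z\downarrow b}G_{\mu_X}(z)\ge 0$,
• $p_+$ is the number of $i$'s such that $\rho_{\theta_i} > b$, $p_-$ is the number of $i$'s such that $\rho_{\theta_i} < a$ and $\alpha_1<\cdots<\alpha_q$ are the different values of the $\theta_i$'s such that $\rho_{\theta_i}\notin\{a,b\}$ (so that $q\le p_-+p_+$, with equality in the particular case where the $\theta_i$'s are pairwise distinct),
• $\gamma_1^n,\dots,\gamma_{p_-+p_+}^n$ are the rescaled differences between the eigenvalues with limit out of $[a,b]$ and their limits: $\gamma_i^n = \sqrt n(\widetilde\lambda_i^n-\rho_{\theta_i})$ if $1\le i\le p_-$ and $\gamma_i^n = \sqrt n(\widetilde\lambda^n_{n-(p_-+p_+)+i}-\rho_{\theta_{r-(p_-+p_+)+i}})$ if $p_-<i\le p_-+p_+$,
• for any $j = 1,\dots,q$, $k_j$ is the number of indices $i$ such that $\theta_i = \alpha_j$, i.e. $k_j = |I_{\rho_{\alpha_j}}|$.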

Almost sure convergence of the extreme eigenvalues
For the sake of completeness, in this section, we prove Theorem 1.3. In fact, we shall even prove the following more general result.
Theorem 2.1. Let us fix, independently of $n$, an integer $i\ge 1$ and $V$, a neighborhood of $\rho_{\theta_i}$ if $i\le r_0$ and of $a$ if $i>r_0$. Then $\widetilde\lambda_i^n\in V$ with overwhelming probability. The analogous result holds for the largest eigenvalues: for any fixed integer $i\ge 0$ and $V$, a neighborhood of $\rho_{\theta_{r-i}}$ if $i<r-r_0$ and of $b$ if $i\ge r-r_0$, $\widetilde\lambda_{n-i}^n\in V$ with overwhelming probability.
By Lemma 6.1, the eigenvalues of $\widetilde X_n$ which are not in the spectrum of $X_n$ are the solutions of the equation $\det(M_n(z)) = 0$, with
$$M_n(z) = \big[G^n_{s,t}(z) - 1_{s=t}\,\theta_s^{-1}\big]_{s,t=1}^r,$$
the functions $G^n_{s,t}(\cdot)$ being defined in (4). For $z$ out of the support of $\mu_X$, let us introduce the $r\times r$ matrix
$$M(z) := \operatorname{diag}\big(G_{\mu_X}(z)-\theta_1^{-1},\dots,G_{\mu_X}(z)-\theta_r^{-1}\big).$$
The key point, to prove Theorem 2.1, is the following lemma. For $A = [A_{i,j}]_{i,j=1}^r$ an $r\times r$ matrix, we set $|A|_\infty := \sup_{i,j}|A_{i,j}|$.
Lemma 2.2. Assume that Hypothesis 1.1 and Assumption 1.2 are satisfied. For any $\delta,\varepsilon>0$, with overwhelming probability,
$$\sup_{z,\ d(z,[a,b])>\delta}|M(z)-M_n(z)|_\infty\le\varepsilon.$$
In the case where the $\theta_i$'s are pairwise distinct, Theorem 2.1 follows directly from this lemma, because the $z$'s such that $\det(M(z)) = 0$ are precisely the $z$'s such that for some $i$, $G_{\mu_X}(z) = \frac{1}{\theta_i}$ and because close continuous functions on an interval have close zeros. The case where the $\theta_i$'s are not pairwise distinct can then be deduced by an approximation procedure similar to the one of Section 6.2.3 of [10].
Then since the support of µ X is contained in [a, b] and for n large enough, the eigenvalues of X n are all in [a − δ/2, b + δ/2], it suffices to prove that with overwhelming probability, and n large enough.By Proposition 6.2 with A = (z − X n ) −1 , whose operator norm is bounded by 2δ −1 , we find that for any > 0, there exists c > 0 such that It follows that there are c, η > 0 such that for all z such that |z| ≤ R, d(z, [a, b]) > δ, ) ≤ e −cn η .As a consequence, since the number of z's such that |z| ≤ R and nz have integer real and imaginary parts has order n 2 , there is a constant C such that where the supremum is taken over complex numbers z = k n + i l n , with k, l ∈ Z, such that |z| ≤ R, d(z, [a, b]) > δ.Now, note that for n large enough so that the eigenvalues of which insures that for n large enough, This concludes the proof for the i.i.d.model.
The orthonormalised model can be treated similarly, by writing U n = W n G n with √ nW n a matrix converging almost surely to the identity by Proposition 6.3.
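The concentration phenomenon behind this proof can be observed numerically in the i.i.d. model. The sketch below is an illustration under assumptions made here (Gaussian entries, a GOE-like unperturbed matrix, the helper name `resolvent_forms`): for a fixed $z$ outside the bulk, it compares the $r\times r$ matrix of quadratic forms $\langle u_s,(z-X_n)^{-1}u_t\rangle$ with the normalised trace $\frac1n\operatorname{Tr}(z-X_n)^{-1}$ times the identity.

```python
import numpy as np

def resolvent_forms(n, z, r=2, seed=0):
    """Return ([<u_s, (z - X)^{-1} u_t>]_{s,t}, (1/n) Tr (z - X)^{-1})."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    x = (a + a.T) / np.sqrt(2 * n)               # spectrum close to [-2, 2]
    u = rng.standard_normal((n, r)) / np.sqrt(n)  # i.i.d. model, Gaussian entries
    resolvent = np.linalg.inv(z * np.eye(n) - x)
    return u.T @ resolvent @ u, np.trace(resolvent) / n

if __name__ == "__main__":
    for n in (100, 400, 1600):
        g_n, g_trace = resolvent_forms(n, 3.0)
        error = np.max(np.abs(g_n - g_trace * np.eye(2)))
        print(n, round(float(g_trace), 4), round(float(error), 4))
```

The printed error should decrease roughly like $n^{-1/2}$, and the normalised trace should approach the value of the Cauchy-Stieltjes transform of the semicircle law at $z = 3$, namely $(3-\sqrt5)/2$.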

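3. Fluctuations of the eigenvalues away from the bulk
3.1. Statement of the results. Let $p_+$ be the number of $i$'s such that $\rho_{\theta_i}>b$ and $p_-$ be the number of $i$'s such that $\rho_{\theta_i}<a$. In this section, we study the fluctuations of the eigenvalues of $\widetilde X_n$ with limit out of the bulk, that is $(\widetilde\lambda_1^n,\dots,\widetilde\lambda_{p_-}^n,\widetilde\lambda_{n-p_++1}^n,\dots,\widetilde\lambda_n^n)$. We shall assume throughout this section that the spectral measure of $X_n$ converges to $\mu_X$ faster than $1/\sqrt n$; this is the content of Hypothesis 3.1.
Our theorem deals with the limiting joint distribution of the variables $\gamma_1^n,\dots,\gamma_{p_-+p_+}^n$, the rescaled differences between the eigenvalues with limit out of $[a,b]$ and their limits:
$$\gamma_i^n = \begin{cases}\sqrt n\,(\widetilde\lambda_i^n-\rho_{\theta_i}) & \text{if } 1\le i\le p_-,\\ \sqrt n\,(\widetilde\lambda^n_{n-(p_-+p_+)+i}-\rho_{\theta_{r-(p_-+p_+)+i}}) & \text{if } p_-<i\le p_-+p_+.\end{cases}$$
The limiting behaviour of the eigenvalues with limit outside the bulk will depend on the law $\nu$ through the following quantity, called the fourth cumulant of $\nu$:
$$\kappa_4(\nu) := \begin{cases}\int x^4\,d\nu(x)-3 & \text{in the real case},\\ \int |z|^4\,d\nu(z)-2 & \text{in the complex case}.\end{cases}$$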
The definitions of the $\alpha_j$'s and of the $k_j$'s have been given in Theorem 1.4 and recalled in the Notations gathered at the end of the introduction above.
Theorem 3.2. Suppose that Assumption 1.2 holds with $\kappa_4(\nu) = 0$, as well as Hypotheses 1.1 and 3.1. Then the law of
$$\big(\gamma^n_{k_1+\cdots+k_{j-1}+i},\ 1\le i\le k_j\big)_{1\le j\le q}$$
converges to the law of $(\lambda_{i,j},\ 1\le i\le k_j)_{1\le j\le q}$, with $\lambda_{i,j}$ the $i$th largest eigenvalue of $c_{\alpha_j}M_j$ with $(M_1,\dots,M_q)$ being independent matrices, $M_j$ following the GUE($k_j$) (resp. GOE($k_j$)) distribution if $\nu$ is supported on the complex plane (resp. the real line). The constant $c_\alpha$ is given by Equation (6).
When $\kappa_4(\nu)\neq 0$, we need a bit more than Hypothesis 3.1, namely Hypothesis 3.3. We then have a similar result.
Theorem 3.4. In the case when Assumption 1.2 holds with $\kappa_4(\nu)\neq 0$, under Hypotheses 1.1, 3.1 and 3.3, Theorem 3.2 stays true, replacing the matrices $c_{\alpha_j}M_j$ by matrices $c_{\alpha_j}M_j + D_j$ where the $D_j$'s are independent diagonal random matrices, independent of the $M_j$'s, and such that for all $j$, the diagonal entries of $D_j$ are independent centred real Gaussian variables, with variance $-l(\rho_{\alpha_j})\kappa_4(\nu)/G_{\mu_X}(\rho_{\alpha_j})$.
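Whether Theorem 3.2 or Theorem 3.4 applies depends on the fourth cumulant of $\nu$. The sketch below estimates it by Monte Carlo for three illustrative laws chosen here as examples (standard real Gaussian, uniform on $[-\sqrt3,\sqrt3]$, standard complex Gaussian), all centred with variance one. The function name `fourth_cumulant` and the sample size are assumptions of this example; the function presupposes that the samples are already centred and normalised.

```python
import numpy as np

def fourth_cumulant(samples, complex_case=False):
    """Empirical kappa_4 for centred samples with variance one:
    mean(|x|^4) - 3 in the real case, mean(|z|^4) - 2 in the complex case."""
    samples = np.asarray(samples)
    fourth_moment = np.mean(np.abs(samples) ** 4)
    return fourth_moment - (2.0 if complex_case else 3.0)

rng = np.random.default_rng(0)
m = 200_000
gauss = rng.standard_normal(m)
unif = rng.uniform(-np.sqrt(3), np.sqrt(3), m)
cgauss = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)

k_gauss = fourth_cumulant(gauss)                        # exact value: 0
k_unif = fourth_cumulant(unif)                          # exact value: 9/5 - 3 = -1.2
k_cgauss = fourth_cumulant(cgauss, complex_case=True)   # exact value: 0
```

Both Gaussian laws have a vanishing fourth cumulant, so they fall under Theorem 3.2, while the uniform law has $\kappa_4 = -1.2$ and would require the correction term of Theorem 3.4.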

3.2. Proof of Theorems 3.2 and 3.4. We prove hereafter Theorem 3.2 and we will indicate briefly at the end of this section the minor changes to make to get Theorem 3.4. The main ingredient will be a central limit theorem for quadratic forms, stated in Theorem 6.4 in the appendix.
For i ∈ {1, . . .q} and x ∈ R, we denote by M n (i, x) the r × r (but no longer symmetric) matrix with entries given by The first step of the proof will be to get the asymptotic behavior of M n (i, x).Lemma 3.5.Let i ∈ {1, . . .q} and x ∈ R be fixed.Under the hypotheses of Theorem 3.2, M n (i, x) converges weakly, as n goes to infinity, to the matrix M(i, x) with entries with (n s,t ) s,t=1,...,r a family of independent Gaussian variables with n s,s ∼ N (0, 2) and n s,t ∼ N (0, 1) when s = t in the real case (resp.n s,s ∼ N (0, 1) and (n s,t ), (n s,t ) ∼ N (0, 1/2) and independent in the complex case).
Proof.From ( 5), we know that for s / Let s ∈ I ρα i .We write the decomposition where The asymptotics of the first term is given by Theorem 6.4 with a variance given by lim As ρ α i is at distance of order one from the support of X n , we can expand Finally, by Hypothesis 3.1, we have Equations ( 8), ( 9), ( 10) and ( 11) prove the lemma (using the fact that the distribution of the Gaussian variables n s,s and n s,t are symmetric).
The next step is to study the behaviour of (M n (i, x)) x∈R as a process on R. We will show in particular that the dependence in the parameter x is very simple.Let (n s,t ) s,t=1,...,r be a family of Gaussian random variables as in Lemma 3.5 and define the random process M(i, •) from R to M n (C) with [M(i, x)] s,t defined as in (7) (where we emphasize that (n s,t ) s,t=1,...,r do not depend on x).Then we have Lemma 3.6.Let i ∈ {1, . . .q} be fixed.The random process (M n (i, x)) x∈R converges weakly, as n→∞, to M(i, •) in the sense of finite dimensional marginals.
Proof.This is a direct application of Remark 6.5, as it is easy to check that for any x, x ∈ R, The last point to check is a result of asymptotic independence, from which the independence of the matrices M 1 , . . ., M q will be inherited.
Lemma 3.7.For any (x 1 , . . ., x q ) ∈ R q , the random variables where the remaining term is uniformly small as x varies in any compact of R.
Then, as the set of indices I ρα 1 , . . ., I ρα q are disjoint, the submatrices involved in the main terms are independent in the i.i.d case and asymptotically independent in the orthonormalised case.
Let us now show (12).Firstly, note that by the convergence of M n s,t (i, x) obtained in the proof of the Lemma 3.5, we have for all s, t ∈ {1, . . ., r} such that s = t or s ∈ I ρα i , for all κ < 1/2, By the formula it suffices to prove that for any σ ∈ S r such that for some i 0 ∈ {1, . . ., r}\I ρα i , σ(i 0 ) = i 0 , It follows immediately from (13) since for any κ < 1/2, in the above product, all the terms with index in I ρα i are of order at most n −κ , giving a contribution n −k i κ , and i 0 is not in I ρα i and satisfies σ(i 0 ) = i 0 , yielding another term of order at most n −κ .Hence, the other terms being bounded because ρ i n (x) stays bounded away from [a, b], the above product is at most of order n −κ(k i +1) and so taking κ ∈ ( k i 2(k i +1) , 1 2 ) proves (14).Now as we have that, for i ∈ {1, . . ., q} and x ∈ R, we can deduce from the lemmata above the following Proposition 3.8.Under the hypothesis of Theorem 3.2, the random process x∈R converges weakly, as n goes to infinity to the random process in the sense of finite dimensional marginals, with the constants c α i and the joint distribution of (M 1 , . . ., M q ) as in the statement of Theorem 3.2.
From there, the proof of Theorem 3.2 is straightforward.
), be fixed.Since, by Theorem 2.1, for all ε > 0, for n large enough, f n vanishes exactly at To prove Theorem 3.4, the only substantial change to make is in the definition (7), in the case when s ∈ I ρα i , we have to put The convergence of [M n (i, x)] s,t to [M(i, x)] s,t is again obtained by applying Theorem 6.4.

4.1. Statement of the results.
To study the fluctuations of the eigenvalues which stick to the bulk, we need more precise information on the eigenvalues of $X_n$ in the vicinity of their extremes. More explicitly, we shall need the following additional hypothesis, which depends on a positive integer $p$ and a real number $\alpha\in(0,1)$. Note that this hypothesis has two versions: Hypothesis 4.1[$p, \alpha, a$] is adapted to the study of the smallest eigenvalues (it is the version detailed below) and Hypothesis 4.1[$p, \alpha, b$] is adapted to the study of the largest eigenvalues (this version is only outlined below).
and there exist η 2 > 0 and η 4 > 0, so that for n large enough and For rank one perturbations and in the i.i.d.model, we will only require the two first conditions ( 15) and ( 16) whereas for higher rank perturbations, we will need in addition (17) to control the off-diagonal terms of the determinant.
In fact, Assumption 4.2 can be weakened into: for all $i$, $\theta_i\neq\underline\theta$ (resp. $\theta_i\neq\overline\theta$) if we only study the smallest (resp. largest) eigenvalues.
The fact that the eigenvalues of the non-perturbed matrix are sufficiently spread at the edges to ensure the above hypothesis allows the eigenvalues of the perturbed matrix to be very close to them, as stated in the following theorem.
with overwhelming probability.
Moreover, in the case where the perturbation has rank one, we can locate exactly in the neighborhood of which eigenvalues of the non-perturbed matrix the eigenvalues of the perturbed matrix lie.
Note that if $p-(p_-+r)\le 0$ (resp. if $p-(p_++r)<0$), then the statement of the theorem is empty as far as $i$'s (resp. $j$'s) are concerned. The same convention is made throughout the proof.
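In the rank one case, the location of each perturbed eigenvalue between two consecutive unperturbed ones can be made concrete. The sketch below is an illustration, not the argument of the paper: working in an eigenbasis of the unperturbed matrix, with eigenvalues `lam` and coordinates `w` of the perturbing vector (both hypothetical inputs, with `lam` strictly increasing and no null entry in `w`), it finds by bisection the zeros of $z\mapsto\sum_k w_k^2/(z-\lambda_k)-1/\theta$ for $\theta>0$. Each interval between consecutive $\lambda_k$'s contains exactly one zero, and the last zero lies above the largest $\lambda_k$.

```python
import numpy as np

def rank_one_eigenvalues(lam, w, theta, iterations=200):
    """Eigenvalues of diag(lam) + theta * w w^T for theta > 0, obtained as the
    zeros of f(z) = sum_k w_k^2 / (z - lam_k) - 1/theta (one per interval)."""
    lam = np.asarray(lam, dtype=float)
    w2 = np.asarray(w, dtype=float) ** 2

    def f(z):
        return np.sum(w2 / (z - lam)) - 1.0 / theta

    # Right end points: the next eigenvalue, or lam_n + theta * |w|^2 for the last one.
    upper = np.append(lam[1:], lam[-1] + theta * w2.sum())
    roots = []
    for lo, hi in zip(lam, upper):
        # On (lo, hi), f decreases from +infinity to a non-positive value.
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:
                break
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return np.array(roots)
```

The output can be compared with `np.linalg.eigvalsh(np.diag(lam) + theta * np.outer(w, w))`; the interlacing of the two spectra is visible directly from the intervals used in the bisection.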

4.2. Proofs. Let us first prove Theorem 4.3. Let us choose $i_0\in I_a$ and study the behaviour of $\widetilde\lambda^n_{i_0}$ (the case of the largest eigenvalues can be treated similarly). We assume throughout the section that Hypotheses 1.1, 4.1[$r, \alpha, a$] and Assumptions 1.2 and 4.2 are satisfied. We also fix $\alpha'>\alpha$.
We know, by Lemma 6.1, that the eigenvalues of X n which are not eigenvalues of X n are the z's such that det(M n (z)) = 0, (18) where and for all s, t, Recall that by Weyl's interlacing inequalities (see [1,Th. A.7]) Let ζ be a fixed constant such that max 1≤i≤p − ρ θ i < ζ < a.By Theorem 2.1, we know that Lemma 4.6.With overwhelming probability, λ n i 0 > ζ.
We want to show that ( 18) is not possible on The following lemma deals with the asymptotic behaviour of the off-diagonal terms of the matrix M n (z) of (19).Lemma 4.7.For s = t and κ > 0 small enough, with overwhelming probability.
The following lemma deals with the asymptotic behaviour of the diagonal terms of the matrix M n (z) of (19).Lemma 4.8.For all s = 1, . . ., r, for all δ > 0, any δ > 0, with overwhelming probability.
Let us assume these lemmas proven for a while and complete the proof of Theorem 4.3.By these two lemmas, for z ∈ Ω n , we find by expanding the determinant that with overwhelming probability, where the O(n −κ ) is uniform on z ∈ Ω n .Indeed, in the second term of the right hand side of each diagonal term is bounded and each non diagonal term is O(n −κ ).
Since for all i, θ i = θ, (22) and Lemma 4.8 allow to assert that with overwhelming probability, for all z ∈ Ω n , det(M n (z)) = 0.It completes the proof of the theorem.
We finally prove the last two lemmas.
Proof of Lemma 4.7.Let us consider z ∈ Ω n (z might depend on n, but for notational brevity, we omit to denote it by z n ).We treat simultaneously the orthonormalised model and the i.i.d.model (in the i.i.d.model, one just takes W n = I and replaces The first step is to show that for any > 0, with overwhelming probability, max l≤n,s≤r Indeed, with O l the lth row vector of O and using the notations of Section 6.2, But g → O l , g n s is Lipschitz for the Euclidean norm with constant one.Hence, by concentration inequality due to the log-Sobolev hypothesis (see e.g.[1, section 4.4]), there exists c > 0 such that for all δ > 0, From Proposition 6.3, we know that with overwhelming probability, (G n (W n ) T ) s 2 is bounded below by √ nn − and the entries of W n are of order one.This gives therefore (23).
We now make the following decomposition (23), for any > 0, with overwhelming probability, we have, uniformly on z ∈ Ω n , We choose 0 < ≤ (α − α)/4 and now study B n (z) which can be written with P the orthogonal projection onto the linear span of the eigenvectors of X n corresponding to the eigenvalues λ n mn+1 , . . ., λ n n .By the second point in Proposition 6.2, with z ∈ Ω n , for all s = t, Moreover, by Hypothesis 4.1, for n large enough, for all z ∈ Ω n , We deduce that there is C, η > 0 such that for all z ∈ Ω n , A similar control is verified for s = t since we have, by Proposition 6.2, whereas Hypothesis 4.1 insures that the term 1 n Tr(P (z − X n ) −1 ) is bounded uniformly on Ω n .Thus, up to a change of the constants C and η, there is a constant M such that for all z ∈ Ω n , Therefore, with Proposition 6.3 and developing the vectors u n s 's as the normalised column vectors of G n (W n ) T , we conclude that, up to a change of the constants C and η, for all z ∈ Ω n , Hence, we have proved that there exists κ > 0, C and η > 0 so that for all z ∈ Ω n , P G n s,t (z) ≥ n −κ ≤ Ce −n η .We finally obtain this control uniformly on z ∈ Ω n by noticing that z→G n s,t (z) is Lipschitz on Ω n , with constant bounded by (min |z − λ i |) −2 ≤ n 2−2α .Thus, if we take a grid (z n k ) 0≤k≤cn 2 of Ω n with mesh ≤ n −2+2α −κ (there are about n 2 such z n k 's) we have sup Since there are at most cn 2 such k and n 2 possible i, j, we conclude that which completes the proof.
Proof of Lemma 4.8.We shall use the decomposition with P as above the orthogonal projection onto the linear span of the eigenvectors of X n corresponding to the eigenvalues λ n mn+1 , . . ., λ n n , and then prove that for z ∈ Ω n , Let us now give a formal proof.Again, we first prove the estimate for a fixed z ∈ Ω n , the uniform estimate on z being obtained by a grid argument as in the previous proof (a key point being that the constants C and η of the definition of overwhelming probability are independent of the choice of z ∈ Ω n ).
First, observe that (15) implies that for any sequence ε n tending to zero, Indeed, for all > 0, for n such that λ p n and a − ε n are both ≥ a − , we have, for all imply (28).
So let us consider z ∈ Ω n (z might depend on n, but for notational brevity, we omit to denote it by z n ).By the inequality |z − λ n k | > n −1+α for all 1 ≤ k ≤ m n and ( 27), we have (29) But as in the previous proof, we have with, by (24), the off diagonal terms t = v of order n −η 2 ∧η 4 /8 with overwhelming probability, whereas the diagonal terms are close to 1 n Tr(P (z − X n ) −1 ) with overwhelming probability by (25).Hence, we deduce with Proposition 6.2 that for any δ > 0, with overwhelming probability.Hence, by (28), for any δ > 0 with overwhelming probability.On the other hand W n s,t W n s,v (1 − P )g n t , (1 − P )g n v By Proposition 6.3, the denominator is of order n with overwhelming probability, whereas by Proposition 6.2, the numerator is of order m n + n √ m n (since Tr(1 − P ) = m n ) with overwhelming probability.As W n is bounded by Proposition 6.3 we conclude that with overwhelming probability.Putting together Equations ( 29), ( 30) and ( 31), we have proved that for any z ∈ Ω n , any δ > 0, with overwhelming probability, the constants C and η of the definition of overwhelming probability being independent of the choice of z ∈ Ω n We do not detail the grid argument used to get a control uniform on z because this argument is similar to what we did in the proof of the previous lemma.
Proof of Theorem 4.4.In the one dimensional case, the eigenvalues of X n which do not belong to the spectrum of X n are the zeroes of with ε n (g) = 1 or g 2 2 /n according to the model we are considering.A straightforward study of the function f n tells us that the eigenvalues of X n are distinct from those of X n as soon as X n has no multiple eigenvalue and (matrix of the eigenvectors of X n ) * × g has no null entry, which we can always assume up to modify X n and g so slightly that the fluctuations of the eigenvalues are not affected.We do not detail these arguments but the reader can refer to Lemmas 9.3, 9.4 and 11.2 of [11] for a full proof in the finite rank case.Therefore, (32) characterises all the eigenvalues of X n .Moreover, by Weyl's interlacing properties, for θ < 0,

and 4.3 thus already settle the study of λ n
1 which either goes to ρ θ or is at distance O(n −1+α ) of λ n 1 depending on the strength of θ.We consider α > α and i ∈ {2, . . ., p} and define Note first that if Λ n is empty, then the eigenvalue of X n which lies between λ n i−1 and λ n i is within n −1+α to both λ n i−1 and λ n i , so we have nothing to prove.Now, we want to prove that f n does not vanish on Λ n and that according to the sign of 1 θ − 1 θ , it vanishes on one side or the other of Λ n in ]λ n i−1 , λ n i [.This will prove (i) and (ii) of the theorem.Part (iii) can be proved in the same way, proving that with overwhelming probability, f n does not vanish in λ n n−i−1 + n −1+α , λ n n−i − n −1+α .The proof of this fact will follow the same lines as the proof of Lemma 4.8 and we recall that P was defined above as the orthogonal projection onto the linear span of the eigenvectors of X n corresponding to the eigenvalues λ n mn+1 , . . ., λ n n .Then, exactly as for (30), we can show that for all δ > 0, with overwhelming probability.Moreover, for any z ∈ Λ n , for any j = 1, . . ., m n , we have By Proposition 6.2, we deduce that for any > 0, with overwhelming probability.We choose in such a way that the latter right hand side goes to zero.Therefore, we know that uniformly on Λ n , Theorem 5.1.Let (X n ) be a sequence of random matrices independent of the u n i 's.Under Assumption 1.2,This result follows from the results with deterministic sequences of matrices X n .Indeed, to prove that a sequence converges to a limit in a metric space, it suffices to prove that any of its subsequences has a subsequence converging to .If the convergences of the hypotheses hold in probability, then from any subsequence, one can extract a subsequence for which they hold almost surely.Then up to a conditioning by the σ-algebra generated by the X n 's, the hypotheses of the various theorems hold.
The remainder of this section is devoted to showing that such results hold if $X_n$, independent of $(u^n_i)_{1\le i\le r}$, is a Wigner or a Wishart matrix, or a random matrix whose law has density proportional to $e^{-\operatorname{Tr} V}$ for a certain potential $V$. In each case, we have to check that the hypotheses hold in probability.
5.1. Wigner matrices. Let $\mu_1$ be a centred distribution on $\mathbb{R}$ (respectively on $\mathbb{C}$) and $\mu_2$ be a centred distribution on $\mathbb{R}$, both having a finite fourth moment (in the case where $\mu_1$ is not supported on the real line, we assume that the real and imaginary parts are independent). We define $\sigma^2 = \int_{z\in\mathbb{C}} |z|^2 \, d\mu_1(z)$.
Let $(x_{i,j})_{i,j\ge 1}$ be an infinite Hermitian random matrix whose entries are independent up to the condition $x_{j,i} = \overline{x_{i,j}}$, such that the $x_{i,i}$'s are distributed according to $\mu_2$ and the $x_{i,j}$'s ($i \ne j$) are distributed according to $\mu_1$. We take $X_n = \big[x_{i,j}/\sqrt{n}\big]_{i,j=1}^n$, which is said to be a Wigner matrix. For certain results, we will also need an additional hypothesis, which we present here:

Hypothesis 5.2. The probability measures $\mu_1$ and $\mu_2$ have a sub-exponential decay, that is, there exist positive constants $C, C'$ such that if $X$ is distributed according to $\mu_1$ or $\mu_2$, for all $t \ge C'$,

Moreover, $\mu_1$ and $\mu_2$ are symmetric.
The following proposition generalises some results of [36,18,12,13], which study the effect of a finite rank perturbation on a non-Gaussian Wigner matrix. In particular, it includes the study of the eigenvalues which stick to the bulk.

Proposition 5.3. Let $X_n$ be a Wigner matrix. Assume that Assumption 1.2 holds. The limits of the extreme eigenvalues of $\widetilde{X}_n$ are given by Theorem 2.1, and the fluctuations of those whose limits are out of $[-2\sigma, 2\sigma]$ are given by Theorem 3.2, where the parameters $a, b, \rho_\theta, c_\alpha$ are given by the following formulas: $b = -a = 2\sigma$, in the orthonormalized model.
Assume moreover that, for all $i$, $\theta_i \notin \{-\sigma, \sigma\}$ and Hypothesis 5.2 holds. If the perturbation has rank one, we have the following precise description of the fluctuations of the sticking eigenvalues: converges in law to the $p$th Tracy-Widom law. If the perturbation has rank more than one and Assumption 4.2 holds, the extreme eigenvalues of $\widetilde{X}_n$ are at distance less than $n^{-1+\varepsilon}$, for any $\varepsilon > 0$, from the extreme eigenvalues of $X_n$, which have Tracy-Widom fluctuations. In the i.i.d. model, Theorem 4.5 allows us to localize exactly near which eigenvalue of $X_n$ they lie.
Remark 5.4. All the Tracy-Widom laws involved in the statement of the proposition above are the ones corresponding respectively to the GOE if $\mu_1$ is supported on $\mathbb{R}$ and to the GUE if $\mu_1$ is supported on $\mathbb{C}$.
According to Theorem 5.1, it suffices to verify that the hypotheses hold in probability for $(X_n)_{n\ge 1}$. We study separately the eigenvalues which stick to the bulk and those which deviate from the bulk.

• Deviating eigenvalues.
If $X_n$ is a Wigner matrix (that is, with our terminology, with entries having a finite fourth moment), the fact that $X_n$ satisfies Hypothesis 1.1 in probability is a well known result (see for example [4, Th. 5.2]) for $\mu_X$ the semicircle law with support $[-2\sigma, 2\sigma]$. The formulas for $\rho_\theta$ and $c_\alpha$ can be checked with the well known formula [1, Sect. 2.4]:

Moreover, [5, Th. 1.1] shows that $\operatorname{Tr}(f(X_n)) - n\int f(x)\,d\sigma(x)$ converges in law to a Gaussian distribution for any function $f$ which is analytic in a neighborhood of $[-2\sigma, 2\sigma]$. For any fixed $z \notin [-2\sigma, 2\sigma]$, applied for $f(x) = (z-x)^{-1}$, this shows that $\sqrt{n}\,(G_{\mu_n}(z) - G_{\mu_X}(z))$ converges in probability to zero, so that Hypothesis 3.1 holds in probability.
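The relation between the limit $\rho_\theta$ and the Cauchy transform of the semicircle law can be checked numerically. The sketch below is only a sanity check under stated assumptions: it takes the convention $G(z) = \int d\mu(x)/(z-x)$, the standard closed form $G(z) = (z - \operatorname{sgn}(z)\sqrt{z^2 - 4\sigma^2})/(2\sigma^2)$ for real $z$ outside $[-2\sigma, 2\sigma]$, and the candidate value $\rho_\theta = \theta + \sigma^2/\theta$ for $|\theta| > \sigma$, which is the usual solution of $G(\rho) = 1/\theta$. Neither formula is displayed in the text above, and the function names are illustrative.

```python
import math

def cauchy_transform_semicircle(z: float, sigma: float = 1.0) -> float:
    """Closed form of G(z) = int dmu_sc(x) / (z - x), real z outside the support (assumed convention)."""
    if abs(z) <= 2 * sigma:
        raise ValueError("z must lie outside [-2*sigma, 2*sigma]")
    root = math.sqrt(z * z - 4 * sigma * sigma)
    # The sign of the square root is chosen so that G(z) ~ 1/z at infinity.
    return (z - math.copysign(root, z)) / (2 * sigma * sigma)

def cauchy_transform_numeric(z: float, sigma: float = 1.0, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the same integral against the semicircle density."""
    width = 4 * sigma / steps
    total = 0.0
    for k in range(steps):
        x = -2 * sigma + (k + 0.5) * width
        density = math.sqrt(4 * sigma * sigma - x * x) / (2 * math.pi * sigma * sigma)
        total += density / (z - x) * width
    return total

def rho(theta: float, sigma: float = 1.0) -> float:
    """Candidate limit of a deviating eigenvalue, assumed valid only for |theta| > sigma."""
    if abs(theta) <= sigma:
        raise ValueError("theta must satisfy |theta| > sigma")
    return theta + sigma * sigma / theta

if __name__ == "__main__":
    for theta in (2.0, -3.0):
        z = rho(theta)
        print(theta, z, cauchy_transform_semicircle(z), cauchy_transform_numeric(z), 1 / theta)
```

For each $\theta$, the last three printed values should agree up to the quadrature error, which is the identity $G(\rho_\theta) = 1/\theta$ in this special case.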

• Sticking eigenvalues.
We now assume moreover that the laws of the entries satisfy Hypothesis 5.2. In order to lighten the notation, we shall now suppose that $\sigma = 1$. Let us first recall that by [41,39], the extreme eigenvalues of the non-perturbed matrix $X_n$, once re-centred and renormalised by $n^{2/3}$, converge to the Tracy-Widom law (which depends on whether the entries are complex or real). We need to verify that Hypothesis 4.1 [p, α] for any finite $p$ and an $\alpha < 1/3$ is fulfilled in probability. By [41], the spacing between the two smallest eigenvalues of $X_n$ is of order greater than $n^{-\gamma}$ for $\gamma > 2/3$ with probability going to one and therefore, by the inequality

it is sufficient to prove the first point of Hypothesis 4.1 [p, α]. We shall prove it by replacing first the smallest eigenvalue by the edge $-2$, thanks to a lemma that Benjamin Schlein [40] kindly communicated to us. We will then prove that the sum of the inverses of the distances of the eigenvalues to the edge indeed converges to the announced limit, thanks to Soshnikov's paper [41] (for sub-Gaussian tails) or [39] (for finite moments), and to Tao and Vu's article [42].
Lemma 5.5 (B. Schlein). Suppose the entries of $X_n$ have a uniform sub-exponential tail. Then for all $\delta > 0$, for all integers $p$,

Proof. We write

Hence for any $K_1 > 0$,

Now, for any $K_2 > K_1$, on the event $\{|\lambda^n_p + 2| < K_1 n^{-2/3}\}$, for any $\kappa > 0$, we have

where

Note that, from the upper bound on the density of eigenvalues in microscopic intervals, due to [15, Theorem 4.6], we know that for any $\kappa < 1$, there is a constant $M$ independent of $n$ so that for all $\ell \ge 1$

Let us fix $\kappa \in (\frac{2}{3}, 1)$. It follows that the first term of the r.h.s. of (39) can be estimated by

Let us now estimate the second term of the r.h.s. of (39). For any positive integer $K_3$, we have

From (38), (39), (41) and (42), we conclude that

for arbitrary $0 < K_1 < K_3$ and $K_3 \ge 1$. Taking the limit $n \to \infty$, the last two terms disappear, because by [42, Th. 1.16], the distribution of the smallest $K_3$ eigenvalues lives on scales of order $n^{-2/3} \gg n^{-5/6}$. Therefore,

still for any $0 < K_1 < K_3$ and $K_3 \ge 1$. Now, note that for $K_1$ large enough, the first term can be made as small as we want. Then, keeping $K_1$ fixed, $K_2$ can be chosen in such a way as to make the second term as small as we want too. At last, keeping $K_2$ fixed, one can choose $K_3$ large enough to make the third term as small as we want (as can be computed since the limit is given by the $K_3$ correlation function of the Airy kernel).
To complete the proof of Hypothesis 4.1, we therefore need to show the following.

Lemma 5.6. Assume that the entries of $X_n$ satisfy Hypothesis 5.2. Then, for any $\delta > 0$, any finite integer $p$,

Proof. Notice that by [41,39] we know that the $p$ smallest eigenvalues of $X_n$ converge in law towards the Tracy-Widom law, so that lim

Thus, for any finite $p$, with large probability, and therefore it is enough to prove the lemma for any particular $p$. As in the previous proof, we choose $p$ large enough so that $\lambda^n_p \ge -2 + n^{-2/3}$ with probability greater than $1 - \delta(p)$, with $\delta(p)$ going to zero as $p$ goes to infinity. We shall prove that with high probability

This is enough to prove the statement as for any

which converges as $\gamma$ goes to zero to $\int (2 + x)^{-1}\, d\mu_X(x) = 1$ (by e.g. (37)). To prove (43), we choose ρ ∈ (2/3, 2/3) and write, on the event

For the first term, we use the Sinai-Soshnikov bound, which under the weakest hypotheses is given in [39, Theorem 2.1]. It implies that with probability going to one with $M$ going to infinity, for $s_n = o(n^{2/3})$ going to infinity,

This implies, by Chebyshev's inequality and taking

Consequently we deduce that

which goes to zero as $\rho > 2/3$. For the second term $B_n$, note that by [42, Theorem 1.10], for any $\varepsilon > 0$ small enough, , we deduce for $\varepsilon$ small enough that for all $\ell \ge 1$,

This allows us to bound $B_n$ by

which goes to zero as $n$ goes to infinity and then $\gamma$ goes to zero.

5.2. Coulomb gases.
We can also consider random matrices $X_n$ whose law is invariant under the action of the unitary or the orthogonal group and with eigenvalues with law given by

with a polynomial function $V$ of even degree and positive leading coefficient and $\beta = 1, 2$ or $4$. We assume moreover that $V$ is such that the limiting spectral measure $\mu_V$ of $(X_n)$ has connected and compact support and that the smallest and largest eigenvalues of $X_n$ converge to the boundaries of the support. This set of hypotheses is often referred to as the "one-cut assumption". It holds in particular if $V$ is strictly convex, and this includes the classical Gaussian ensembles GOE and GUE (with $V(x) = x^2/4$ and $\beta = 1, 2$).
Proposition 5.7. Under the above hypothesis on $V$, the extreme eigenvalues of $X_n$ converge to the boundary of the support. The convergence of the extreme eigenvalues of $\widetilde{X}_n$ is given by Theorem 2.1. These eigenvalues have Gaussian fluctuations, as stated in Theorem 3.2, if they deviate away from the bulk. Suppose moreover that Assumption 4.2 holds. If the perturbation is of rank one and is strong enough so that the largest eigenvalue deviates from the bulk, for all $k \ge 2$, the rescaled $k$th largest eigenvalue $n$

If the perturbation is of rank more than one, the extreme eigenvalues of $\widetilde{X}_n$ sticking to the bulk are at distance less than $n^{-1+\varepsilon}$, for any $\varepsilon > 0$, from the eigenvalues of $X_n$. In the i.i.d. model, Theorem 4.5 prescribes exactly in the neighborhood of which eigenvalues of $X_n$ each of them lies.
Proof. As explained above, it suffices to verify that the hypotheses hold in probability for $(X_n)_{n\ge 1}$.
Note that the convergence of the spectral measure, of the edges and the fluctuations of the extreme eigenvalues were obtained in [47]. The fact that $\sqrt{n}\,(G_{\mu_n}(z) - G_{sc}(z))$ converges in probability to zero is a consequence of [28], so that Hypothesis 3.1 holds.
We next check Hypothesis 4.1 [p, α] for the matrix model $P_n$. We shall prove it for any $\alpha > 1/3$ and any integer $p$. We first show that

Indeed, the joint distribution of $(\lambda^n_1, \ldots, \lambda^n_n)$ is 1 , by integration by parts. Equation (45) follows, since $\lambda^n_p$ converges almost surely to $a_V$ (and concentration inequalities ensure that $V'(\lambda^n_p)$ is uniformly integrable). But, for any $\varepsilon > 0$,

with, by convergence of the spectral measure and of $\lambda^n_p$, the right hand side converging to $-G_{\mu_X}(-a_V - \varepsilon)$, which converges as $\varepsilon$ decreases to zero to $-G_{\mu_X}(-a_V) = -V'(a_V)$. Hence,

is bounded below by $-V'(a_V)$ with large probability for large $n$, and converges in expectation to $-V'(a_V)$, and therefore converges in probability to $-V'(a_V)$. Moreover, by [47] (see [45] in the Gaussian case), the joint law of . . ., $n^{2/3}(\lambda^n_p - a_V)$ converges weakly towards a probability measure which is absolutely continuous with respect to Lebesgue measure. As a consequence, we also deduce from the first point that $n^{-1}\sum_{i<m_n} (\lambda^n_p - \lambda^n_i)^{-1}$ vanishes in probability as $n$ goes to infinity for $m_n \ll n^{1/3}$, and therefore (45) proves the missing point of Hypothesis 4.1.
For the two other points, observe that [47] implies that for any $\varepsilon > 0$, $\mathbb{P}(|\lambda$

in the orthonormalised model.
Assume now that the laws of the entries satisfy Hypothesis 5.2. If the perturbation has rank one, we have the following precise description of the fluctuations of the extreme eigenvalues of $\widetilde{X}_n$: converges in law to the $p$th Tracy-Widom law.
If the perturbation has rank more than one and for all $i$, $\theta_i \notin \{c + \sqrt{c}, c - \sqrt{c}\}$, the extreme eigenvalues of $\widetilde{X}_n$ are at distance less than $n^{-1+\varepsilon}$, for any $\varepsilon > 0$, from the extreme eigenvalues of $X_n$, which have Tracy-Widom fluctuations.
Before getting into the proof, let us make a remark. The proposition above generalizes some results that first appeared in [9,19]. In these papers, the authors consider models with multiplicative perturbations (in the sense that the population covariance matrix $\Sigma$ is assumed to be a perturbation of the identity). Here, we consider additive perturbations, but the two models are in fact similar, since a Wishart matrix can be written as a sum of rank one matrices $\sum_{i=1}^m \sigma_i Y_i Y_i^*$, with $\sigma_i$ the eigenvalues of $\Sigma$ and $Y_i$ $n$-dimensional vectors with i.i.d. entries. So, adding our perturbation $\sum_{i=1}^r \theta_i U_i U_i^*$ boils down to changing $m$ into $m + r$ (the limit of $m/n$ is not changed) and to extending $\Sigma$ with some new eigenvalues $\theta_1, \ldots, \theta_r$.
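The algebraic identity behind this remark can be illustrated on a small example. The sketch below works in the real case with Gaussian vectors and ignores normalisation; the helper names (`rank_one`, `weighted_sum`) and the numerical values of $n$, $m$, $r$, $\sigma_i$, $\theta_i$ are illustrative choices, not taken from the text. It only checks that the additively perturbed matrix coincides with the rank-one sum built from the extended lists of weights and vectors.

```python
import random

def rank_one(weight, y):
    """Return the matrix weight * y y^T (real case)."""
    return [[weight * a * b for b in y] for a in y]

def add(A, B):
    """Entrywise sum of two square matrices given as lists of rows."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def weighted_sum(weights, vectors, n):
    """Return sum_i weights[i] * vectors[i] vectors[i]^T as an n x n matrix."""
    total = [[0.0] * n for _ in range(n)]
    for w, y in zip(weights, vectors):
        total = add(total, rank_one(w, y))
    return total

random.seed(0)
n, m, r = 4, 6, 2                      # illustrative sizes
sigmas = [1.0] * m                     # eigenvalues of Sigma (white case)
Y = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
thetas = [3.0, -2.0]                   # strengths of the perturbation
U = [[random.gauss(0, 1) for _ in range(n)] for _ in range(r)]

# Wishart-type matrix plus additive finite rank perturbation ...
deformed = add(weighted_sum(sigmas, Y, n), weighted_sum(thetas, U, n))
# ... versus a single sum over m + r vectors with Sigma extended by the thetas.
extended = weighted_sum(sigmas + thetas, Y + U, n)

max_gap = max(abs(a - b) for ra, rb in zip(deformed, extended) for a, b in zip(ra, rb))
print(max_gap)
```

The printed gap is zero up to rounding, which is all the remark uses: the perturbation can be absorbed by passing from $m$ to $m + r$ vectors.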
Proof. Again, it suffices to verify that the hypotheses hold in probability for $(X_n)_{n\ge 1}$.
It is known, [32], that the spectral measure of $X_n$ converges to the so-called Marčenko

allows one to compute $\rho_\theta$ and $c_\alpha$. Moreover, by [3, Th. 1.1] or [4, Th. 9.10], we also know that a central limit theorem holds for the linear statistics of Wishart matrices, giving Hypothesis 3.1 as in the Wigner case.
For Hypothesis 4.1, the proof is similar to the Wigner case. The convergence to the Tracy-Widom law of the non-perturbed matrix is due to S. Péché [37] (see [33] and [20] for the Gaussian case). The approximation of the eigenvalues by the quantiles of the limiting law can be found in [17, Theorem 9.1], whereas the absolute continuity property needed to prove Lemma 5.5 is derived in [17, Lemma 8.1]. This allows us to prove Hypothesis 4.1 in this setting as in the Wigner case; we omit the details.

5.4. Non-white ensembles. In the case of non-white matrices, we can only study the fluctuations away from the bulk (since we do not have the appropriate information about the top eigenvalues to prove Hypothesis 4.1). We illustrate this generalisation in a few cases, but it is rather clear that Theorem 3.2 applies in a much wider generality.

5.4.1. Non-white Wishart matrices. The first statement of Proposition 5.8 can be generalised to matrices $X_n$ of the type , where $G_n$ is an $n \times m$ real (or complex) matrix with i.i.d. centred entries with law $\mu$ such that $\int z\,d\mu(z) = 0$, $\int |z|^2\,d\mu(z) = 1$ and $\int |z|^4\,d\mu(z) < \infty$, and $T_n$ is a positive non-random Hermitian $n \times n$ matrix with bounded operator norm, with a converging empirical spectral law and with no eigenvalues outside any neighborhood of the support of the limiting measure for sufficiently large $n$. Indeed, in this case, everything in the proof stays true (use [2, Th. 1.1] and [4, Th. 5.11]). However, when the limiting empirical distribution of $T_n$ is not a Dirac mass, the computation of the $\rho_\theta$'s and the $c_\alpha$'s is not easy.

5.4.2. Non-white Wigner matrices. There are fewer results in the literature about the central limit theorem for band matrices (with centring with respect to the limit) and the convergence of the spectrum. We therefore concentrate on a special case, namely a Hermitian matrix $X_n$ with independent Gaussian centred entries so that $\mathbb{E}$

In [31], matrices of the form

with some independent matrices $X^{(n)}_j$ from the GUE and self-adjoint matrices $a_j$ were studied. Taking $a_j = (\epsilon_{p,\ell} + \epsilon_{\ell,p})\sigma_{p,\ell}$ or $i(\epsilon_{p,\ell} - \epsilon_{\ell,p})\sigma_{p,\ell}$, with $\epsilon_{p,\ell}$ the matrix with null entries except at $(p, \ell)$ and $1 \le p \le \ell \le k$, we find that $X_n = S_n$. Then it was proved [31, (3.8)] that there exist $\alpha, \varepsilon, \gamma > 0$ so that for $z$ with imaginary part greater than $n^{-\gamma}$ for some $\gamma > 0$,

which entails the convergence of the spectrum of $X_n$ towards the support of the limiting measure [31, Proposition 11] with exponential speed by [31, Proof of Lemma 14]. Thus $X_n$ satisfies Hypothesis 1.1.

Hypothesis 3.1 can be checked by modifying slightly the proof of (46), which is based on an integration by parts, to be able to take $z$ on the real line but away from the limiting support. Indeed, as in [23, Section 3.3], we can add a smooth cut-off function in the expectation which vanishes outside of the event $A_n$ that $X_n$ has all its eigenvalues within a small neighborhood of the limiting support. This additional cut-off will only give a small error in the integration by parts due to the previous point. Then, (46), but with an expectation restricted to this event, is proved exactly in the same way, except that $z$ can be replaced by the distance of $z$ to the neighborhood of the limiting support where the eigenvalues of $X_n$ live. Finally, concentration inequalities, in the local version [22, Lemma 5.9 and Part II], ensure that on $A_n$,

is at most of order $n^{-1+\varepsilon}$ with overwhelming probability. This completes the proof of Hypothesis 3.1.
5.5. Some models for which our hypotheses are not satisfied.
We gather hereafter a few remarks about some models for which the hypotheses we made on $X_n$ are not satisfied. For the sake of simplicity, we present only the case of i.i.d. perturbations (1).

5.5.1. I.i.d. eigenvalues with compact support. We assume that $X_n$ is diagonal with i.i.d. entries whose law $\mu$ is compactly supported. As in the core of the paper, we denote by $a$ (resp. $b$) the left (resp. right) edge of the support of $\mu$. We also denote by $F_\mu$ its cumulative distribution function and assume that there is $\kappa > 0$ such that for all $c > 0$, lim

In this situation, it is easy to check that Hypothesis 1.1 holds in probability with $\mu_X = \mu$. But Hypothesis 3.1 is not satisfied. Indeed, by the classical CLT, we have, for

Nevertheless, Theorem 3.2 holds for this model. Indeed, the whole proof of this theorem goes through in this context, except the proof of Lemma 3.5, where we have to make the following decomposition $M^n_{s,t}(i, x) = M^{n,1}_{s,t}(i, x) + M^{n,2}_{s,t}(i, x) + M^{n,3}_{s,t}(i, x)$, with the difference that this time $M^{n,3}_{s,t}$ does not go to zero but converges towards $W_{\alpha_i}$. Hence, the eigenvalues fluctuate according to the distribution of the eigenvalues of $(c_j M_j + W_{\alpha_j} I_{k_j})_{1\le j\le q}$, with $c_j$ and $M_j$ as in the statement of Theorem 3.2, where $I_{k_j}$ denotes the $k_j \times k_j$ identity matrix.
Let us now consider the fluctuations near the bulk. We first detail the fluctuations of the extreme eigenvalues of $X_n$. According to [26], the fluctuations of the largest eigenvalues of $X_n$ are determined by the parameter $\kappa$ defined in (47), that is, if $v_n = F_\mu(b - 1/n)$, then the law of $\frac{b - \lambda^n_n}{b - v_n}$ converges weakly to the law with density proportional to $e^{-x^\kappa}$ on $\mathbb{R}_+$. Otherwise stated, the fluctuations of $\lambda^n_n$ are of order $n^{-1/\kappa}$, with asymptotic distribution the Gumbel distribution of type 2. One can check that if $\kappa \le 1$, then $\overline{\theta} = 0$. One can show that, for any fixed $p$, for Hypothesis 4.1 [p, α] to hold, we need $\alpha > \frac{1}{\kappa} - \frac{1}{2}$, and we then obtain that the extreme eigenvalues of the deformed matrix are at distance less than $n^{-1+\alpha'}$, for any $\alpha' > \alpha$, from those of the non-deformed matrix. Therefore, if $\kappa > 4/3$, this theorem allows us to deduce that the fluctuations of the extreme eigenvalues of the deformed matrix are the same as those of the non-deformed matrix.
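The order $n^{-1/\kappa}$ of the edge fluctuations can be made concrete on a toy law. The sketch below assumes, purely for illustration, i.i.d. variables on $[0, 1]$ with $b = 1$ and $\mathbb{P}(X > 1 - x) = x^\kappa$ for $x \in [0, 1]$, a particular case of power-type behaviour at the edge. It does not reproduce the normalisation by $v_n$ used in the text, and the function names are hypothetical. For this law, $\mathbb{P}(b - \max_i X_i > x) = (1 - x^\kappa)^n$ exactly, so the tail of the rescaled gap can be computed without simulation.

```python
import math

def gap_survival(n: int, kappa: float, t: float) -> float:
    """P(n^{1/kappa} * (b - max_i X_i) > t) for the toy law P(X > 1 - x) = x^kappa, b = 1."""
    x = t * n ** (-1.0 / kappa)        # threshold at the scale n^{-1/kappa}
    if x >= 1.0:
        return 0.0                     # the gap b - max never exceeds 1
    return (1.0 - x ** kappa) ** n     # all n samples fall below 1 - x

def limit_survival(kappa: float, t: float) -> float:
    """Pointwise limit of gap_survival as n grows, for this toy law."""
    return math.exp(-(t ** kappa))

if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        print(n, gap_survival(n, 2.0, 1.0), limit_survival(2.0, 1.0))
```

The exact tail stabilises as $n$ grows once the gap is multiplied by $n^{1/\kappa}$, which is the sense in which the fluctuations of $\lambda^n_n$ are of order $n^{-1/\kappa}$ in this example.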
5.5.2. Coulomb gases with non-convex potentials. In [35], Pastur showed that for a Coulomb gas law (44) with a potential $V$ such that the equilibrium measure has a disconnected support, the central limit theorem does not hold, in the sense that the variance may have different limits along different subsequences (see [35, (3.4)]). Moreover, the asymptotics of $\sqrt{n}\,(\operatorname{Tr}(X_n) - \mu(x))$ can sometimes be computed and do not lead to a Gaussian limit. We might then expect that $\sqrt{n}\,(G_{\mu_n}(x) - G_\mu(x))$ also converges to a non-Gaussian limit, which would then result in non-Gaussian fluctuations for the eigenvalues outside of the bulk.

Proposition 6.2. Under Assumption 1.2, there exists a constant $c > 0$ so that for any matrix $A := (a_{jk})_{1\le j,k\le n}$ with complex entries, for any $\delta > 0$, for any $g = (g_1, \ldots, g_n)^T$ with i.i.d. entries $(g_i)_{1\le i\le n}$ with law $\nu$,

if $C^2 = \operatorname{Tr}(AA^*)$ and if $\tilde{g}$ is an independent copy of $g$, for any $\delta, \kappa > 0$, $\mathbb{P}\big(|\langle g, A\tilde{g}\rangle| > \delta \operatorname{Tr}(AA^*) + \kappa \operatorname{Tr}((AA^*)^2)\big) \le 4e^{-c\delta^2} + 4e^{-c\min\{\kappa,\kappa^2\}}$.
Proof. The first point is due to the Hanson-Wright theorem [24], see also [15, Proposition 4.5]. For the second, we use concentration inequalities, see e.g. [1, Lemma 2.3.3], based on the remark that for any fixed $g$, $\tilde{g} \mapsto \langle g, A\tilde{g}\rangle$ is Lipschitz with constant $\langle g, AA^* g\rangle$ and therefore, conditionally on $g$, for any $\delta > 0$,

As a consequence, we deduce the second point of the proposition.
Let $G_n = [g^n_1 \cdots g^n_r]$ be an $n \times r$ matrix whose columns $g^n_1, \ldots, g^n_r$ are independent copies of an $n \times 1$ matrix with i.i.d. entries with law $\nu$, and define

and, for $j$ , with $\gamma^{n,j}_{k,l} = V^n_{k,l}$ if $l \ne j$, and $-V^n_{k,i}$ if $l = j$. On the event $\det[V^n_{k,l}]_{k,l=1}^{i-1} = 0$, we give $W^n_{i,j}$ an arbitrary value, say one. Putting $W^n_{ii} = 1$ and $W^n_{ij} = 0$ for $j \ge i + 1$, it is a standard linear algebra exercise to check that the column vectors $\sum_j W^n_{i,j} g^n_j$ = $i$th column of $G_n (W^n)^T$ are orthogonal in $\mathbb{C}^n$. Let us introduce, for $M$ an $r \times r$ matrix, $\|M\|_\infty = \sup_{1\le i,j\le r} |M_{i,j}|$.

We next prove
Then the distribution of $G_n$ converges weakly to the distribution of a real symmetric (resp. Hermitian) random matrix $G = [g_{s,t}]_{s,t=1}^r$ such that the random variables $\{g_{s,t} ; 1 \le s \le t \le r\}$ (resp. $\{g_{s,s} ; 1 \le s \le r\} \cup \{\Re(g_{s,t}) ; 1 \le s < t \le r\} \cup \{\Im(g_{s,t}) ; 1 \le s < t \le r\}$) are independent and for all $s$, $g_{s,s} \sim N(0, 2\sigma^2_{s,s} + \kappa_4(\nu)\omega_s)$ (resp. $g_{s,s} \sim N(0, \sigma^2_{s,s} + \kappa_4(\nu)\omega_s)$) and for all $s \ne t$, $g_{s,t} \sim N(0, \sigma^2_{s,t})$ (resp. $\Re(g_{s,t}), \Im(g_{s,t}) \sim N(0, \sigma^2_{s,t}/2)$).

Remark 6.5. Note that if the matrices $A_n(s, t)$ depend on a real parameter $x$ in such a way that for all $s, t$, for all $x, x' \in \mathbb{R}$,

i,j is bounded. Hence, up to the extraction of a subsequence, one can suppose that it converges to a limit $\tau_{s,t} \in \mathbb{C}$. Since the conclusion of the theorem does not depend on the numbers $\tau_{s,t}$ and the weak convergence is metrisable, one can ignore the fact that these convergences are only along a subsequence. In the case where $\kappa_4(\nu) \ne 0$, we can in the same way add the part of the hypothesis related to $\omega_s$.
We have to prove that for any real symmetric (resp. Hermitian) matrix $B := [b_{s,t}]_{s,t=1}^r$, the distribution of $\operatorname{Tr}(BG_n)$ converges weakly to the distribution of $\operatorname{Tr}(BG)$. Note that

where $C_n$ is the $rn \times rn$ matrix and $U_n$ is the $rn \times 1$ random vector defined by

In the real (resp. complex) case, let us now apply Theorem 7.1 of [7]

Hypothesis 4.1. [p, α, a] There exists a sequence $m_n$ of positive integers tending to infinity such that m

be the set of indices corresponding to the eigenvalues $\widetilde{\lambda}^n_i$ (resp. $\widetilde{\lambda}^n_{n-r+i}$) converging to the lower (resp. upper) bound of the support of $\mu_X$. Let us suppose Hypothesis 1.1, Hypothesis 4.1 [r, α, a] (resp. Hypothesis 4.1 [r, α, b]) and Assumptions 1.2 and 4.2 to hold. Then for any $\alpha' > \alpha$, we have, for all $i \in I_a$ (resp. $i \in I_b$),

$n^{2/3}(\widetilde{\lambda}^n_{n-k+1} - b_V)$ converges weakly towards the $(k-1)$-th Tracy-Widom law. If the perturbation is of rank one and is weak enough, for all $k \ge 1$, the rescaled $k$th largest eigenvalue $n^{2/3}(\widetilde{\lambda}^n_{n-k+1} - b_V)$ converges weakly towards the $k$-th Tracy-Widom law.

so that by (45) and Markov's inequality, Hypothesis 4.1 holds in probability for any $\eta < 1/3$, $\eta_4 < 1$ and $\alpha > 1/3$.

5.3. Wishart matrices. Let $G_n$ be an $n \times m$ real (or complex) matrix with i.i.d. centred entries with law $\mu$ such that $\int z\,d\mu(z) = 0$, $\int |z|^2\,d\mu(z) = 1$ and $\int |z|^4\,d\mu(z) < \infty$. Let $X_n = G_n G_n^*/m$.

Proposition 5.8. Let $n, m$ tend to infinity in such a way that $n/m \to c \in (0, 1)$. The limits of the extreme eigenvalues of $\widetilde{X}_n$ are given by Theorem 2.1, and the fluctuations of those whose limits are out of $[a, b]$ are given by Theorem 3.2, where the parameters $a, b, \rho_\theta, c_\alpha$ are given by the following formulas: a

x)(x − a) $\mathbf{1}_{[a,b]}(x)\,dx$, where $a = (1 - \sqrt{c})^2$ and $b = (1 + \sqrt{c})^2$. It is known, [4, Th. 5.11], that the extreme eigenvalues converge to the bounds of this support. The formula

6. Appendix

6.1. Determinant formula. We here state formula (3), which can be deduced from the well known formula $\det\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det(D)\,\det(A - BD^{-1}C)$.
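The block determinant identity quoted above can be verified in exact arithmetic on a small example. The sketch below is only an illustration: the helper functions and the particular $2 \times 2$ blocks are arbitrary choices, with $D$ taken invertible as the identity requires.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result                      # a row swap flips the sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return result

def inverse(M):
    """Inverse by Gauss-Jordan elimination; raises if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Arbitrary illustrative blocks; D is invertible (det D = -1).
A = [[2, 1], [0, 3]]
B = [[1, 0], [2, 1]]
C = [[0, 1], [1, 1]]
D = [[1, 2], [3, 5]]

full = [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]
lhs = det(full)
rhs = det(D) * det(sub(A, matmul(matmul(B, inverse(D)), C)))
print(lhs, rhs)
```

Both printed values coincide, as the identity predicts; rational arithmetic is used so that the comparison is exact rather than up to rounding.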

Hypothesis 4.1 [p, α, b] is the same hypothesis where we replace $\lambda^n$

For many matrix models, the behaviors of the largest and smallest eigenvalues are similar, and Hypothesis 4.1 [p, α, a] is satisfied if and only if Hypothesis 4.1 [p, α, b] is satisfied. In such cases, we shall simply say that Hypothesis 4.1 [p, α] is satisfied.
in probability as $n$ goes to infinity, (iii) if, instead of (15) and (16) in Hypothesis 4.1 [p, α, a], one supposes (15) and (16) in Hypothesis 4.1 [p, α, b] to hold, then $n^{1-\alpha'}(\widetilde{\lambda}^n_{n-i} - \lambda^n_{n-i})_{0\le i<p}$ vanishes in probability as $n$ goes to infinity.

Theorem 4.5. Consider the i.i.d. model and let $(\widetilde{\lambda}^n_i)_{i\ge 1}$ be the eigenvalues of $X_n + \sum_{i=1}^r \theta_i u_i u_i^*$. Let $p_-$ (resp. $p_+$) be the number of indices $i$ so that $\rho_{\theta_i} < a$ (resp. $\rho_{\theta_i} > b$). We assume that Assumptions 1.2 and 4.2, Hypothesis 1.1, and (15) and (

$\frac{1}{n}\operatorname{Tr}(A_n(s, t)(x) - A_n(s, t)(x'))^2 \longrightarrow$

then it follows directly from Theorem 6.4 and from a second moment computation that each finite dimensional marginal of the process $\Big(\sqrt{n}\,\big(\langle u^n_s, A_n(s, t)(x_{s,t})\, u^n_t\rangle - \mathbf{1}_{s=t}\,\frac{1}{n}\operatorname{Tr}(A_n(s, s)(x_{s,s}))\big)\Big)_{1\le s,t\le r,\ x_{s,t}\in\mathbb{R},\ x_{s,t}=x_{t,s}}$ converges weakly to the law of a limit process $[g_{s,t}]_{1\le s,t\le r,\ x_{s,t}\in\mathbb{R},\ x_{s,t}=x_{t,s}}$ where there is no dependence in the variables $x_{s,t}$ ($1 \le s, t \le r$).

Proof. • Let us first consider the model where the $(\sqrt{n}\,u^n_s)_{1\le s\le r}$ are i.i.d. vectors with i.i.d. entries with law $\nu$ satisfying Assumption 1.2. Note that for all $s, t = 1, \ldots, r$, by (48), the sequence 1 n (s, t)2

in the case $K = 1$. It follows that the distribution of $\operatorname{Tr}(BG_n)$ = s,s + κ_4(ν)ω_s) + $\sum_{1\le s<t\le r} (2b_{s,t})^2 \sigma^2_{s,t}$ in the real case, $\sum_{s=1}^r b^2_{s,s}(\sigma^2_{s,s} + \kappa_4(\nu)\omega_s) + \sum_{1\le s<t\le r} \Big[(2\Re(b_{s,t}))^2 \frac{\sigma^2_{s,t}}{2} + (2\Im(b_{s,t}))^2 \frac{\sigma^2_{s,t}}{2}\Big]$ in the complex case. This completes the proof in the i.i.d. model.