Outlier eigenvalues for non-Hermitian polynomials in independent i.i.d. matrices and deterministic matrices

We consider a square random matrix of size $N$ of the form $P(Y,A)$ where $P$ is a noncommutative polynomial, $A$ is a tuple of deterministic matrices converging in $\ast$-distribution, when $N$ goes to infinity, towards a tuple $a$ in some $\mathcal{C}^*$-probability space and $Y$ is a tuple of independent matrices with i.i.d. centered entries with variance $1/N$. We investigate the eigenvalues of $P(Y,A)$ outside the spectrum of $P(c,a)$ where $c$ is a circular system which is free from $a$. We provide a sufficient condition to guarantee that these eigenvalues coincide asymptotically with those of $P(0,A)$.


Previous results
The Ginibre ensemble is the basic non-Hermitian ensemble of random matrix theory: a so-called Ginibre matrix is an $N \times N$ matrix with independent complex Gaussian entries. More generally, an i.i.d. random matrix is an $N \times N$ random matrix $X_N = (X_{ij})_{1 \le i,j \le N}$ whose entries are independent identically distributed complex-valued random variables with mean 0 and variance 1.
For any $N \times N$ matrix $B$, denote by $\lambda_1(B), \dots, \lambda_N(B)$ the eigenvalues of $B$ and by $\mu_B$ the empirical spectral measure of $B$: $\mu_B = \frac{1}{N} \sum_{i=1}^N \delta_{\lambda_i(B)}$. A seminal bounded-rank perturbation result states that if $A_N$ has rank $O(1)$ and norm $O(1)$ and, for some $\varepsilon > 0$, has exactly $j$ eigenvalues $\lambda_1(A_N), \dots, \lambda_j(A_N)$ of modulus larger than $1 + 3\varepsilon$, the others being of modulus at most $1 + \varepsilon$, then almost surely, for all large $N$, there are exactly $j$ eigenvalues of $\frac{X_N}{\sqrt N} + A_N$ in $\{z \in \mathbb{C} : |z| \ge 1 + 2\varepsilon\}$ and, after labeling these eigenvalues properly, as $N$ goes to infinity, for each $1 \le i \le j$, $\lambda_i\big(\frac{X_N}{\sqrt N} + A_N\big) - \lambda_i(A_N) \to 0$. Two different ways of generalizing this result were subsequently considered. Firstly, [10] investigated the same problem but dealing with full rank additive perturbations. The main terminology related to free probability theory which is used in the following is defined in Section 3 below. Consider the deformed model
$$S_N = \frac{X_N}{\sqrt N} + A_N, \qquad (1.1)$$
where $A_N$ is an $N \times N$ deterministic matrix with operator norm $O(1)$ and such that $A_N$, as a noncommutative random variable from the set of $N \times N$ matrices with complex entries endowed with the normalized trace $(M_N(\mathbb{C}), \mathrm{tr}_N)$, converges in $*$-moments to some noncommutative random variable $a$ in some $\mathcal{C}^*$-probability space $(\mathcal{A}, \varphi)$. According to Dozier and Silverstein [17], for any $z \in \mathbb{C}$, almost surely the empirical spectral measure of $(S_N - zI_N)(S_N - zI_N)^*$ converges weakly to a nonrandom distribution $\mu_z$ which is the distribution of $(c + a - z)(c + a - z)^*$, where $c$ is a circular operator which is $*$-free from $a$ in $(\mathcal{A}, \varphi)$. On the other hand, in [16], the authors investigate the outliers of several types of bounded rank perturbations of the product of $m$ independent random matrices $X_{N,i}$, $i = 1, \dots, m$, with i.i.d. entries. More precisely, for three deformed models $P_i(Y_N, A_N)$, $i = 1, 2, 3$, built from the normalized product $\prod_{k=1}^m \frac{X_{N,k}}{\sqrt N}$ and $N \times N$ deterministic matrices $A_N$ and $A_{N,j}$ with rank $O(1)$ and norm $O(1)$, they show that the eigenvalues of $P_i(Y_N, A_N)$ and $P_i(0, A_N)$ outside the unit disk coincide asymptotically. Note that the unit disk is equal to the spectrum of each $P_i(c, 0)$, $i = 1, 2, 3$, where $c$ is a free $m$-circular system.
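The rank-one case of the bounded-rank perturbation phenomenon recalled above is easy to observe numerically. The sketch below is our own illustration (the size $N$, the value $\theta = 2$ and the threshold $1.3$ are arbitrary choices): it samples $X_N/\sqrt N + A_N$ for a rank-one $A_N$ and isolates the single outlier near $\theta$.

```python
import numpy as np

# Illustration (not from the paper): outliers of a low-rank additive
# perturbation of an i.i.d. matrix.  A_N has rank 1 with single nonzero
# eigenvalue theta = 2; the bulk of X_N/sqrt(N) + A_N fills the unit disk
# and one outlier appears near theta.  N, theta, 1.3 are arbitrary choices.
rng = np.random.default_rng(0)
N, theta = 500, 2.0

X = rng.standard_normal((N, N))         # i.i.d. real entries, mean 0, variance 1
A = np.zeros((N, N))
A[0, 0] = theta                         # rank-one deterministic perturbation

eigs = np.linalg.eigvals(X / np.sqrt(N) + A)
outliers = eigs[np.abs(eigs) > 1.3]     # eigenvalues well outside the unit disk

print(len(outliers), outliers)
```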

Assumptions and results
To begin with, we introduce some notations.
• $M_p(\mathbb{C})$ is the set of $p \times p$ matrices with complex entries, $M_p^{sa}(\mathbb{C})$ the subset of self-adjoint elements of $M_p(\mathbb{C})$, and $I_p$ the identity matrix.
• $\mathrm{Tr}_p$ denotes the trace and $\mathrm{tr}_p = \frac{1}{p} \mathrm{Tr}_p$ the normalized trace on $M_p(\mathbb{C})$.
• $\|\cdot\|$ denotes the operator norm on $M_p(\mathbb{C})$.
• $\mathrm{id}_p$ denotes the identity operator from $M_p(\mathbb{C})$ to $M_p(\mathbb{C})$.
In this paper we generalize the previous results from [10] to non-Hermitian polynomials in several independent i.i.d. matrices and deterministic matrices. Note that our results include in particular the previous results from [16]. Here are the matricial models we deal with. Let $t$ and $u$ be fixed nonzero natural numbers independent from $N$.
(A1) We consider a $t$-tuple $A_N = (A_N^{(1)}, \dots, A_N^{(t)})$ of $N \times N$ deterministic matrices which converges in $*$-distribution to a $t$-tuple $a = (a^{(1)}, \dots, a^{(t)})$ in some $\mathcal{C}^*$-probability space $(\mathcal{A}, \varphi)$ where $\varphi$ is faithful and tracial.
(X1) We consider $u$ independent $N \times N$ random matrices $Y_N^{(1)}, \dots, Y_N^{(u)}$ whose entries are i.i.d. centered complex random variables with variance $1/N$, and we set $Y_N = (Y_N^{(1)}, \dots, Y_N^{(u)})$ and $M_N = P(Y_N, A_N)$.
Note that we do not need any assumption on the convergence of the empirical spectral measure of $M_N$. Let $c = (c^{(1)}, \dots, c^{(u)})$ be a free circular system in $(\mathcal{A}, \varphi)$ which is free from $a = (a^{(1)}, \dots, a^{(t)})$. According to the second assertion of Proposition 5.2 below, for any $z \in \mathbb{C}$, almost surely, the empirical spectral measure of $(M_N - zI_N)(M_N - zI_N)^*$ converges weakly to $\mu_z$, where $\mu_z$ is the distribution of $[P(c,a) - z1][P(c,a) - z1]^*$. Since we can assume that $\varphi$ is faithful and tracial, we have by Remark 1.4 that $\mathrm{spect}(P(c,a)) = \{z \in \mathbb{C} : 0 \in \mathrm{supp}(\mu_z)\}$. Set $M_N^{(0)} = P(0_N, A_N)$, where $0_N$ denotes the $N \times N$ null matrix. Throughout the whole paper, we will call outlier any eigenvalue of $M_N$ or $M_N^{(0)}$ lying in $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$. We are now interested in describing the individual eigenvalues of $M_N$ outside $B(\mathrm{spect}(P(c,a)), \varepsilon)$ for some $\varepsilon > 0$.
In the lineage of [10], our main result gives a sufficient condition to guarantee that the outliers of $M_N$ coincide asymptotically with those of $M_N^{(0)}$. To state this formally: consider an arbitrary open set $G$ which is relatively compact in $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$ (that is, the closure $\overline G \subset \mathbb{C} \setminus \mathrm{spect}(P(c,a))$ is compact). Then one expects that an outlier of $M_N$ occurs in $G$ if and only if an outlier of $M_N^{(0)}$ occurs in $G$. For technical reasons, namely the need to be able to apply Rouché's Theorem, we assume that $G$ is sufficiently "nice": specifically, we require that $\Gamma := \overline G$ is a compact set in $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$, that the closure of any connected component of $G$ is a connected component of $\Gamma$, that the fundamental group of each component of $G$ equals the fundamental group of its closure, and that the topological boundary of $\Gamma$ is a finite union of rectifiable curves (in particular, $G$ has finitely many connected components). Let us see what kinds of situations these conditions are intended to exclude:
• Say $\mathrm{spect}(P(c,a)) = \{z \in \mathbb{C} : |z| \le 1, |z - n^{-1}| \ge 4^{-n}, n \ge 3\}$, a unit disk with countably many small disks removed from it. Then the set $G = \bigcup_{n=3}^\infty \{z \in \mathbb{C} : |z - n^{-1}| < 8^{-n}\}$ is included in $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$, but its closure is not. $\Gamma = \overline G$ is not compact in $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$, and $\partial\Gamma$ is a countable, but not finite, union of rectifiable curves;
• Say $\mathrm{spect}(P(c,a)) = \{z \in \mathbb{C} : |z - 100| \le 1\}$. The set $G = \{z \in \mathbb{C} : -1 < \Re z < 1,\ -1 < \Im z < 1,\ |z - 1| > 1,\ |z + 1| > 1\}$ has two connected components (above and below the real axis), while $\Gamma = \overline G$ is connected;
• For the same $\mathrm{spect}(P(c,a))$, the set $G = \{z \in \mathbb{C} : |z| < 1, |z - 1/2| > 1/2\}$ is simply connected, but $\Gamma$ isn't.
While this is a long list of restrictions, they hardly constitute a reduction in generality.
To begin with, a statement regarding the outliers of $M_N$ and $M_N^{(0)}$ in such a set $G$ reduces to disks: if one knows the outliers of one of the two matrices and wishes to identify the outliers of the other, it is effectively enough to consider the case when $G$ is a finite union of open disks with mutually disjoint closures.
Theorem 1.10. Assume that hypotheses (A1), (X1) hold. Let $G$ be an open relatively compact subset of $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$, and let $\Gamma = \overline G$. We assume that (i) $\partial\Gamma$ is a finite union of rectifiable curves; (ii) the closure of any connected component of $G$ is a connected component of $\Gamma$; and (iii) the fundamental group of each connected component of $G$ coincides with the fundamental group of its closure. Assume moreover that, for any $k = 1, \dots, t$, $\sup_N \|A_N^{(k)}\| < \infty$. If, for some $\varepsilon > 0$ and all large $N$, no eigenvalue of $M_N^{(0)}$ lies at distance less than $\varepsilon$ from $\partial G$, then almost surely, for all large $N$, the numbers of eigenvalues of $M_N^{(0)}$ and of $M_N$ in $G$ coincide.
The next statement is an easy consequence of Theorem 1.10.
Corollary 1.11. Assume that (X1) holds and that, for $k = 1, \dots, t$, $A_N^{(k)}$ has rank $O(1)$ and norm $O(1)$. Then, for any $\varepsilon > 0$, almost surely, for all large $N$, the eigenvalues of $M_N$ and of $M_N^{(0)}$ in $\{z \in \mathbb{C} : d(z, \mathrm{spect}(P(c,0))) \ge 2\varepsilon\}$ coincide asymptotically and, after labeling these eigenvalues properly, for each such eigenvalue, $\lambda_i(M_N) - \lambda_i(M_N^{(0)}) \to 0$ as $N \to \infty$.

Remark 1.12.
It is sufficient to prove Theorem 1.10 and Corollary 1.11 for a noncommutative polynomial P with no constant term, that is, such that P (0, . . . , 0) = 0; the general result follows easily by translation.
We will first prove Theorem 1.10 in the case when all ranks $r_k(N)$, $1 \le k \le t$, are equal to zero. This case is covered by the following theorem.
Theorem 1.13. Assume that hypotheses (A1) and (X1) hold, and let $\Gamma \subset \mathbb{C} \setminus \mathrm{spect}(P(c,a))$ be a compact set. Then, a.s. for all $N$ large enough, $M_N$ has no eigenvalue in $\Gamma$.
In particular, if the assumptions of Theorem 1.10 hold with, for any $k = 1, \dots, t$, $r_k(N) = 0$, then for any $\varepsilon > 0$, a.s. for all $N$ large enough, all eigenvalues of $M_N$ are in $B(\mathrm{spect}(P(c,a)), \varepsilon)$ (apply Theorem 1.13 to compact subsets of $\mathbb{C} \setminus B(\mathrm{spect}(P(c,a)), \varepsilon)$, the spectra of the $M_N$ being a.s. uniformly bounded).
While Theorem 1.10 requires supplementary hypotheses on the compact Γ, those hypotheses are clearly not necessary in Theorem 1.13.
To prove Theorems 1.13 and 1.10, we make use of a linearization procedure which brings the study of the polynomial back to that of the sum of matrices in a higher dimensional space. Then, this allows us to follow the approach of [10]. But for this purpose, we need to establish substantial operator-valued free probability results.
In Section 2, we present our theoretical results and corresponding simulations for four examples of random polynomial matrix models. Section 3.2 provides required definitions and preliminary results on operator-valued free probability theory. Section 4 describes the fundamental linearization trick as introduced in [1, Proposition 3]. In Sections 5 and 6, we establish Theorems 1.13 and 1.10 respectively.

Related results and examples
Recall that we do not need any assumption on the convergence of the empirical spectral measure of $M_N$. However, the convergence in $*$-distribution of $(Y_N, A_N)$ to $(c, a) = (c^{(1)}, \dots, c^{(u)}, a^{(1)}, \dots, a^{(t)})$ (see Proposition 5.2) implies the convergence in $*$-distribution of $M_N = P(Y_N, A_N)$ to $P(c,a)$. In this situation, a good candidate to be the limit of the empirical spectral distribution of $M_N$ is the Brown measure $\mu_{P(c,a)}$ of $P(c,a)$ (see [14]). Unfortunately, the convergence of the empirical spectral distribution of $M_N$ to $\mu_{P(c,a)}$ is still an open problem for an arbitrary polynomial. In the following three examples, we will consider the particular situation where we can write $M_N = Q(G_N, A_N)$ with $G_N$ a Ginibre matrix and $Q$ an arbitrary polynomial. Indeed, in this case, a beautiful result of Śniady [33] ensures that the empirical spectral distribution of $M_N$ converges to $\mu_{P(c,a)}$. Thus, the description of the limiting spectrum of $M_N$ inside $\mathrm{supp}(\mu_{P(c,a)})$ is a question of computing explicitly $\mu_{P(c,a)}$ (a quite hard problem, which can be handled numerically by [7]). On the other hand, Theorem 1.10 explains the behaviour of the spectrum of $M_N$ outside $\mathrm{spect}(P(c,a))$. Thus, we have a complete description of the limiting spectrum of $M_N$, except potentially in the set $\mathrm{spect}(P(c,a)) \setminus \mathrm{supp}(\mu_{P(c,a)})$, which is not necessarily empty (even if it is empty in the majority of the known examples, see [13]). For an arbitrary polynomial, we only know that any limit point of the empirical spectral distribution of $M_N$ is a balayée of the measure $\mu_{P(c,a)}$ (see [13, Corollary 2.2]), which implies that the support of any such limit point is contained in $\mathrm{supp}(\mu_{P(c,a)})$, and in particular is contained in $\mathrm{spect}(P(c,a))$.
EJP 26 (2021), paper 100.

Example 1
We consider a matrix $M_N$ built from i.i.d. Gaussian matrices $X_N^{(1)}, X_N^{(2)}$. The matrix $M_N$ converges in $*$-distribution to $\frac{3}{2} c$, where $c$ is a circular variable, and the empirical spectral measure of $M_N$ converges to the Brown measure of $\frac{3}{2} c$, which is the uniform law on the centered disk of radius $3/2$ by [13]. This disk is also the spectrum of $\frac{3}{2} c$.
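The convergence described in this example is easy to visualize numerically. Since $\frac{3}{2} c$ is the limit in $*$-distribution of $\frac{3}{2} X_N/\sqrt N$ for a complex Ginibre matrix $X_N$, the following sketch (our own illustration; the size is an arbitrary choice) exhibits the disk of radius $3/2$:

```python
import numpy as np

# Illustration (our own choice of model, not the paper's first example): the
# Brown measure of (3/2)c is the uniform law on the disk of radius 3/2, and
# (3/2)c is the *-distribution limit of (3/2) X_N / sqrt(N), X_N Ginibre.
rng = np.random.default_rng(1)
N = 500
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
eigs = np.linalg.eigvals(1.5 * X / np.sqrt(N))

radius = np.abs(eigs).max()
print(radius)   # concentrates near 3/2 for large N
```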

Example 2
We consider a matrix $M_N$ built from i.i.d. Gaussian matrices $X_N^{(1)}, X_N^{(2)}$ and a deterministic matrix $A_N^{(1)}$ which is a realization of a G.U.E. matrix.
The matrix $M_N$ converges in $*$-distribution to the elliptic variable $\frac{1}{2}(c + s)$, where $c$ is a circular variable and $s$ a semicircular variable free from $c$. The empirical spectral measure of $M_N$ converges to the Brown measure of $\frac{1}{2}(c+s)$, which is the uniform law on the interior of the ellipse $\big\{\frac{3}{2\sqrt 2} \cos(\theta) + i \frac{1}{2\sqrt 2} \sin(\theta) : 0 \le \theta < 2\pi\big\}$ by [13]. The interior of this ellipse is also the spectrum of $\frac{1}{2}(c+s)$. Our theorem says that, outside this ellipse, the outliers of $M_N$ are close to the outliers of $P_2(0, A_N)$ (see Figure 2). Moreover, the outliers of $A_N^{(1)}$ are those of an additive perturbation of a G.U.E. matrix, and converge to $2.125$ and $-2.6$ by [29].
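The elliptic limit can be reproduced with an explicit model (our own realization of $\frac12(c+s)$, not necessarily the paper's $P_2$): take $X_N$ a complex Ginibre matrix and $W_N$ an independent G.U.E. matrix, so that $(X_N + W_N)/(2\sqrt N)$ converges in $*$-distribution to $\frac12(c+s)$.

```python
import numpy as np

# Illustration (our own realization, not necessarily the paper's P_2): the
# elliptic limit (c + s)/2 is realized by (X_N + W_N) / (2 sqrt(N)) with X_N
# complex Ginibre and W_N an independent G.U.E. matrix.
rng = np.random.default_rng(2)
N = 500
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
W = (H + H.conj().T) / 2                # G.U.E.: Hermitian, entries of variance 1
eigs = np.linalg.eigvals((X + W) / (2 * np.sqrt(N)))

# Semi-axes of the limiting ellipse: 3/(2 sqrt(2)) along R, 1/(2 sqrt(2)) along iR.
print(np.abs(eigs.real).max(), np.abs(eigs.imag).max())
```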

Example 3
We consider a matrix $M_N$ built from i.i.d. Gaussian matrices $X_N^{(1)}, X_N^{(2)}$ and a deterministic matrix whose empirical spectral distribution converges to $\frac{1}{2}(\delta_1 + \delta_{-1})$. The matrix $M_N$ converges in $*$-distribution to the random variable $c + a$, where $c$ is a circular variable and $a$ is a self-adjoint random variable, free from $c$, whose distribution is $\frac{1}{2}(\delta_1 + \delta_{-1})$. The empirical spectral measure of $M_N$ converges to the Brown measure of $c + a$, which is absolutely continuous and whose support is the region inside the lemniscate-like curve in the complex plane with equation $\{z \in \mathbb{C} : |z^2 - 1|^2 = |z|^2 + 1\}$ by [13]. This region is also the spectrum of $c + a$. Our theorem says that, outside this region, the outliers of $M_N$ are close to the outliers $2.5$ and $-1 + 2i$ of $M_N^{(0)}$ (see Figure 3).
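This lemniscate-shaped support can also be observed numerically. The sketch below (our own realization of $c + a$: a Ginibre matrix plus $\mathrm{diag}(\pm 1)$, not necessarily the paper's model) checks that the eigenvalues fall inside the curve $|z^2 - 1|^2 = |z|^2 + 1$:

```python
import numpy as np

# Illustration (our own realization): c + a with a of distribution
# (delta_1 + delta_{-1})/2 arises as the limit of X_N/sqrt(N) + A_N with
# A_N = diag(1,...,1,-1,...,-1).  The limiting support is bounded by the
# lemniscate-like curve |z^2 - 1|^2 = |z|^2 + 1.
rng = np.random.default_rng(3)
N = 500
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
A = np.diag(np.concatenate([np.ones(N // 2), -np.ones(N // 2)]))
eigs = np.linalg.eigvals(X / np.sqrt(N) + A)

# f < 0 strictly inside the curve; finite-N eigenvalues may slightly overshoot.
f = np.abs(eigs**2 - 1) ** 2 - (np.abs(eigs) ** 2 + 1)
print(f.max())
```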

Scalar-valued free probability theory
For the reader's convenience, we recall the following basic definitions from free probability theory. For a thorough introduction to free probability theory, we refer to [43].
• A $\mathcal{C}^*$-probability space is a pair $(\mathcal{A}, \varphi)$ consisting of a unital $\mathcal{C}^*$-algebra $\mathcal{A}$ and a state $\varphi$ on $\mathcal{A}$ (i.e. a linear map $\varphi : \mathcal{A} \to \mathbb{C}$ such that $\varphi(1_{\mathcal{A}}) = 1$ and $\varphi(aa^*) \ge 0$ for all $a \in \mathcal{A}$). $\varphi$ is a trace if it satisfies $\varphi(ab) = \varphi(ba)$ for every $(a,b) \in \mathcal{A}^2$. A trace is said to be faithful if $\varphi(aa^*) > 0$ whenever $a \ne 0$. An element of $\mathcal{A}$ is called a noncommutative random variable.
• The $*$-distribution of a family $a = (a_1, \dots, a_k)$ of noncommutative random variables in a $\mathcal{C}^*$-probability space $(\mathcal{A}, \varphi)$ is defined as the linear functional $\mu_a : P \mapsto \varphi(P(a, a^*))$ on the set of polynomials in $2k$ noncommutative indeterminates, where $(a, a^*)$ denotes the $2k$-tuple $(a_1, \dots, a_k, a_1^*, \dots, a_k^*)$. For any self-adjoint element $a_1$ in $\mathcal{A}$, there exists a probability measure $\nu_{a_1}$ on $\mathbb{R}$ such that, for every polynomial $P$, we have $\mu_{a_1}(P) = \int_{\mathbb{R}} P(t) \, d\nu_{a_1}(t)$.
Then, we identify $\mu_{a_1}$ and $\nu_{a_1}$. If $\varphi$ is faithful, then the support of $\nu_{a_1}$ is the spectrum of $a_1$ and thus $\|a_1\| = \sup\{|z| : z \in \mathrm{supp}(\nu_{a_1})\}$.
• A family of elements $(a_i)_{i \in I}$ in a $\mathcal{C}^*$-probability space $(\mathcal{A}, \varphi)$ is free if, for all $k \in \mathbb{N}$ and all polynomials $p_1, \dots, p_k$ in two noncommutative indeterminates, one has
$$\varphi\big(p_1(a_{i_1}, a_{i_1}^*) \cdots p_k(a_{i_k}, a_{i_k}^*)\big) = 0$$
whenever $i_1 \ne i_2, i_2 \ne i_3, \dots, i_{k-1} \ne i_k$ and $\varphi\big(p_j(a_{i_j}, a_{i_j}^*)\big) = 0$ for each $j = 1, \dots, k$.
• A noncommutative random variable $x$ is a standard semicircular variable if $x = x^*$ and, for any $k \in \mathbb{N}$, $\varphi(x^k) = \int t^k \, d\sigma(t)$, where $d\sigma(t) = \frac{1}{2\pi}\sqrt{4 - t^2}\, \mathbf{1}_{[-2,2]}(t)\, dt$ is the standard semicircular distribution.
• Let $k$ be a nonzero integer. Denote by $\mathcal{P}$ the set of polynomials in $2k$ noncommutative indeterminates. A sequence of families of variables $(a_n)_{n \ge 1} = (a_1(n), \dots, a_k(n))_{n \ge 1}$ in $\mathcal{C}^*$-probability spaces $(\mathcal{A}_n, \varphi_n)$ converges in $*$-distribution, when $n$ goes to infinity, if the map $P \in \mathcal{P} \mapsto \varphi_n(P(a_n, a_n^*))$ converges pointwise.
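As a sanity check of the definitions above, the even moments of the standard semicircular distribution are the Catalan numbers $1, 1, 2, 5, 14, \dots$ and the odd moments vanish; a short numerical verification (our own, with an arbitrary grid size):

```python
import numpy as np

# Numerical check of the semicircular moments: for the standard semicircular
# density sqrt(4 - t^2)/(2 pi) on [-2, 2], the even moments are the Catalan
# numbers 1, 1, 2, 5, ... and the odd moments vanish.
t = np.linspace(-2.0, 2.0, 200001)
dt = t[1] - t[0]
density = np.sqrt(np.clip(4.0 - t**2, 0.0, None)) / (2.0 * np.pi)

# Riemann sums (the density vanishes at the endpoints, so this is accurate).
moments = [float(np.sum(t**k * density) * dt) for k in range(7)]
print(np.round(moments, 4))
```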

Basic definitions
Operator-valued distributions and the operator-valued version of free probability were introduced by Voiculescu in [39] with the main purpose of studying freeness with amalgamation. An operator-valued noncommutative probability space is a triple $(M, E, B)$, where $M$ is a unital algebra, $B \subseteq M$ is a unital subalgebra, and $E : M \to B$ is a conditional expectation. Elements in $M$ are called free with amalgamation over $B$ if the algebras they generate together with $B$ are free with respect to $E$.
We will only need the more restrictive context in which $M$ is a finite von Neumann algebra which is a factor, $B$ is a finite-dimensional von Neumann subalgebra of $M$ (and hence isomorphic to an algebra of matrices), and $E$ is the unique trace-preserving conditional expectation from $M$ to $B$. The $B$-valued distribution of an element $X \in M$ w.r.t. $E$ is defined to be the family of multilinear maps
$$(b_1, \dots, b_{n-1}) \mapsto E[X b_1 X b_2 \cdots X b_{n-1} X], \qquad n \ge 2,$$
called the moments of $\mu_X$, with the convention that the first moment (corresponding to $n = 1$) is the element $E[X] \in B$, and the zeroth moment (corresponding to $n = 0$) is the unit $1$ of $B$ (or $M$). The distribution of $X$ is encoded conveniently by a noncommutative analytic transform defined for certain elements $b \in B$, which we agree to call the noncommutative Cauchy transform:
$$G_X(b) = E\big[(X - b)^{-1}\big], \qquad b \in H^+(B),$$
where $H^+(B) = \{b \in B : \Im b > 0\}$ and $\Im b = \frac{b - b^*}{2i}$ is the imaginary part of $b$. (This transform admits a fully matricial extension, which completely encodes $\mu_X$ -- see [42]; since we do not need this extension, we shall not discuss it any further, but refer the reader to [42, 40, 41, 30].) It follows quite easily that $G_X(H^+(B)) \subseteq H^+(B)$ -- see [41].
We warn the reader that we have changed conventions in our paper compared to [40, 41, 42]: our $G_X$ is the opposite of the Cauchy transform used there, so that $G_X$ maps $H^+(B)$ into itself. Among many other results proved in [39], one can find a central limit theorem for random variables which are free with amalgamation. The central limit distribution is called an operator-valued semicircular, by analogy with the free central limit theorem for the usual, scalar-valued random variables, whose limit is Wigner's semicircular distribution. It has been shown in [39] that an operator-valued semicircular distribution is entirely described by its operator-valued free cumulants: only the first and second cumulants of an operator-valued semicircular distribution may be nonzero (see also [34, 42]). For our purposes, we use the equivalent description of an operator-valued semicircular distribution via its noncommutative Cauchy transform, as in [23]: $S$ is a $B$-valued semicircular if and only if, for some $m_1 = m_1^* \in B$ and some completely positive map $\eta : B \to B$, the operator-valued variance,
$$G_S(b) = \big(m_1 - \eta(G_S(b)) - b\big)^{-1}, \qquad b \in H^+(B).$$
The above equation is obviously a generalization of the quadratic equation determining Wigner's semicircular distribution. Here $m_1$ is the -- classical -- first moment of $S$, and $\sigma^2$ its classical variance, which, as a linear completely positive map, is the multiplication by a positive constant. Unless otherwise specified, we shall from now on assume our semicirculars to be centered, i.e. $m_1 = 0$.
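The fixed-point characterization of the operator-valued semicircular distribution can be solved numerically by simple iteration. The sketch below is our own illustration, written in the standard convention $G_S(b) = E[(b - S)^{-1}]$ (which differs by a sign from the convention chosen in this paper), for which the equation of a centered semicircular reads $G = (b - \eta(G))^{-1}$; the map $\eta(G) = A G A^*$ and the point $b = 2i I_2$ are arbitrary choices.

```python
import numpy as np

# Minimal sketch (standard convention G(b) = E[(b - S)^{-1}], a sign change
# from the paper's convention): the Cauchy transform of a centered B-valued
# semicircular with variance eta solves G = (b - eta(G))^{-1}, and the fixed
# point can be reached by iteration for b in the upper half-plane.  The map
# eta below is an illustrative choice.
def cauchy_semicircular(b, eta, n_iter=2000):
    G = np.linalg.inv(b)                 # initial guess ~ b^{-1}
    for _ in range(n_iter):
        G = np.linalg.inv(b - eta(G))
    return G

A = np.diag([1.0, 0.5])                  # eta(G) = A G A* is completely positive
eta = lambda G: A @ G @ A.conj().T

b = 2j * np.eye(2)
G = cauchy_semicircular(b, eta)
# Here the equation decouples into two scalar semicircular transforms, of
# variance 1 and 1/4, evaluated at z = 2i.
print(np.round(G, 6))
```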
Note that we do allow our scalars to be zero. This is a particular case of a result from [31], and its proof can be found in great detail in [26].
An important fact about semicircular elements, both scalar- and operator-valued, is that the sum of two free semicircular elements is again a semicircular element (this follows from the fact that a semicircular is defined by having all its cumulants beyond the first two equal to zero -- see [39]). In particular, if $s_{1,1}, s_{1,2}, \tilde s_{1,2}, s_{2,2}$ are centered semicircular variables of variance one which are free from each other, then the selfadjoint matrix
$$\begin{pmatrix} s_{1,1} & \frac{s_{1,2} + i \tilde s_{1,2}}{\sqrt 2} \\ \frac{s_{1,2} - i \tilde s_{1,2}}{\sqrt 2} & s_{2,2} \end{pmatrix}$$
is an $M_2(\mathbb{C})$-valued semicircular element; moreover, replacing the entries $\frac{s_{1,2} \pm i \tilde s_{1,2}}{\sqrt 2}$ by another pair built in the same way from standard semicirculars free from each other yields precisely the same law. Thus, the following definition, due to Voiculescu, is natural.

Preliminary results
We first establish preliminary results in free probability theory that we will need in the following sections.
Proof. Let us prove that $s_1, \dots, s_u$ is free from $M_2(B)$ with respect to $\mathrm{tr}_2 \otimes \varphi$. We already know (see [26, Chapter 9]) that $s_1, \dots, s_u$ are semicircular variables over $M_2(\mathbb{C})$ which are free from $M_2(B)$ with respect to $\mathrm{id}_2 \otimes \varphi$; denote their covariance mapping by $(\eta_{i,j})_{i,j}$. Using [27, Theorem 3.5], the freeness of $s_1, \dots, s_u$ from $M_2(B)$ over $M_2(\mathbb{C})$ gives us the free cumulants of $s_1, \dots, s_u$ over $M_2(B)$. More concretely, we get that $s_1, \dots, s_u$ are semicircular variables over $M_2(B)$, with a covariance mapping extending $(\eta_{i,j})_{i,j}$. Because of the previous computation, this covariance mapping is compatible with $\mathrm{tr}_2 \otimes \varphi$. As a consequence, using again [27, Theorem 3.5], $s_1, \dots, s_u$ are semicircular variables over $\mathbb{C}$, free from $M_2(B)$ with respect to $\mathrm{tr}_2 \otimes \varphi$, and the covariance mapping $\eta^{\mathbb{C}}_{i,j}$ is given by the restriction of the covariance mapping above.
Lemma 3.4. For any scalar matrices $\zeta_1, \dots, \zeta_u \in M_m(\mathbb{C})$, $\big|\sum_{j=1}^u \zeta_j \otimes c^{(j)} + y\big|^2$ has the same distribution as $\big|\sum_{j=1}^u \zeta_j \otimes s_j + (I_m \otimes \varepsilon) \cdot y\big|^2$, where $\varepsilon$ is a selfadjoint $\{-1,+1\}$-Bernoulli variable in $\mathcal{A}$, independent from the entries of $y$, and $s_1, \dots, s_u$ are free semicircular variables in $\mathcal{A}$, free from $\varepsilon$ and the entries of $y$.
In the lemma above, we consider the symmetrized version $(I_m \otimes \varepsilon) \cdot y$ of $y$, thanks to a noncommutative random variable $\varepsilon$ which is tensor-independent from the entries of $y$, in the sense that $\varepsilon$ commutes with the entries of $y$ and mixed moments in $\varepsilon$ and the entries of $y$ factorize under $\varphi$.
Proof. Let $n \ge 0$. We compute the $n$-th moment of $\big|\sum_{j=1}^u \zeta_j \otimes c^{(j)} + y\big|^2$ with respect to $\mathrm{id}_m \otimes \varphi$, and compare it to the $n$-th moment of $\big|\sum_{j=1}^u \zeta_j \otimes s_j + (I_m \otimes \varepsilon) \cdot y\big|^2$ with respect to $\mathrm{id}_m \otimes \varphi$.
Let us set $a_0 = y$ and $a_j = \zeta_j \otimes c^{(j)}$, $j = 1, \dots, u$. We compute
$$(\mathrm{id}_m \otimes \varphi)\Big[\Big|\sum_{j=0}^u a_j\Big|^{2n}\Big] = \sum_{0 \le i_1, \dots, i_{2n} \le u} (\mathrm{id}_m \otimes \varphi)\big[a_{i_1} a_{i_2}^* \cdots a_{i_{2n-1}} a_{i_{2n}}^*\big],$$
and similarly with $b_0 = (I_m \otimes \varepsilon) \cdot y$ and $b_j = \zeta_j \otimes s_j$ in place of the $a_j$'s. In order to conclude, it suffices to prove that, for all $0 \le i_1, \dots, i_{2n} \le u$,
$$(\mathrm{id}_m \otimes \varphi)\big[a_{i_1} a_{i_2}^* \cdots a_{i_{2n}}^*\big] = (\mathrm{id}_m \otimes \varphi)\big[b_{i_1} b_{i_2}^* \cdots b_{i_{2n}}^*\big].$$
Let us fix $0 \le i_1, \dots, i_{2n} \le u$. Note that $a_0$ is free over $M_m(\mathbb{C})$ from the $a_j$'s with respect to $\mathrm{id}_m \otimes \varphi$ (see [26, Chapter 9]). Let us set $S = \{j : i_j \ne 0\} \subseteq \{1, \dots, 2n\}$ and use the moment-cumulant formula (see [34, page 36]), summing over noncrossing partitions $\pi$ of $S$, where $\pi^c$ is the largest partition of $S^c$ such that $\pi \cup \pi^c$ is noncrossing, and $\hat c$ and $\hat\varphi$ are the $M_m(\mathbb{C})$-valued cumulant function and the $M_m(\mathbb{C})$-valued moment function associated to the conditional expectation $\mathrm{id}_m \otimes \varphi$. We use here the notation of [34, Notation 2.1.4], which defines $(\hat c \cup \hat\varphi)(\pi \cup \pi^c)$ as the $M_m(\mathbb{C})$-valued multiplicative function that acts on the blocks of $\pi$ like $\hat c$ and on the blocks of $\pi^c$ like $\hat\varphi$.
Recall that the cumulants of the $\zeta_j \otimes c^{(j)}$'s vanish if $\pi$ is not a pairing or if $\pi$ is not alternating (which means that $\pi$ links two indices with the same parity). Now, let us remark that if $\pi$ is a pairing which is alternating, then $\pi^c$ is even (each block of $\pi^c$ has even cardinality). Thus, only alternating pairings $\pi$ with even $\pi^c$ contribute. Similarly, the cumulants of the $\zeta_j \otimes s_j$'s vanish if $\pi$ is not a pairing, and the moment of $b_0$ vanishes if $\pi^c$ is odd. Moreover, if $\pi$ is a pairing and $\pi^c$ is even, then $\pi$ is alternating. As a consequence, the same pairs $(\pi, \pi^c)$ contribute to both sums. In order to conclude, it suffices to remark that $y$ and $(I_m \otimes \varepsilon) \cdot y$ have the same even $M_m(\mathbb{C})$-valued moments, and that $\zeta_j \otimes c^{(j)}$ and $\zeta_j \otimes s_j$ have the same alternating $M_m(\mathbb{C})$-valued cumulants.
It follows from [4] that the complement of the support in $M_m^{sa}(\mathbb{C})$ of the sum of a semicircular $s$ of variance $\eta$ and a selfadjoint noncommutative random variable $y \in (M_m(\mathcal{A}), \mathrm{id}_m \otimes \varphi)$, free with amalgamation over $M_m(\mathbb{C})$ from $s$, is described in terms of $y$ and the functions
$$H(w) = w - \eta(G_y(w)) \quad \text{and} \quad \omega(b) = b + \eta\big(G_y(\omega(b))\big).$$
It follows quite easily that $\mathrm{spect}\big(\eta \circ G_y'(\omega(b))\big) \subset \overline{\mathbb{D}}$. Generally, all conditions on the derivatives of $\omega$ and $H$ follow from the two functional equations above.
Proof. Assume that $y - w$ is invertible and $\mathrm{spect}(\eta \circ G_y'(w)) \subseteq \overline{\mathbb{D}} \setminus \{1\}$. Since $w = w^*$, the derivative $G_y'(w)$ is completely positive, so $\eta \circ G_y'(w)$ is completely positive. This means, according to [18, Theorem 2.5], that the spectral radius $r$ of $\eta \circ G_y'(w)$ is attained at a positive element $\xi \in M_m(\mathbb{C})$, so that necessarily $r \ge 0$. Since $1 \notin \mathrm{spect}(\eta \circ G_y'(w))$ by hypothesis, it follows that $r < 1$, and thus $\mathrm{spect}(\eta \circ G_y'(w)) \subseteq r\overline{\mathbb{D}} \subsetneq \mathbb{D}$.
This forces the derivative of $H$ at $w$, $H'(w) = \mathrm{id}_m - \eta \circ G_y'(w)$, to be invertible as a linear operator from $M_m(\mathbb{C})$ to itself. By the inverse function theorem, $H$ has an analytic inverse from a small enough neighborhood of $H(w)$ onto a neighborhood of $w$. Since $H$ preserves the selfadjoints near $w$, so must the inverse. On the other hand, the map $v \mapsto H(w) + \eta(G_y(v))$ sends the upper half-plane into itself and has $w$ as a fixed point. Since its derivative has all its eigenvalues strictly inside $\mathbb{D}$ (recall that the spectral radius $r < 1$), it follows that $w$ is actually an attracting fixed point for this map. Since for any $b$ in the upper half-plane, $\omega(b)$ is given as the attracting fixed point of $v \mapsto b + \eta(G_y(v))$, it follows that $\omega$ coincides with the local inverse of $H$ on the upper half-plane, so the local inverse of $H$ is the unique analytic continuation of $\omega$ to a neighborhood of $H(w)$. This proves that $\omega$ extends analytically to a neighborhood of $H(w)$ and the extension maps selfadjoints from this neighborhood to $M_m^{sa}(\mathbb{C})$. In particular, $\omega(H(v)) = v$ and $G_{s+y}(H(v)) = G_y(v)$ for $v$ in a neighborhood of $w$. Conversely, say $b = b^*$ and $s + y - b$ is invertible. Then $G_{s+y}$ is analytic on a neighborhood of $b$ and maps selfadjoints from this neighborhood into $M_m^{sa}(\mathbb{C})$. Since $\omega(b) = b + \eta(G_{s+y}(b))$, the same holds for $\omega$. Since, by [4, Proposition 4.1], $\mathrm{spect}(\omega'(v)) \subset \{\Re z > 1/2\}$ for any $v$ in the upper half-plane, the analyticity of $\omega$ around $b = b^*$ implies $\mathrm{spect}(\omega'(b)) \subset \{\Re z \ge 1/2\}$. Thus, $\omega$ is invertible with respect to composition around $b$ by the inverse function theorem. As argued above, $H$ is its inverse, and extends analytically to a small enough neighborhood of $\omega(b)$, with selfadjoint values on the selfadjoints. Composing with $H$ on the left in Voiculescu's subordination relation $G_{s+y}(v) = G_y(\omega(v))$ yields $G_{y+s}(H(w)) = G_y(w)$, guaranteeing that $G_y$ is analytic on a neighborhood of $\omega(b)$, with selfadjoint values on the selfadjoints, and so $y - \omega(b)$ must be invertible. The following lemma is a particular case of the above proposition. Proof.
Note that our hypotheses that all entries of the selfadjoint $y$ are symmetric and that $s$ is centered automatically imply that $H(i M_m(\mathbb{C})^+) \subseteq i M_m(\mathbb{C})^+$. Assume that $y$ is invertible and $\mathrm{spect}(\eta \circ G_y'(0)) \subseteq \overline{\mathbb{D}} \setminus \{1\}$. In particular, $G_y$ is analytic on a neighborhood of zero in $M_m(\mathbb{C})$. Proposition 3.5 implies that $s + y - H(0)$ is invertible. Since $H(i M_m(\mathbb{C})^+) \subseteq i M_m(\mathbb{C})^+$, it follows from the formula of $H$ that $H(0) = 0$. Thus, $s + y$ is invertible.
Conversely, assume that $s + y$ is invertible, so that $G_{s+y}$ extends analytically to a small neighborhood of zero in such a way that it maps selfadjoints to selfadjoints. Since $\omega(b) = b + \eta(G_{s+y}(b))$, it follows that $\omega$ does the same. According to Proposition 3.5, $y - \omega(0)$ is invertible. Since $\omega(i M_m(\mathbb{C})^+) \subseteq i M_m(\mathbb{C})^+$, we again have that $\omega(0) = 0$, so that $y$ is invertible.
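The subordination machinery used in the proofs above can be illustrated numerically in the scalar case $m = 1$. The sketch below (our own illustration, written in the standard convention $g(z) = \int (z - t)^{-1} \, d\mu(t)$, for which the paper's fixed-point equation becomes $\omega(z) = z - \eta \, g_y(\omega(z))$) computes $g_{s+y}$ for $y$ of distribution $\frac12(\delta_1 + \delta_{-1})$ and compares it with a matrix simulation; all sizes are arbitrary choices.

```python
import numpy as np

# Sketch of the subordination relation g_{s+y}(z) = g_y(omega(z)) for y of
# distribution (delta_1 + delta_{-1})/2 and s a standard (eta = 1)
# semicircular, free from y.  Standard convention g(z) = int 1/(z-t) dmu(t);
# with this sign, the paper's fixed point reads omega(z) = z - g_y(omega(z)).
g_y = lambda v: 0.5 * (1.0 / (v - 1.0) + 1.0 / (v + 1.0))   # = v / (v^2 - 1)

def omega(z, n_iter=5000):
    w = z
    for _ in range(n_iter):
        w = z - g_y(w)          # attracting fixed point in the upper half-plane
    return w

z = 2j
g_sum = g_y(omega(z))           # Cauchy transform of s + y at z

# Monte Carlo check: eigenvalues of (G.U.E. / sqrt(N)) + diag(+-1).
rng = np.random.default_rng(4)
N = 1000
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
W = (H + H.conj().T) / 2        # G.U.E., W/sqrt(N) -> s
Y = np.diag(np.concatenate([np.ones(N // 2), -np.ones(N // 2)]))
lam = np.linalg.eigvalsh(W / np.sqrt(N) + Y)
g_emp = np.mean(1.0 / (z - lam))

print(g_sum, g_emp)             # the two values should be close
```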
A powerful tool to deal with noncommutative polynomials in random matrices or in operators is the so-called "linearization trick." Its origins can be found in the theory of automata and formal languages (see, for instance, [32]), where it was used to conveniently represent certain categories of formal power series. In the context of operator algebras and random matrices, this procedure goes back to Haagerup and Thorbjørnsen [21,22] (see [26]). We use the version from [1, Proposition 3], which has several advantages for our purposes, to be described below.
We denote by $\mathbb{C}\langle X_1, \dots, X_k \rangle$ the complex $*$-algebra of polynomials in $k$ noncommuting indeterminates $X_1, \dots, X_k$. The adjoint operation is given by the anti-linear extension of $(X_{i_1} X_{i_2} \cdots X_{i_l})^* = X_{i_l}^* \cdots X_{i_2}^* X_{i_1}^*$, $(i_1, \dots, i_l) \in \{1, \dots, k\}^l$, $l \in \mathbb{N} \setminus \{0\}$. We will sometimes assume that some, or all, of the indeterminates are selfadjoint, i.e. $X_j^* = X_j$. Unless we make this assumption explicitly, the adjoints $X_1^*, \dots, X_k^*$ are assumed to be algebraically free from each other and from $X_1, \dots, X_k$. Given a polynomial $P \in \mathbb{C}\langle X_1, \dots, X_k \rangle$, we call linearization of $P$ any $L_P \in M_m(\mathbb{C}) \otimes \mathbb{C}\langle X_1, \dots, X_k \rangle$ such that
1. $m \ge 1$ and $L_P$ has the block form $L_P = \begin{pmatrix} 0 & u^* \\ v & Q \end{pmatrix}$,
2. $Q \in M_{m-1}(\mathbb{C}) \otimes \mathbb{C}\langle X_1, \dots, X_k \rangle$ is invertible,
3. $u^*$ is a row vector and $v$ is a column vector, both of length $m - 1$, with entries in $\mathbb{C}\langle X_1, \dots, X_k \rangle$,
4. the polynomial entries in $Q$, $u$ and $v$ all have degree $\le 1$, and
5. $P = -u^* Q^{-1} v$.
We refer to Anderson's paper [1] for the -- constructive -- proof of the existence of a linearization $L_P$ as described above for any given polynomial $P \in \mathbb{C}\langle X_1, \dots, X_k \rangle$. It turns out that if $P$ is selfadjoint, then $L_P$ can be chosen to be selfadjoint. The well-known result about Schur complements then yields the following invertibility equivalence.

Lemma 4.1.
[26, Chapter 10, Corollary 3] Let $P \in \mathbb{C}\langle X_1, \dots, X_k \rangle$ and let $L_P \in M_m(\mathbb{C}\langle X_1, \dots, X_k \rangle)$ be a linearization of $P$ with the properties outlined above. Let $e_{11}$ be the $m \times m$ matrix whose single nonzero entry equals one and occurs in row 1 and column 1. Let $y = (y_1, \dots, y_k)$ be a $k$-tuple of operators in a unital $\mathcal{C}^*$-algebra $\mathcal{A}$. Then, for any $z \in \mathbb{C}$, the operator $z 1_{\mathcal{A}} - P(y)$ is invertible if and only if $z e_{11} \otimes 1_{\mathcal{A}} - L_P(y)$ is invertible, and in this case
$$\big[(z e_{11} \otimes 1_{\mathcal{A}} - L_P(y))^{-1}\big]_{1,1} = (z 1_{\mathcal{A}} - P(y))^{-1}.$$
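Lemma 4.1 can be checked numerically on the simplest nontrivial monomial. The sketch below (our own illustration; the linearization of $P = X_1 X_2$ shown here is a standard one, not the specific construction of Section 5.3) verifies the $(1,1)$-block formula for random matrix inputs.

```python
import numpy as np

# Check of the linearization lemma for P = X1 X2, with the illustrative
# linearization L_P = [[0, X1], [X2, -1]]: here u* = X1, Q = -1, v = X2, so
# P = -u* Q^{-1} v = X1 X2.  For matrix inputs, z e_11 (x) I - L_P(X1, X2)
# is invertible iff z I - X1 X2 is, and the (1,1) block of its inverse
# equals (z I - X1 X2)^{-1}.
rng = np.random.default_rng(5)
n, z = 4, 5.0
X1 = rng.standard_normal((n, n)); X1 /= np.linalg.norm(X1, 2)   # spectral norm 1
X2 = rng.standard_normal((n, n)); X2 /= np.linalg.norm(X2, 2)
I = np.eye(n)

# z e_11 (x) I - L_P(X1, X2), assembled as a 2x2 block matrix; note -(-1) = +I
# in the lower-right block.
M = np.block([[z * I, -X1],
              [-X2,   I]])
block_11 = np.linalg.inv(M)[:n, :n]

print(np.allclose(block_11, np.linalg.inv(z * I - X1 @ X2)))
```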

Lemma 4.2.
Let $P \in \mathbb{C}\langle X_1, \dots, X_k \rangle$ and let $L_P \in M_m(\mathbb{C}\langle X_1, \dots, X_k \rangle)$ be a linearization of $P$ with the properties outlined above. There exist two polynomials $T_1$ and $T_2$ in $k$ commutative indeterminates, with nonnegative coefficients, depending only on $L_P$, such that, for any $k$-tuple $y = (y_1, \dots, y_k)$ of operators in a unital $\mathcal{C}^*$-algebra $\mathcal{A}$ and for any $z \in \mathbb{C}$ such that $z 1_{\mathcal{A}} - P(y)$ is invertible,
$$\big\|(z e_{11} \otimes 1_{\mathcal{A}} - L_P(y))^{-1}\big\| \le T_1(\|y_1\|, \dots, \|y_k\|) \times \Big( \big\|(z 1_{\mathcal{A}} - P(y))^{-1}\big\| + T_2(\|y_1\|, \dots, \|y_k\|) \Big).$$
Proof. The proof is similar to the proof of [5, Lemma 4.4]. The linearization of $P$ can be written as $L_P = \begin{pmatrix} 0 & u^* \\ v & Q \end{pmatrix}$. Now, a matrix calculation in which we suppress the variable $y$ shows that
$$(z e_{11} - L_P)^{-1} = \begin{pmatrix} (z - P)^{-1} & -(z-P)^{-1} u^* Q^{-1} \\ -Q^{-1} v (z-P)^{-1} & -Q^{-1} + Q^{-1} v (z-P)^{-1} u^* Q^{-1} \end{pmatrix}.$$
Since $v$, $u^*$, and $Q^{-1}$ are polynomials in $y_1, \dots, y_k$, the result readily follows.
In Section 5.3, we will provide an explicit construction of a linearization that is best adapted to our purposes. In this construction, it is clear that we can always find a linearization such that, for any $k$-tuple $y$ of matrices, $\det Q(y) = \pm 1$. Then, for any $z$ in $\Gamma$, there exists $\gamma_z > 0$ such that almost surely, for all large $N$, $s_N(M_N - z I_N) \ge \gamma_z$ (here $s_N$ denotes the smallest singular value). Consequently, there exists $\gamma_\Gamma > 0$ such that almost surely, for all large $N$, $\inf_{z \in \Gamma} s_N(M_N - z I_N) \ge \gamma_\Gamma$.

Ideas of the proof
The proof of Proposition 5.1 is based on the two following key results.

Proposition 5.2.
Assume that (X1) holds. Let $K$ be a polynomial in $u + t$ noncommutative variables. Define $K_N = K(Y_N, A_N)$ and let $\{a_N^{(j)}, j = 1, \dots, t\}$ be a set of noncommutative random variables in $(\mathcal{A}, \varphi)$ which is free from a free circular system $c = (c^{(1)}, \dots, c^{(u)})$ in $(\mathcal{A}, \varphi)$ and such that the $*$-distribution of $(a_N^{(1)}, \dots, a_N^{(t)})$ with respect to $\varphi$ coincides with that of $(A_N^{(1)}, \dots, A_N^{(t)})$ with respect to $\mathrm{tr}_N$. Let $\tau_N$ denote the distribution of $K(c, a_N) K(c, a_N)^*$ with respect to $\varphi$. If the interval $[x, y]$, $x < y$, is such that there exists $\delta > 0$ such that, for all large $N$, $(x - \delta, y + \delta) \subset \mathbb{R} \setminus \mathrm{supp}(\tau_N)$, then
$$\mathbb{P}\big[\text{for all large } N, \ \mathrm{spect}(K_N K_N^*) \subset \mathbb{R} \setminus [x, y]\big] = 1.$$
• Assume that (A1) holds. Then, almost surely, the sequence of $(u+t)$-tuples $(Y_N, A_N)$ converges in $*$-distribution towards $(c, a)$, where $c = (c^{(1)}, \dots, c^{(u)})$ is a free circular system which is free from $a = (a^{(1)}, \dots, a^{(t)})$ in $(\mathcal{A}, \varphi)$.
Proposition 5.3. Consider a polynomial $P(Y_1, Y_2)$, where $Y_1$ is a tuple of noncommuting nonselfadjoint indeterminates, $Y_2$ is a tuple of selfadjoint indeterminates, and no selfadjointness is assumed for $P$. We evaluate $P$ in $(c, a)$ and $(c, a_N)$, where $c$ is a tuple of free circular variables which is $*$-free from the tuples $a$ and $a_N$. We assume that $a_N \to a$ in moments and that there exists a number $\tau > 0$ such that $\sup_N \|a_N\| \le \tau$.

We assume that there exists
Then, there exists z0 > 0 for which there exists an

Proof of Proposition 5.2
Note that the spectrum of $K_N K_N^*$ coincides with the spectrum of its hermitization. Write $K$ as a linear combination of monomials, where the $m_i$'s are monomials and the $b_i$'s are complex numbers, and define $Q_1$ and $Q_2$ accordingly, where the $W^{(i)}$'s, $i = 1, \dots, u$, are $2N \times 2N$ independent so-called Wigner matrices satisfying the assumptions of [6]. Now, note that, as noticed in [7], the corresponding identity holds for any monomial $x_1 \cdots x_k$; indeed, this can be proved by induction. Now, define, for $j = 1, \dots, t$, the matrices $a_N^{(j)}$. It readily follows that the spectrum of $K_N K_N^*$ coincides with the spectrum of $\tilde K_N \tilde K_N^*$, built from $Q_1, Q_2, R, R^*$ and the $A_N^{(j)}$'s. Now, it is straightforward to see that the $*$-distribution of $(q_1, q_2, r, a_N^{(j)})$ with respect to $\mathrm{tr}_{2N}$ is the required one. Moreover, by Lemma 3.3, it turns out that the $s_i$'s are free semicircular variables which are free from $(q_1, q_2, r, a_N^{(j)})$, and the first assertion follows by applying the results above to $(c, a)$.
The second assertion of Proposition 5.2 follows.

Proof of Proposition 5.3
We prove this using linearization and hermitization. Our linearization of a nonselfadjoint polynomial will naturally not be selfadjoint, so the results from [5] do not apply directly to it, but some of the methods will. Before we analyze this linearization, let us lay down the steps that we shall take in order to prove the above result. Let L be our linearization of P (Y 1 , Y 2 ) − z 0 ; in this section we use a slight modification of the linearization introduced in [1] -see below.
In this decomposition, $C$ is a selfadjoint matrix containing only circular variables and their adjoints.
It will be clear that $C$ contains at most one nonzero element per row/column, except possibly for the first row/column.

We use Lemma 3.4 to conclude that the lhs of the previous item is invertible if and only if
is, where $S$ is obtained from $C$ by replacing each circular entry with a semicircular element from the same algebra (and hence free from $a_N$), and $\varepsilon$ is a $\{-1, 1\}$-Bernoulli distributed random variable which is independent from $a_N$ and free from $S$. As noted in Example 3.1, since $C = C^*$, $S$ is indeed a matrix-valued semicircular random variable.
5. We apply Lemma 3.7 to the above item in order to determine under what conditions the sum in question has a spectrum uniformly bounded away from zero.
6. Finally, we use the convergence in moments of $a_N$ to $a$ in order to conclude that the conditions obtained in the previous item are satisfied for all large $N$.
Since our variables live in a II$_1$-factor, the two nonzero entries of the right-hand side have the same spectrum. Part (2) requires a careful analysis of the linearization we use. The construction from [1] proceeds by induction on the number of monomials in the given polynomial. If $P = X_{i_1} X_{i_2} X_{i_3} \cdots X_{i_{\ell-1}} X_{i_\ell}$, where $\ell \ge 2$ and $i_1, \dots, i_\ell \in \{1, \dots, k\}$, we set $n = \ell$. However, unlike in [1, 5], we choose here $L$ to be the linearization obtained by applying the procedure from [1] to $P = 1 X_{i_1} X_{i_2} X_{i_3} \cdots X_{i_{\ell-1}} X_{i_\ell} 1$. If $\ell = 1$, we simply complete $X$ to $1X1$. Even if we have a multiple of $1$, we choose here to proceed the same way. The lower right $(\ell + 1) \times (\ell + 1)$ corner of this matrix has an inverse of degree $\ell$ in the algebra $M_{\ell+1}(\mathbb{C}\langle X_1, \dots, X_k \rangle)$. (The constant term in this inverse is a selfadjoint matrix and its spectrum is contained in $\{-1, 1\}$.) The first row contains only zeros and ones, and the first column is the transpose of the first row. Suppose now that $P = P_1 + P_2$, where $P_1, P_2 \in \mathbb{C}\langle X_1, \dots, X_k \rangle$, and that linear polynomials of sizes $n_1$ and $n_2$ linearize $P_1$ and $P_2$. Then we set $n = n_1 + n_2 - 1$ and observe that the resulting block matrix is a linearization of $P_1 + P_2$. $L$ is built so that $\big[(z e_{1,1} - L)^{-1}\big]_{1,1} = (z - P)^{-1}$, $z - P$ is invertible if and only if $z e_{1,1} - L$ is invertible, and each row/column of the matrix $L$, except possibly for the first, contains at most one nonzero indeterminate (i.e. nonscalar) entry. By applying the linearization process to $1 X_{i_1} X_{i_2} X_{i_3} \cdots X_{i_{\ell-1}} X_{i_\ell} 1$ instead of $X_{i_1} X_{i_2} X_{i_3} \cdots X_{i_{\ell-1}} X_{i_\ell}$, we have ensured that there is at most one nonzero indeterminate in each row/column. An important side benefit is that, with this modification, we may assume that, with the notations from item 5 of Section 4, $v = u$, and all entries of this vector are either 0 or 1.
While this linearization is far from minimal, and should not be used for practical computations, it simplifies to some extent the notation and arguments of our proofs.
In our arguments below we use several times the following equivalences regarding inequalities involving operators and their norms. Let $A$ be a bounded linear operator on a Hilbert space and let $1$ denote the identity operator on the same Hilbert space. Then $\|A\|^2 = \|A^*\|^2 = \|A^*A\| = \|AA^*\| = \||A|^2\|$ and $A^*A \le \|A\|^2 \cdot 1$.
In particular, if $A = A^*$, then $-\|A\|\cdot 1 \le A \le \|A\|\cdot 1$. If $A > 0$ (that is, $A = A^*$ and the spectrum of $A$ is included in $(0,+\infty)$), then $A \ge \frac{1}{\|A^{-1}\|}\cdot 1$. All these relations follow from functional calculus in $C^*$-algebras and the definition of positivity for operators (as the reader has by now noticed, we use the same symbol $\le$ to denote inequalities between real numbers and inequalities between operators). In the sequel, we will sometimes suppress the identity in our notations and write, for instance, $A \le \|A\|$ instead of $A \le \|A\|\cdot 1$.
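For the reader's convenience, here is how the last inequality follows from functional calculus (a routine verification, included only as a sketch):

```latex
% If A > 0, its spectrum lies in some [\alpha, \beta] \subset (0, +\infty), and
\|A^{-1}\| = \sup_{t \in \mathrm{spect}(A)} t^{-1} = \alpha^{-1}.
% Since A - \alpha \cdot 1 \ge 0 by the spectral theorem,
A \;\ge\; \alpha \cdot 1 \;=\; \frac{1}{\|A^{-1}\|} \cdot 1 .
```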
The concrete expression of the inverse of $ze_{1,1} - L$ in terms of the decomposition $L = \begin{pmatrix} 0 & u^* \\ u & Q \end{pmatrix}$ is provided by the Schur complement formula. It follows easily from this formula that $z - P$ is invertible if and only if $ze_{1,1} - L$ is invertible. It was established in [5, Lemma 4.1] that $Q$, and hence $Q^{-1}$, is of the form $T(1+N)$ for some scalar permutation matrix $T$ and nilpotent matrix $N$, which may contain nonscalar entries. Let us establish a version of [5, Lemma 4.3] suitable for our purposes.
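The displayed formula is lost in this version of the text; a standard form of the Schur complement inversion, written with the sign convention $P = u^*Q^{-1}u$ forced by the identity $(ze_{1,1}-L)^{-1}_{1,1} = (z-P)^{-1}$, would read:

```latex
z e_{1,1} - L = \begin{pmatrix} z & -u^* \\ -u & -Q \end{pmatrix},
\qquad
(z e_{1,1} - L)^{-1} =
\begin{pmatrix}
(z-P)^{-1} & -(z-P)^{-1} u^* Q^{-1} \\
- Q^{-1} u \,(z-P)^{-1} & \; -Q^{-1} + Q^{-1} u \,(z-P)^{-1} u^* Q^{-1}
\end{pmatrix}.
```

In particular, the $(1,1)$ entry exists precisely when $z - P$ is invertible, which is the equivalence used in the text.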
Lemma 5.6. Let $P \in \mathbb{C}\langle Y_1, Y_2\rangle$ be an arbitrary polynomial in the non-selfadjoint indeterminates $Y_1$ and the selfadjoint indeterminates $Y_2$, and let $L$ be a linearization of $P$ constructed as above. Given tuples of noncommutative random variables $c$ and $a$, for all $\delta > 0$ such that $|P(c,a)|^2 > \delta$, there exists $e > 0$ such that $|L(c,a)|^2 > e$, and the number $e$ depends only on $\delta$, $P$, and the supremum of the norms of $c$ and $a$. Conversely, for all $e > 0$ such that $|L(c,a)|^2 > e$, there exists $q > 0$ such that $|P(c,a)|^2 > q > 0$, and $q$ depends only on $e$, $P$, and the supremum of the norms of $c$ and $a$.
Proof. With the decomposition $L = \begin{pmatrix} 0 & u^* \\ u & Q \end{pmatrix}$, we have $|L|^2 = \begin{pmatrix} u^*u & u^*Q^* \\ Qu & uu^* + QQ^* \end{pmatrix}$. Recall that $|P|^2 = u^*Q^{-1}u\,u^*(Q^{-1})^*u$. Now consider these expressions evaluated in the tuples of operators mentioned in the statement of the lemma; in order to save space, we will nevertheless suppress them from the notation. We assume that $|P|^2 > \delta$. Strangely enough, it will be more convenient to estimate an upper bound for $|L|^{-2}$ rather than a lower bound for $|L|^2$. The entries of $|L|^{-2}$ expressed in terms of the above decomposition are given by the Schur complement formula; in particular, $(|L|^{-2})_{1,1} = \big(u^*u - u^*Q^*(uu^* + QQ^*)^{-1}Qu\big)^{-1}$. We only need to estimate the norms of these elements in terms of $\delta$, $P$, and the norms of the variables in which we have evaluated them. It is clear how to bound the factors involving $u$ alone; similarly, $(uu^* + QQ^*)^{-1} \le (QQ^*)^{-1} \le \|Q^{-1}\|^2$. We obtain this way majorizations for each of the entries, which will allow us to estimate $e$ (these majorizations
are not optimal, but close to being so). We shall not be much more explicit than this, but let us nevertheless comment on why the above satisfies the corresponding conclusion of our lemma. As noted before, $u$ is a vector of zeros and ones. It follows immediately from the construction of $L$ that the number of ones is dominated by the number of monomials of $P$, a quantity clearly depending only on $P$. Recall that $Q$ is of the form $T(1+N)$, with $T$ a permutation matrix and $N$ a nilpotent matrix. The norm of $T$ is necessarily one. The nilpotent matrix corresponding to $Q$ is simply a block upper diagonal matrix (i.e. a matrix which has on its diagonal a succession of blocks, each block being itself an upper diagonal matrix) with entries which are operators from the tuples $a$ and $c$ in which we evaluate $P$ (and $L$). Its norm is trivially bounded by the supremum of all the norms of the operators involved times the supremum of all the scalar coefficients.
Writing $Q^{-1} = (1+N)^{-1}T^{-1}$ and expanding the finite Neumann series of $(1+N)^{-1}$, where $m$ is the size of the linearization, we obtain an estimate for $\|Q^{-1}\|$ from above by $(m+1)(1+\|Q\|)^m$. Finally, $\||P|^{-2}\| \le \delta^{-1}$. This guarantees that $\||L|^{-2}\|$ is bounded from above, so that $|L|^2$ is bounded from below, by a number $e$ depending only on $\delta$, $P$, and the norms of the entries of $P$.
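The bound on $\|Q^{-1}\|$ comes from the finite Neumann series of $(1+N)^{-1}$; the following sketch (with slightly cruder constants than the $(m+1)(1+\|Q\|)^m$ quoted above) makes this explicit:

```latex
% Q = T(1+N), with \|T\| = 1 (permutation matrix) and N^m = 0, so
Q^{-1} = \Big( \sum_{j=0}^{m-1} (-N)^j \Big) T^{-1},
\qquad
\|Q^{-1}\| \le \sum_{j=0}^{m-1} \|N\|^j \le m \,(1 + \|N\|)^{m-1}.
```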
Part (3) is a simple formal step.
Step (4) is a direct consequence of Lemma 3.4. Now, in step (5), we finally involve our variables $c$, $a$, $a_N$ directly. We have assumed that $|P(c,a) - z_0|^2 > \delta_{z_0} > 0$, so that, according to steps (1) and (2), for all (sufficiently large) $N \in \mathbb{N}$ we have $|Y_N|^2 > \zeta$ for a $\zeta$ that depends only on $P$, $\delta_{z_0}$, and the supremum of the norms of $a_N$, which is assumed to be bounded. Thus, $|Y_N|^2$ is uniformly bounded from below as $N \to \infty$. In order to ensure the invertibility of $S + Y_N$, we also need that $\mathrm{spect}(\eta \circ G_{Y_N}(0)) \subset D \setminus \{1\}$ for all $N$ sufficiently large. The existence of $G_{Y_N}(0)$ is guaranteed by the hypothesis of invertibility of $Y_N$. We only need to remember that all entries of $L(0, a_N)^{-1}$ are products of polynomials in $a_N$ and $(P(0, a_N) - z_0)^{-1}$ in order to conclude that the convergence in moments of $a_N$ to $a$ implies the convergence in norm of $G_{Y_N}(0)$ to $G_Y(0)$ (recall that, according to hypothesis 2 in the statement of our proposition, $|P(0, a_N) - z_0|^2 > \delta_{z_0} > 0$ uniformly). Thus, for $N$ sufficiently large, all eigenvalues of $\eta \circ G_{Y_N}(0)$ are included in $(1 - r^2)D$. This guarantees the invertibility of $S + Y_N$ for all $N$ sufficiently large.
To prove item (6) and conclude our proof, we only need to show that, for $N$ sufficiently large, $|S + Y_N|^2 > \iota^2$. There is a simple abstract shortcut for this: as Proposition 3.5 shows, the endpoint of the support of the (scalar) distribution of $S + Y_N$ is given by a quantity $x_N$, and this $x_N$ is bounded away from zero uniformly in $N$ as $N \to \infty$. A second application of the convergence of $G_{Y_N}$ allows us to conclude.

Stable outliers; proof of Theorem 1.10
Making use of a linearization procedure, the proof closely follows the approach of [10]. The most significant novelty is Proposition 6.1, which substantially generalizes Theorem 1.3.A in [15] (see also Proposition 2.1 in [10]) and whose proof relies on the operator-valued free probability results established in Section 3.2.2. Nevertheless, we provide all arguments for the reader's convenience. Let $L$ be a linearization of $P(y_1,\dots,y_{u+t})$ with coefficients in $M_m(\mathbb{C})$ such that, for any $(u+t)$-tuple $y$ of matrices, $|\det Q(y)| = 1$ (see (4.3)). Let $G$ be a relatively compact set in $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$ satisfying the hypotheses of Theorem 1.10, and let $\Gamma = \overline{G}$. Note that, since $|\det Q(y)|$ is constant, and following the proof of Lemma 4.3 in [5], one can see that this is also equivalent to (6.2). According to Lemma 4.1, the eigenvalues of $M_N$ are the zeroes of $z \mapsto \det(ze_{11} \otimes I_N - \cdots)$. By Assumption $(A_2)$, Proposition 5.1 and Lemma 4.1, almost surely for all large $N$, for any $z \in \Gamma$, we can define the relevant resolvents. Recall the Weinstein–Aronszajn identity: if $P, Q \in M_{d_1,d_2}(\mathbb{C})$, then $\det(I_{d_1} - PQ^{\top}) = \det(I_{d_2} - Q^{\top}P)$, where $Q^{\top}$ denotes the transpose of $Q$. Using this identity, it is clear that, almost surely for all large $N$, the eigenvalues of $M_N$ in $\Gamma$ are precisely the zeros of the random analytic function $z \mapsto \det(I_p - Q_N R_N(z) P_N)$ in that set.
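The Weinstein–Aronszajn identity itself admits a one-line block-determinant proof, which we sketch for completeness (writing $\tilde Q = Q^{\top} \in M_{d_2,d_1}(\mathbb{C})$):

```latex
% Expand the same block determinant by its two Schur complements:
\det\begin{pmatrix} I_{d_1} & P \\ \tilde Q & I_{d_2} \end{pmatrix}
= \det(I_{d_1} - P \tilde Q)
= \det(I_{d_2} - \tilde Q P).
```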
Similarly, for any $z$ in $\Gamma$, the analogous identity (6.4) holds. Thus, the zeroes of $z \mapsto \det(I_p - Q_N R_N(z) P_N)$ in $\Gamma$ are the zeroes of $z \mapsto \det(ze_{11} \otimes I_N - \cdots)$ in that set. The rest of the proof is devoted to establishing the uniform convergence stated in Proposition 6.9 below.

Step 1: Iterated resolvent identities.
Using the resolvent identity repeatedly, we find that, for any integer $K \ge 2$, the expansion (6.5) holds. The following two steps will be instrumental in proving the uniform convergence on $\Gamma$ of the right-hand side of (6.5) towards zero.
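The display (6.5) is lost in this version of the text; the generic algebraic identity underlying such iterations (not a reconstruction of (6.5) itself) is the following: if $A$ and $A - B$ are invertible, then for every integer $K \ge 2$,

```latex
(A - B)^{-1}
= \sum_{k=0}^{K-1} (A^{-1} B)^{k} A^{-1}
\;+\; (A^{-1} B)^{K} (A - B)^{-1},
% obtained by iterating the resolvent identity
% (A - B)^{-1} = A^{-1} + A^{-1} B (A - B)^{-1}  K times.
```

so that controlling the powers $(R_N(z)Y_N)^k$, as done in Steps 2 and 3, controls both the sum and the remainder.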
Step 2: Study of the spectral radius of R N (z)Y N .
The aim of this second step is to prove Lemma 6.5.
Let $L$ be a linearization of $P(y_1,\dots,y_{u+t})$ with coefficients in $M_m(\mathbb{C})$. Let a selfadjoint $\{-1,+1\}$-Bernoulli variable in $\mathcal{A}$ be given, independent from the entries of $y^z$, and let $s_1,\dots,s_u$ be free semicircular variables in $\mathcal{A}$, free from this Bernoulli variable and from the entries of $y^z$.
Proof. Now, assume that $\lambda \ne 0$ is an eigenvalue of $R_N(z)Y_N$. Then there exists $v \in \mathbb{C}^{Nm}$, $v \ne 0$, such that the corresponding equation $(ze_{11} \otimes I_N - \cdots)v = 0$ holds, or equivalently, $0$ is a singular value of the associated matrix. By Lemma 6.4, we can deduce that almost surely, for all large $N$, the nonzero eigenvalues of $R_N(z)Y_N$ must satisfy $1/|\lambda| > \rho$. The result follows.
Step 3: Study of the moments of $R_N(z)Y_N$.

Proposition 6.6. Let $\Gamma$ be a compact subset of $\{z \in \mathbb{C} : 0 \notin \mathrm{supp}(\mu_z)\}$. Assume that $(A_2)$ and (1.5) hold. There exist $0 < \varepsilon_0 < 1$ and $C > 0$ such that almost surely, for all large $N$ and any $k \ge 1$, the bound (6.6) holds uniformly on $\Gamma$.

Proof. For $z \in \Gamma$, set $T_N(z) = R_N(z)Y_N$. Let $\varepsilon_0$ be as defined in Lemma 6.5 and $\rho$ as defined in Lemma 6.4. Choose $0 < \varepsilon < \min(\varepsilon_0, 1 - \frac{1}{\rho})$. Therefore, according to Lemma 6.5 and using Dunford–Riesz calculus, we have almost surely, for all large $N$ and any $z$ in $\Gamma$, the representation (6.7). Now note that, for any $w$ such that $|w| = 1 - \varepsilon$, we have the bound (6.8), where $\eta$ is defined in Lemma 6.4. It readily follows from (6.7), Lemma 4.2, (6.8), (1.5) and Bai–Yin's theorem that there exists $C > 0$ such that almost surely, for all large $N$ and any $z$ in $\Gamma$, the estimate (6.9) holds. Proposition 6.6 follows from (6.6) and (6.9).
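The way Dunford–Riesz calculus produces such moment bounds is the standard contour estimate: if the spectrum of $T = T_N(z)$ lies in the open disk of radius $1-\varepsilon$, then

```latex
T^{k} = \frac{1}{2\pi i} \oint_{|w| = 1-\varepsilon} w^{k} \,(w - T)^{-1}\, dw,
\qquad
\|T^{k}\| \le (1-\varepsilon)^{k+1} \sup_{|w| = 1-\varepsilon} \big\|(w - T)^{-1}\big\| ,
```

which produces the geometric decay in $k$.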
We will use the following proposition from [10] to establish Lemma 6.8 below.
Proposition 6.7 ([10]). Let $n \ge 1$ be an integer and $Q \in \mathbb{C}\langle X_1,\dots,X_n\rangle$ be such that the total exponent of $X_n$ in each monomial of $Q$ is nonzero. We consider a sequence $(B_N)_N$.

Using (6.10) and Bai–Yin's theorem, we deduce from Proposition 6.7 that $v_i^*(R_N(z)Y_N)^k R_N(z)u_j$ converges almost surely to zero. The result follows by applying the dominated convergence theorem, thanks to Proposition 6.6.
We are going to prove that, assuming $(X_1)$, (1.3) and $(A_2)$, we have, for any $z$ in $\Gamma$, almost surely, as $N \to \infty$, $Q_N R_N(z) P_N - Q_N \widetilde{R}_N(z) P_N \to 0$. (6.11) Let $C' > 0$ be such that $\|P_N\|\,\|Q_N\| \le C'$. According to Proposition 5.1 and (4.2), for any $z \in \Gamma$ there exists $\gamma_z > 0$ such that, almost surely for all large $N$, $\|R_N(z)\| \le \frac{1}{\gamma_z}$. (6.13) Then, using also Proposition 6.6 and (6.10), we obtain, for any $k \ge 1$, the corresponding termwise bounds. Let $\eta > 0$. Choose $K \ge 1$ such that $\frac{CC'}{\gamma_z}(1-\varepsilon_0)^K < \eta/2$ and $\sum_{k \ge K} \frac{CC'}{\gamma_z}(1-\varepsilon_0)^k < \eta/2$. Thus, using (6.5), we obtain that the quantity in (6.14) is smaller than $\eta$ for any $\eta > 0$; letting $\eta$ go to zero, we obtain (6.14). Applying Lemma 6.8, we obtain (6.11).

Proposition 6.9. Let $\Gamma$ be a compact subset of $\mathbb{C} \setminus \mathrm{spect}(P(c,a))$ which satisfies the hypotheses of Theorem 1.10. Assume $(X_1)$, (1.3) and $(A_2)$. Then, almost surely, $\det(I_p - Q_N R_N(z)P_N) - \det(I_p - Q_N \widetilde{R}_N(z)P_N)$ converges to zero uniformly on $\Gamma$, when $N$ goes to infinity.