Eigenvectors and controllability of non-Hermitian random matrices and directed graphs

We study the eigenvectors and eigenvalues of random matrices with iid entries. Let $N$ be a random matrix with iid entries which have symmetric distribution. For each unit eigenvector $\mathbf{v}$ of $N$ our main results provide a small ball probability bound for linear combinations of the coordinates of $\mathbf{v}$. Our results generalize the works of Meehan and Nguyen as well as Touri and the second author for random symmetric matrices. Along the way, we provide an optimal estimate of the probability that an iid matrix has simple spectrum, improving a recent result of Ge. Our techniques also allow us to establish analogous results for the adjacency matrix of a random directed graph, and as an application we establish controllability properties of network control systems on directed graphs.


Introduction
Let u ∈ C^n be a random vector uniformly distributed on the unit sphere. It follows that u has the same distribution as (ξ_1, . . . , ξ_n)^T / (Σ_{i=1}^n |ξ_i|^2)^{1/2}, where ξ_1, . . . , ξ_n are independent and identically distributed (iid) standard complex Gaussian random variables. From this representation one can prove that 1^T u converges in distribution to a standard complex Gaussian random variable, where 1 ∈ C^n is the all-ones vector. We refer the reader to the survey [70] for additional properties of u. Let N be a random matrix of size n × n whose entries are iid random variables. When the entries of N are iid copies of a standard complex Gaussian random variable, N is rotationally invariant, and the individual eigenvectors of N have the same distribution as u above. When the entries of N are non-Gaussian, much less is known about the distribution of the eigenvectors. In view of the universality phenomenon in random matrix theory, it is natural to conjecture that some of the properties that u possesses should also hold for the eigenvectors of N.
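As a quick numerical sanity check of this representation (an illustrative sketch assuming numpy, not part of the paper's argument), one can sample u via normalized iid complex Gaussians and observe that 1^T u stays of constant order as n grows:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Sample u uniformly on the complex unit sphere: take iid standard
# complex Gaussians xi_1, ..., xi_n and normalize by the Euclidean norm.
xi = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
u = xi / np.linalg.norm(xi)

# u is a unit vector, and 1^T u (the sum of its coordinates) is
# approximately a standard complex Gaussian for large n.
s = np.ones(n) @ u
print(np.linalg.norm(u), abs(s))
```

Averaging |1^T u|^2 over many samples gives a value near 1, consistent with the limiting standard complex Gaussian.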
In this note, we quantify some of these properties of the eigenvectors for iid random matrices. The properties we focus on in this note are motivated by control theory, which we discuss in more detail in Section 1.4 below.
The results in [59,67,69,68] are the most closely related to the present work. The following result is established by Meehan and Nguyen in [59]. Theorem 1.1 (Follows from Theorem 1.5 in [59]). Let ξ be a real-valued symmetric random variable with mean zero and unit variance so that P(|ξ| ≥ t) ≤ K_1 exp(−t^2/K_2) for all t > 0 for some constants K_1, K_2 > 0. Let W = (w_ij) be an n × n real symmetric random matrix whose entries w_ij, 1 ≤ i ≤ j ≤ n, are iid copies of ξ. Then there exist constants C, δ, δ′ > 0 such that P(∃ unit eigenvector v of W such that |1^T v| ≤ ε) ≤ C n^δ ε + e^{−n^{δ′}} for all ε > 0, where 1 is the all-ones vector.
Similar results are also established in [67,69,68], and the results in [59] greatly generalize the results in [69]. In fact, the results in [59] are more general than what is stated in Theorem 1.1 and apply to a large class of vectors (not just the all-ones vector).
Intuitively, Theorem 1.1 provides a non-asymptotic bound which shows that the eigenvectors behave similarly to the uniform vector u introduced above. The goal of this work is to establish a version of Theorem 1.1 for non-Hermitian random matrices. Indeed, all the results in [59,67,69,68] only apply to Hermitian random matrices. When the random matrix is no longer Hermitian, the eigenvectors need not be orthogonal and new difficulties arise. In this note, we build upon the techniques introduced by Ge [37] in order to overcome these difficulties.
1.1. Notation. Before stating our main results, we introduce some notation. For a matrix M, we let ‖M‖ denote the operator norm, M^T the transpose, and M^* the conjugate transpose of M. We write M − z to denote M − zI, where I is the identity matrix. J will denote the all-ones matrix. For any square matrix, we will use the term eigenvector to denote a unit eigenvector unless stated otherwise.
We use bold letters to denote complex and real vectors. For a vector v, ‖v‖ is the Euclidean norm. For two vectors v = (v_i)_{i=1}^n ∈ C^n and u = (u_i)_{i=1}^n ∈ C^n, we let v ⊙ u denote the Hadamard product of v and u defined as the vector v ⊙ u = (v_i u_i)_{i=1}^n. 1 denotes the all-ones vector.
We use asymptotic notation under the assumption that n → ∞. In particular, the notations X n = O(Y n ), Y n = Ω(X n ), X n ≪ Y n , or Y n ≫ X n denote the bound |X n | ≤ C|Y n | for some constant C > 0 independent of n and all n > C. If the constant C depends on a parameter (e.g., C = C k ), we indicate this with subscripts (e.g., X n = O k (Y n )). The notation X n = o(Y n ) denotes the bound |X n | ≤ c n Y n for some sequence c n that converges to zero as n tends to infinity.
In our proofs, we often use C, C ′ , c, c ′ , etc. to represent universal positive constants that can change from line to line.
[n] denotes the discrete interval {1, . . . , n} and B(z, s) denotes a ball of radius s centered at z.

1.2. Eigenvector results. In our main results below we focus on non-Hermitian random matrices with iid entries. Definition 1.2 (iid random matrix). Let ξ be a real-valued random variable. We say the n × n matrix N is an iid random matrix with atom variable (or atom distribution) ξ if the entries of N are iid copies of ξ.
We will often assume that the atom variable ξ has mean zero. In addition, we will sometimes need to assume that ξ is a symmetric random variable, i.e., that ξ has the same distribution as −ξ. In the most general case, we will only need the following assumption. Assumption 1.3. Assume ξ is a real-valued random variable. In addition, assume there exist constants q ∈ (0, 1) and T > 0 so that sup_{u∈R} P(|ξ − u| < 1) ≤ 1 − q, (1.1) where ξ′ is an independent copy of ξ.
Remark 1.4. Assumption (1.1) guarantees that ξ is non-degenerate. All three conditions (1.1), (1.2), and (1.3) hold (for some T and q) when ξ has finite variance at least 1. Many of our results will have constants that implicitly depend on q and T. We will suppress this dependence in the notation and statements of the theorems.
Our first main result is the analogue of Theorem 1.1 for iid random matrices.
Theorem 1.5. Let N be an n × n iid random matrix with real-valued symmetric atom variable ξ which satisfies Assumption 1.3, and let K > 1 be a constant. Then there exist constants C, c > 0 (depending only on the constant K and the atom variable ξ) such that P(∃ unit eigenvector u of N such that |1^T u| ≤ t) ≤ Cnt + P(‖N‖ > K√n) for any t ≥ e^{−cn}. Here 1 denotes the all-ones vector.
The operator norm can be controlled by additional moment assumptions on ξ. For instance, when ξ has finite fourth moment there exists K > 1 so that (1.4) P(‖N‖ > K√n) = o(1), and an analogous bound with exponentially small failure probability holds when ξ satisfies a sub-Gaussian assumption, where the constants and rate of convergence in these bounds depend on the fourth moment or sub-Gaussian moment of ξ (see [94] and [97]). More generally, we have the following theorem.
Theorem 1.6. Let N be an n × n iid random matrix with real-valued symmetric atom variable ξ which satisfies Assumption 1.3, and let B, K > 1 be constants. Then there exist constants C, c, ν, ν′ > 0 (depending only on the constants K, B and the atom variable ξ) such that the following holds. Let m ≤ ν√n and b ∈ C^n be a vector such that B^{−1} ≤ |b_i| ≤ B for all but m coordinates of b. Then for any t ≥ e^{−ν′n/m}.

1.3. Eigenvalue Gaps.
Tail bounds between gaps of eigenvalues of random matrices were originally studied in [4] in the GUE case and in [64,89] for a large class of Hermitian random matrices. In his thesis [37], Ge proves a similar result for iid matrices. Let λ_1(N), . . . , λ_n(N) be the eigenvalues of a matrix N. Let ∆ ≡ ∆(N) := min_{i≠j} |λ_i(N) − λ_j(N)|. Ge obtained the following theorem.
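The quantity ∆(N) is straightforward to examine numerically; a minimal sketch (numpy assumed, with a Rademacher atom variable chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# iid Rademacher matrix: a discrete atom variable with mean zero that
# satisfies the non-degeneracy condition of Assumption 1.3.
N = rng.choice([-1.0, 1.0], size=(n, n))

# Minimal eigenvalue gap Delta(N) = min_{i != j} |lambda_i - lambda_j|.
lam = np.linalg.eigvals(N)
diffs = np.abs(lam[:, None] - lam[None, :])
np.fill_diagonal(diffs, np.inf)
delta = diffs.min()
print(delta)  # strictly positive: the spectrum is simple
```

Repeating the experiment over many samples shows ∆ bounded away from zero at a polynomial scale, consistent with the tail bounds below.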
Theorem 1.7 (Theorem 3.1.1, [37]). Let N be an n × n iid random matrix whose atom variable satisfies Assumption 1.3 and has mean zero. For every C > 0 and δ ≥ s ≥ n^{−C}, where the implied constant depends only on the parameters in Assumption 1.3 and C.
One immediate consequence is that, with high probability, the random matrix has simple spectrum. Corollary 1.8 ([37]). Let N be an n × n iid random matrix whose atom variable has mean zero, unit variance, and finite fourth moment. Then the spectrum of N is simple with probability 1 − o(1). Building on the techniques in [37], we greatly extend the range of the tail bound for the eigenvalue gaps and also improve the probability bound for simple spectrum. Theorem 1.9. Let N be an n × n iid random matrix whose atom variable satisfies Assumption 1.3. Then there exist constants C, c > 0 such that for s ≥ 0. While the right-hand side appears non-optimal, we can deduce an immediate corollary.
Corollary 1.10. If in addition to the assumptions of Theorem 1.9 we assume the entries of N are subgaussian with mean zero, then there exist constants C, c > 0 such that N has simple spectrum with probability at least 1 − Ce^{−cn}. This corollary is of independent interest and is clearly optimal up to the constants for subgaussian entries, while Ge's result only guarantees a polynomially small probability. The simple spectrum probability bound is also an important technical tool for the results of the next section.
For a directed graph G = ([n], E) with vertex set [n] and edge set E, we let the adjacency matrix A be defined by A_ij = 1 if (i, j) ∈ E and A_ij = 0 otherwise. We define the directed Erdős-Rényi random graph to be the random digraph on vertex set [n] such that each edge (i, j) appears independently with probability p, for a constant p ∈ (0, 1). The adjacency matrix A is random but does not fall under the purview of Theorem 1.9 as ‖A‖ = Ω(n) with high probability (so P(‖A‖ > K√n) = 1 − o(1)). In addition, our results apply to both the model where loops are allowed (so that (i, i) is an edge with probability p) as well as the case where loops are not allowed (so that the adjacency matrix has zeros along the diagonal with probability one). For the adjacency matrix A of either model, we are able to prove the following weaker conclusion.
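A small simulation (illustrative only, numpy assumed) confirms the stated norm obstruction: for constant p the operator norm of A concentrates near pn, far above the √n scale required by Theorem 1.9:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 0.5

# Adjacency matrix of a directed Erdos-Renyi graph with loops allowed:
# each entry is 1 independently with probability p.
A = (rng.random((n, n)) < p).astype(float)

# ||A|| is of order p*n (the all-ones direction carries mass ~ p*n),
# so the event ||A|| <= K*sqrt(n) fails for any fixed K.
op = np.linalg.norm(A, 2)
print(op / n)  # close to p
```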

1.4. Connection to control theory. Our main results are related to a large collection of works on controllability of network control systems [1,2,40,52,54,62,63,67,68,69,72,85]. Unlike many of these previous works, in this note we take a stochastic approach. In this section we provide a brief overview of linear control theory and its connection to our main results above. For additional details concerning control of linear systems, the reader is advised to see [40,46] and references within.
We consider a discrete-time linear state-space system formed from an n × n matrix A (called the state transition matrix) and a vector b ∈ R^n (the given input vector). The system's state at time k is a vector x(k) which evolves according to the constraint x(k + 1) = A x(k) + b u(k), where each u(k) is a scalar. The sequence (u(k))_{k≥0} is the control of the system.
Roughly speaking, the system is controllable if we can find the control values u(·) that realize arbitrary state values x(·). Following [40,59], we observe that unrolling the recursion expresses the state in terms of the matrix appearing in (1.5). Thus, we can find the control values u(·) based on the state values x(·) if and only if the matrix on the right-hand side of (1.5) has full rank. This leads immediately to the following definition (known as Kalman's rank condition) for controllability. Definition 1.12. Let A be an n × n matrix, and let b be a vector in R^n. We say the pair (A, b) is controllable if the matrix (1.6) with columns b, Ab, . . . , A^{n−1}b has full rank (that is, rank n). Given the state transition matrix A, two important problems are: (1) (Minimal controllability) What is the sparsest nonzero binary vector b ∈ {0, 1}^n such that (A, b) is controllable? (2) (Uniform controllability) If 1 is the all-ones vector, is (A, 1) controllable?
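Kalman's rank condition can be sketched directly in code (an illustrative implementation assuming numpy; the helper names are ours, not the paper's):

```python
import numpy as np

def controllability_matrix(A, b):
    """Kalman matrix with columns b, Ab, ..., A^{n-1} b."""
    n = A.shape[0]
    cols, v = [], b.astype(float)
    for _ in range(n):
        cols.append(v.copy())
        v = A @ v
    return np.column_stack(cols)

def is_controllable(A, b):
    """Kalman rank condition: (A, b) is controllable iff the
    controllability matrix has rank n."""
    return np.linalg.matrix_rank(controllability_matrix(A, b)) == A.shape[0]

rng = np.random.default_rng(0)
n = 8  # kept small so the powers A^k b stay within float precision
A = rng.choice([-1.0, 1.0], size=(n, n))
print(is_controllable(A, np.ones(n)))          # uniform controllability check
print(is_controllable(np.eye(n), np.ones(n)))  # False: the identity is never controllable for n > 1
```

The identity example shows why controllability is nontrivial: all columns of the Kalman matrix coincide with b, so the rank is 1.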
Our main results above allow us to study versions of these problems when A is a random matrix. Loosely speaking, our results show that "most" systems are controllable, which confirms a similar phenomenon that was observed previously for systems with Hermitian transition matrices [59,67,69,68]. In addition, we also consider the case when the vector b is random.
As corollaries to our main results above, we obtain the following. Corollary 1.13. Let ξ be a real-valued symmetric random variable with mean zero, unit variance, and finite fourth moment. Let N be the n×n iid random matrix with atom variable ξ. Then (N, 1) is controllable with probability 1 − o(1), where 1 is the all-ones vector.
Corollary 1.14. Let ξ be a real-valued random variable with mean zero, unit variance, and finite fourth moment. Let N be the n × n iid random matrix with atom variable ξ. Let ψ be a real-valued random variable that satisfies Assumption 1.3, and assume b ∈ R^n is a random vector with entries that are iid copies of ψ. Then, with probability 1 − o(1), the pair (N, b) is controllable. We note that Corollary 1.14 does not require symmetric random variables.
Remark 1.15. If instead of a bounded fourth moment, we assume the entries of N are subgaussian in Corollaries 1.13 or 1.14, we can improve the probability bound to 1 − Ce −cn for some constants C, c > 0.
Corollary 1.16. Let ξ be a real-valued random variable with mean zero, unit variance, and finite fourth moment. Let N be the n × n iid random matrix with atom variable ξ. Corollaries 1.13 and 1.16 answer the uniform controllability and minimal controllability questions from above for non-Hermitian random matrices.
We have corresponding corollaries for the adjacency matrix of directed random graphs. Unlike Corollary 1.17, Corollaries 1.18 and 1.19 do not require that p = 1/2.

1.5. Overview and outline. In Section 2, we isolate the key structural result which guarantees that any vector near the kernel of an iid random matrix (shifted by a complex number) is unstructured. The investigation of the structure of vectors as it relates to their anticoncentration has a long history in random matrix theory, beginning with the infamous singularity problem for discrete random matrices [49,45,86,22,20,65,34,33,91]. Strong bounds on the least singular value in both the symmetric and non-symmetric settings used similar tools [26,84,77,78,95,90,87,55,73,43,37]. The quantitative estimates in Section 2 build on this rich history of anti-concentration in random matrix theory. In particular, our quantitative estimates improve on those in [37]. The proof uses a delicate covering argument to exclude structured vectors. The primary obstacle that appears in the non-Hermitian setting is that the eigenvectors can now reside in the complex unit sphere, which doubles the dimension of the space that must be covered. The key geometric insight that resolves this issue is expounded on in Section 2.4. In Section 3, we use an approximation argument to extend the structural result to eigenvectors of a non-Hermitian matrix. We utilize a multi-scale argument to extend our structural result to small-ball probability bounds on all scales. The arguments in Sections 2 and 3 do not immediately apply to the adjacency matrix of a random directed graph because the operator norm of the adjacency matrix is Ω(n) with high probability. In Section 4, we describe the method to generalize the structural result to directed graphs. The key observation is that the matrix of expectations is low-rank, so the covering arguments from the previous sections can be extended, as the sizes of the nets do not incur many new dimensions.
We then utilize previous results on the spectrum of rank-one perturbations of random matrices, which state that the eigenvalues of the perturbed matrix are all contained in the centered disk with radius determined by the spectral norm of the unperturbed matrix, except for one outlier. To understand the structure of the eigenvector corresponding to the outlier, we use the Perron-Frobenius theorem for nonnegative matrices. In Section 5, we show that for a fixed vector b, even b ⊙ u has no structure, where ⊙ denotes the Hadamard product and u is an eigenvector.
Finally, in Section 6, we complete the proofs of our main results and deduce the control theory corollaries from our eigenvector structure results. Relating the structure of eigenvectors to the controllability of the matrix requires the introduction of auxiliary random signs in the matrix that preserve the distribution of the matrix and only alter the signs of the entries in the eigenvectors. The first condition is what necessitates symmetric entries in the random matrix for some of the control theory results.
In Appendix A we include the proof of Theorem 1.9 and in Appendix B we complete the proof of Theorem 1.11. They are similar to previous arguments in this article and in [37].
Acknowledgements. We thank Hoi H. Nguyen for pointing out reference [37]. The second author thanks Behrouz Touri for introducing him to the problem and answering numerous questions.

Arithmetic Structure of Approximate Null Vectors
In this section, we study the arithmetic structure of approximate eigenvectors. We let E_K denote the event that ‖N‖ ≤ K√n. The goal of this section is to prove the following result.
Theorem 2.1. Let N denote the n × n matrix with entries that are iid copies of a random variable ξ that satisfies Assumption 1.3. There exist constants c_{2.1}, c′_{2.1}, c″_{2.1}, c_⋆, µ > 0 such that the following holds. We let M denote the matrix N − λ√n I, where λ is a fixed complex number with |λ| ≤ K and δ := Im(λ) ≥ e^{−c_⋆ n}. If the corresponding condition on D holds, then with probability at least 1 − e^{−c_{2.1} n}, on the event E_K, any complex vector z ∈ S_C^{n−1} such that ‖Mz‖ ≤ Kµn/D has d(z) ≥ c_{2.13} δ and D(z) ≥ D. Here d(z) and D(z) denote the real-imaginary correlation and the LCD, respectively, and are defined formally in Definitions 2.12 and 2.7 below. Some aspects of the proofs below are inspired by arguments from [80,37], but we have introduced several modifications and novelties to handle our current setting.
Definition 2.2. For two constants a, b ∈ (0, 1), we say a vector x ∈ S_C^{n−1} is compressible if there is an an-sparse vector x′ such that ‖x − x′‖ ≤ b. We denote the set of compressible vectors as Comp_C(a, b). Let Incomp_C(a, b) denote the incompressible vectors, which are those on the unit sphere that are not compressible. The same definitions apply to real vectors, in which case we use Comp_R and Incomp_R.
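The distance from a unit vector to the set of an-sparse vectors is realized by keeping the an largest-magnitude coordinates, which gives a quick way to see that a "flat" unit vector is incompressible (illustrative sketch, numpy assumed):

```python
import numpy as np

def dist_to_sparse(x, k):
    """Euclidean distance from x to the set of k-sparse vectors:
    zero out all but the k largest-magnitude coordinates."""
    tail = np.argsort(np.abs(x))[:-k]  # indices of the n-k smallest entries
    return np.linalg.norm(x[tail])

rng = np.random.default_rng(0)
n, a = 1000, 0.1
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
z /= np.linalg.norm(z)

# A flat unit vector is incompressible: even the best an-sparse
# approximation misses a constant fraction of its norm.
print(dist_to_sparse(z, int(a * n)))
```

A vector supported on a few coordinates, by contrast, has distance zero to the sparse set and is compressible for any b > 0.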
The following is a well-known result that follows from tensorizing a crude estimate for fixed vectors and taking a union bound. Lemma 2.3. There exist constants a, b, c 2.3 ∈ (0, 1) and K > 2 such that We fix the constants a, b, c 2.3 for the remainder of the argument. The next lemma from [37] demonstrates that an approximate null-vector cannot have mass exclusively confined to the real or imaginary parts.
Therefore, as z is a unit vector, from the above we can conclude the bound for small enough c and c′, depending on K. Finally, we can set c_{2.4} to be the smaller of c and c′.
Remark 2.5. Note that ‖Mz‖ = ‖M e^{iθ} z‖, so the above lemma applies to any rotation of z.

2.1. Excluding Vectors with Real Compressible Part.
Lemma 2.6. There exist constants a, b, c_{2.6} such that the following holds. Proof. Case I: We assume that α ≥ Cδ, where C is a large constant to be determined. Recall that we let M = N − λ√n I. Again, we examine the real part of the inequality. To complete the proof in this case, it suffices to show the following. The intuition is that, as x is close to sparse, ‖Nx‖ should be on the order of ‖x‖√n. Therefore, by a simple union bound, for a small constant c′ > 0 after choosing a small enough. For any x ∈ T_α, there exists a nearby net point; choosing b small enough so that c − Kb > 0 and then choosing C large enough, this implies the claim. Case II: We utilize the real and imaginary parts of the inequality. Let us define, for an index set I ⊂ [n] with |I| = an, the corresponding set T_{α,I}. For concreteness, let us assume for now that I = {1, . . . , an}. Similar to Case I, we can find a 2bα-net N of T_{α,I} such that |N| ≤ (3/b)^{an}(2/b). Conditioning on the first an columns of M, we have the deterministic inequality. We construct a random net, depending on the first an columns of M, for the imaginary part of the vectors. We use Nx/(δ√n) to approximate the imaginary part of the complex vectors in S_α. Note that since x is only supported on the first an coordinates, Nx/(δ√n) depends on only the first an columns of M. Define, for some large constant C′, where in the last line we have used the assumption that α ≤ Cδ.
Since z is incompressible and x is compressible, we must have that ‖y‖ ≥ b/2 after reducing b if necessary. We write y′ = (y_1, y_2), where y_1 is the vector formed by the first an coordinates and y_2 consists of the remaining coordinates. Since z is incompressible and ‖y′ − y‖ ≤ Cb, we can choose b small enough such that ‖y_2‖ ≥ b/4. By the standard tensorization argument, where the probability is taken over the randomness of the last n − an columns of M and the lower bound on ‖y_2‖. Thus, by a union bound, after reducing b if necessary. Finally, taking a union bound over the (n choose an) possible I and then choosing a small enough proves the claim for a small enough c_{2.6}.

2.2. LCD and Structure Theorem.
We import several definitions to quantify the structure, or absence thereof, of a vector, a matrix and a complex vector.
• For a vector v ∈ R^n, we define the least common denominator (LCD) of v to be • For a complex vector z = x + iy ∈ C^n with x, y ∈ R^n, we define the LCD of z to be the LCD of the matrix. Here ρ is a parameter that is not normally included in the definition, but we will need this extra flexibility in the appendix when we handle directed adjacency matrices, which do not have iid entries. For any fixed ρ, the only effect is to slightly alter the constants in the following theorems. For the remainder of the paper we set ρ = 1 for convenience and only utilize this general ρ in Section 4.
Our first lemma shows that the LCD of a complex vector is invariant under rotations by a complex phase.
The next lemma shows that one can always rotate a complex vector so that the LCD of z is exhibited by the real component of the rotated vector.
Proof. Recall from Definition 2.7, which corresponds to the defining relation for the LCD of the real part of e iφ z. By Lemma 2.8 and its proof, taking the infimum proves the result.
The crucial relationship between structure and small-ball probability is quantified in the next theorem.
Theorem 2.10 ([80]). Consider a random vector ξ = (ξ_1, . . . , ξ_n) where the ξ_i are iid copies of a real random variable ξ that satisfies Assumption 1.3. Let U ∈ R^{m×n} be fixed. Then for every L ≥ 8m/q (where q is the parameter from Assumption 1.3) and t ≥ 0, we have the asserted small-ball estimate. We fix a constant L ≥ 16/q for the remainder of the proof since we will only apply the above theorem for m ≤ 2. A simple argument shows that if we restrict our attention to incompressible vectors, the smallest value the LCD can take is on the order of √n.
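As a toy illustration of the phenomenon Theorem 2.10 quantifies (not the theorem's actual bound), one can compare the empirical small-ball probability of ξ · z for a structured vector with LCD of order √n against a generic unit vector, whose LCD is typically enormous (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, t = 100, 20000, 0.01

def small_ball(z):
    # empirical P(|xi . z| <= t) over Rademacher vectors xi
    xi = rng.choice([-1.0, 1.0], size=(trials, n))
    return np.mean(np.abs(xi @ z) <= t)

structured = np.ones(n) / np.sqrt(n)  # LCD of order sqrt(n)
generic = rng.standard_normal(n)
generic /= np.linalg.norm(generic)    # LCD typically exponentially large

p_structured = small_ball(structured)
p_generic = small_ball(generic)
print(p_structured, p_generic)  # the structured vector concentrates far more
```

For the normalized all-ones vector, ξ · z lands exactly on a lattice of spacing 2/√n, inflating the small-ball probability to order n^{-1/2}; for the generic vector it behaves like a continuous Gaussian.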
2.3. Small-Ball Probabilities Depending on Real-Imaginary Correlations. We adapt the notions of LCD to handle complex vectors. This section follows previous developments in this direction [80,55,37].
We define the real-imaginary correlation of z as follows. Proof. We first prove the claim that the extremal values of ‖ℜ(e^{iθ} z)‖ over θ are the singular values of the n × 2 matrix (x y). These can be calculated from the eigenvalues of the 2 × 2 Gram matrix (x y)^T(x y), which are the solutions of a quadratic equation. Solving the quadratic equation and choosing the larger root yields the claim. By Lemma 2.4, the real part of any vector that satisfies the requirements of the lemma has norm bounded below by c_{2.4}δ, so we must have the corresponding lower bound. Simplifying this inequality gives the result for a small enough constant c.
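The claim that the extremal values of ‖ℜ(e^{iθ}z)‖ over θ are the singular values of the n × 2 matrix (x y) is easy to verify numerically (illustrative sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x, y = rng.standard_normal(n), rng.standard_normal(n)
z = x + 1j * y

# Re(e^{i theta} z) = cos(theta) * x - sin(theta) * y = (x y) w, where
# w = (cos theta, -sin theta)^T sweeps the unit circle in R^2, so the
# extremal norms over theta are the singular values of the matrix (x y).
thetas = np.linspace(0.0, 2.0 * np.pi, 4001)
norms = [np.linalg.norm(np.real(np.exp(1j * t) * z)) for t in thetas]
s = np.linalg.svd(np.column_stack([x, y]), compute_uv=False)
print(max(norms) - s[0], min(norms) - s[1])  # both close to zero
```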
In the remainder of this section, we use a covering argument to exclude vectors with small LCD. For real matrices, this type of argument appeared in [77]. However, the main difficulty in the current setting is that we must consider complex spheres, which have dimension 2(n − 1) when embedded into real Euclidean space. On the other hand, we are left with the same amount of randomness as in the real case. To handle this difficulty, we divide the remaining vectors into two classes: genuinely complex and essentially real. For genuinely complex vectors, the small-ball probabilities are greatly improved as the real and imaginary components are uncorrelated. This is enough to compensate for the added dimensionality. Essentially real vectors have highly correlated real and imaginary parts and so can be thought of as residing in a lower-dimensional space. For this class of vectors, a variant of the original covering argument from [77] suffices. This two-class approach is due to [80] and has been expanded upon in [37].
• (Essentially real z) Define the analogous set. The next proposition establishes a strong small-ball probability bound for genuinely complex vectors.
where we recall from Definition 2.12 that V = V(z) is the matrix with rows x^T and y^T, and N_j is the j-th column of N, where by assumption each entry is iid. Specializing Theorem 2.10 to our setting, we arrive at the single-coordinate bound. A quick change of variables from √2 t to t puts the single-coordinate bound into the desired form. To extend this bound to the entire vector, we use a standard tensorization argument, which completes the proof.
The following proposition is proved analogously and is again a simple consequence of tensorization and our definition of d 0 .

2.4. Nets.
In this section, we construct discrete nets of various level sets partitioned by real-imaginary correlation and LCD. Here C_{2.17} is an absolute constant.
Proof. By the definition of LCD, there exists p ∈ Z^n such that the defining inequality holds. Since z is a unit vector, at least one of x or y has norm greater than 1/√2. From Weyl's inequality we can deduce the corresponding singular value bound. We write det(WW^T)^{1/2} in two ways, via the product of singular values and via the volume of a parallelepiped. In particular, where P_{p^⊥} is the operator that projects onto the subspace orthogonal to p. Since s_1(W) ≤ ‖p‖ + ‖D(z)y‖, and recalling that ‖p‖ ≥ cDα and ‖D(z)y‖ ≤ 2D, we find that the bound holds for another universal constant C′ > 0. As we are in the genuinely complex case, we now have the estimates to construct a µ√n/D-net of S_{D,d,α}. For any x + iy ∈ S_{D,d,α} there exists p ∈ Z^n ∩ B(0, CDα) such that the approximation holds, where the last inequality follows from D_0 ≤ Le^{µ²n/L²} from (2.2). We work with at most CD/(µ√n) discrete multiples of p that approximate p/D(z) up to an accuracy of µ√n/D. Therefore, to bound the number of discrete multiples we have to consider, we multiply the number of lattice points in B(0, CDα) by the number of discrete multiples. For each discrete scaling of a lattice point p, we have by (2.5) that y must lie in a cylinder of radius C″d/α in the direction of p. This crucial observation severely restricts the space of potential y. Using the standard volume argument gives a µ√n/D-net of this cylinder of controlled size. Combining these bounds yields the result, since d/α ≥ √n/D by the assumption that d_0 ≥ √nα/D.
Proof. We begin with the case where d(z) < (L/D) log_+(Dα/L). We can recycle many of the estimates from the genuinely complex case. However, the estimate for the projection of y onto the subspace orthogonal to p changes; the last inequality now follows from choosing c_⋆ small enough in the definition of D_0 in (2.2). Again, we use the discrete multiples of lattice points to approximate x, with cardinality bounded by (2.6). For each discrete multiple of a lattice point, we can match it with a net whose size is controlled by our bound on d_0. Therefore, the total net size is bounded as claimed.
Suppose that z′ ∈ N and ‖z − z′‖ ≤ µ√n/D. Then the event ‖Mz‖ ≤ Kµn/D implies the corresponding bound for z′. Taking t = 2Kµ√n/D, we have by Proposition 2.15 the desired estimate, where the second-to-last inequality follows from the bound on D_0 in (2.2) and the last line follows from choosing µ small enough.
Proof. Suppose that we are in the event that the stated bound holds for some z = x + iy ∈ S_{D,d_0,α}. The real part of the inequality gives the corresponding real bound. By Proposition 2.18, there exists a net N with cardinality bounded as above; in the last line, we used the fact that α ≥ cδ. We therefore have, for t = Kµ√n/D in Proposition 2.16, the desired estimate. Choosing dyadic points α_k = 2^k c_{2.4}δ in [c_{2.4}δ, 1] for k ∈ N, we can take a union bound to conclude, where in the second inequality we invoked Lemma 2.6 and in the last line we noted that the number of non-zero summands is bounded by n from the lower bound on δ. We direct our attention to vectors with incompressible real part. By Lemmas 2.8 and 2.9, it suffices to consider vectors whose LCDs are attained by their real component, or in other words z = x + iy such that D(z) = D(x). Now, we gradually exclude level sets by LCD, norm of the real component, and real-imaginary correlation. By Lemma 2.4, Lemma 2.11 and Lemma 2.13, we need only consider vectors z = x + iy such that ‖x‖ ≥ c_{2.4}δ, D(z) ≥ c_{2.11}√n and d(z) ≥ c_{2.13}δ. We define the corresponding level sets and then denote the associated events, where ξ · z = ξ^T z is the dot product of ξ and z.
Finally, we quote a well-known result for our random matrix shifted by a real value.
Remark 2.23. The exact form of the above theorem does not appear in the literature but can easily be deduced from the proofs in [77,56].

Structure of Eigenvectors
We now have the tools to prove the following eigenvector structure theorem. We begin with a technical preliminary result.
Proof. To extend our previous results to eigenvectors, it is natural to discretize the complex ball of radius K√n, since we are assuming the eigenvalues are bounded by K√n. Any eigenvector will then be an approximate null vector for some complex number in the ball. However, the difficulty is that our small-ball probability bound in Theorem 2.10 depends on the real-imaginary correlation of our shift λ, which in turn is lower bounded by the imaginary component of λ. Therefore, our upper bound on the Lévy probability of approximate null vectors degrades significantly as we near the real line. The first step of our strategy is to control the Lévy probability of approximate null vectors with corresponding approximate eigenvalues near the real line by comparing them to approximate eigenvectors of real shifts and invoking Theorem 2.22, which naturally has no dependence on the imaginary component. Taking a fine enough net of the real line thereby proves our theorem for eigenvalues inside a neighborhood of the real line. In the next step, we work on the ball with a strip around the real line excluded. This gives us some control on the imaginary component of the eigenvalues and allows us to use the results from Section 2.
We proceed with the first step. Let β = √n/D. There exists a β/2-net N of the real interval [−K√n, K√n] ⊂ C with |N| ≤ 10K√n/β. At every point in N, we place a ball of radius β. The union of these balls necessarily contains a cβ-neighborhood of the real interval [−K√n, K√n]. On the event that there exists an eigenvalue λ within the strip with eigenvector v, there must exist a λ′ ∈ N such that the approximation holds. Therefore, by Theorem 2.22, the bound follows after reducing c_⋆ if necessary. It is worth pointing out that any reduction in c_⋆ will alter the constant in the error probability of Theorem 2.1, but there is no circular dependence of constants. Now, let S′ denote the centered disk of radius K√n after removing the strip of width β around the real line. There exists a β-net N′ of S′ of size at most Cn/β². Again, for an eigenvalue λ ∈ S′, there exists a λ′ ∈ N′ such that the approximation holds. Note that by our choice of β, D will satisfy the requirements of D in Theorem 2.1. Thus, by Theorem 2.1, with probability at least 1 − |N′|e^{−c_{2.1}n} ≥ 1 − e^{−cn}, any eigenvector v with eigenvalue in S′ will have D(v) ≥ D_0 and d(v) ≥ cβ, since the imaginary component of any element in S′ is bounded below by cβ. By applying Theorem 2.10, we obtain the bound (3.2) for such a vector v. Combining (3.1) and (3.2) completes the proof.
As stated, the above theorem applies to a single choice of D. The previous proofs can be restructured to show that in fact the statement holds for the whole range of D simultaneously. However, to preserve clarity, we simply deduce this as a corollary of the previous theorem.
Proof. Let d_k := c_{2.22}√n·2^k. By applying Theorem 3.2 with D = d_k, we have that with probability at least 1 − e^{−c_{3.2}n}, every eigenvector v of N satisfies the stated bound at scale d_k. On this event, for any d_k ≤ D < d_{k+1}, the bound at scale d_k transfers to scale D, which extends the event in Theorem 3.2 for D = d_k to the entire interval [d_k, d_{k+1}) at the cost of a universal constant. Therefore, to extend the result to the entire range c_{2.22}√n ≤ D ≤ e^{c⋆n}, we simply take a union bound over all k ∈ N with d_k ≤ e^{c⋆n}. The number of such k is clearly bounded by n, so by the union bound, our event of interest holds with probability at least 1 − ne^{−c_{3.2}n}.
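The dyadic decomposition d_k = c√n·2^k can be sketched as follows (illustrative only; the constants c and c⋆ are placeholder values, not the ones from the theorems). The point is that O(n) scales cover the whole range [c√n, e^{c⋆n}], so the union bound over scales only costs a factor n.

```python
import numpy as np

def dyadic_scales(n, c=1.0, c_star=0.05):
    # d_k = c*sqrt(n)*2^k for k = 0, 1, ... while d_k <= e^{c_star * n}
    ds = []
    d = c * np.sqrt(n)
    while d <= np.exp(c_star * n):
        ds.append(d)
        d *= 2.0
    return np.array(ds)

n = 400
ds = dyadic_scales(n)
# any D in the range lies in some dyadic interval [d_k, 2*d_k),
# and the number of scales is at most n
```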
For a fixed D, the bound above is only effective at a single scale of t. However, as we have an identical bound for all D simultaneously, we can allow D to vary with t to combine these scales into a single bound.
Proof of Theorem 3.1. We let D = √n/t. By Corollary 3.3, with probability at least 1 − e^{−c_{3.3}n}, every eigenvector v of N satisfies the claimed bound, where the last inequality follows from our choice of D. A variety of simpler results follow from particular choices of t and D. For example, setting t = 0 and D = e^{c⋆n} in Theorem 3.2 yields the following notable consequence.

Directed Erdős-Rényi Random Graphs
For a directed graph G = ([n], E) with vertex set [n] and edge set E, we recall that the adjacency matrix A is defined by A_{ij} = 1 if (i, j) ∈ E and A_{ij} = 0 otherwise. We define the directed Erdős-Rényi random graph to be the random graph on vertex set [n] in which each edge (i, j) appears independently with probability p, for a constant p ∈ (0, 1). For this model, the adjacency matrix is a random matrix with expectation pJ (if loops are allowed) or p(J − I) (if they are excluded), where J is the n × n matrix of ones. The extra pI term does not affect our arguments, as it simply shifts the spectrum slightly. The results from the previous section are not immediately applicable, as this matrix model has large norm, so E_K actually occurs with probability o(1). However, due to the low-rank structure of pJ, we can extend the covering arguments to handle this case (cf. [7, 57, 53]). We let E_K denote the event that ∥A − EA∥ ≤ K√n. As A − EA has centered, subgaussian entries, it is well known that P(E_K^c) ≤ e^{−cpn}. Since A − EA is a mean-zero random matrix, the previous arguments apply to it. The intuition is now to apply a covering argument to the range of pJ, which is a low-dimensional subspace and therefore does not require many elements to construct an epsilon-net. We demonstrate this argument in its entirety for compressible vectors.
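A minimal numerical sketch of this decomposition (the values of n and p are arbitrary, not from the paper): the loop-free adjacency matrix splits into a mean-zero part with small spectral norm plus the rank-one deterministic part pJ, which is why only a one-dimensional net is needed for its range.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 300, 0.4
# directed Erdos-Renyi graph without loops: each edge (i, j), i != j,
# is present independently with probability p
A = (rng.random((n, n)) < p).astype(float)
np.fill_diagonal(A, 0.0)

J = np.ones((n, n))
EA = p * (J - np.eye(n))         # expectation of A (loops excluded)
centered = A - EA                # mean-zero part with subgaussian entries

# p*J is rank one, while the centered part has spectral norm O(sqrt(n)),
# far below the norm p*n of the deterministic part
assert np.linalg.matrix_rank(p * J) == 1
assert np.linalg.norm(centered, 2) < np.linalg.norm(p * J, 2)
```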
Lemma 4.1. There exist constants a, b, c_{4.1} ∈ (0, 1) and K > 2 such that the following holds for any λ ∈ C with |λ| ≤ K√n. Proof. Let u ∈ C^n. Since A − EA is mean-zero, the compressible-vector estimate holds with an additional shift by the fixed vector u; this version is well known and can be found in [95, Proposition 4.2]. Now, let N be a (c/2)√n-net of {t(1, . . . , 1)^T : t ∈ [−n, n]} of size at most C√n. On the event that there is a z ∈ Comp(a, b) such that ∥(A − λ)z∥ ≤ (c/2)√n, there must exist a z′ ∈ N with ∥z′ − Jz∥ ≤ (c/2)√n for which the shifted estimate applies. By a union bound, the above event happens with probability at most |N|e^{−cn} ≤ e^{−c′n}.
This trick of discretizing the range of J can be applied to all the covering arguments from the previous section; we leave the details to the reader. Note that in the analogous covering argument for the complex disk, we still require the complex shifts to be of norm at most K√n. This allows us to conclude that eigenvectors of A with corresponding eigenvalues in that disk have no arithmetic structure. The analogous multi-scale argument then allows us to conclude the following. One small complication that we have glossed over is that, since we allow the adjacency matrix to be defined with zero diagonal, not all the entries are iid. It is easy to show that this does not alter the argument much. We show that removing a single coordinate of a vector cannot alter the LCD significantly. Lemma 4.3. Let z ∈ Incomp_C(a, b) and let z′ be the vector z with any single coordinate set to zero. There exists a constant c > 0 such that the LCD of z′ is comparable to that of z. Proof. Recall Definition 2.7. By incompressibility, ∥z′∥ ≥ b and D(z; L, 1) ≥ c√n. Let D = D(z′/∥z′∥; 2L, 1/4). Fix ε > 0. There exists θ with D ≤ θ ≤ D + ε witnessing the defining infimum for D. Comparing the distance of the rescaled vector to the integer lattice before and after zeroing the coordinate shows that the corresponding bound also holds for z. As this is true for any ε > 0, the claimed comparison follows. Applying Theorem 2.10 completes the proof.
Having established this, we leave it as an exercise to verify that all the structural results follow with only a slight change in the constants.
Remark 4.4. Using this same technique, all the structural results in Section 3 can be extended to random matrices with zero diagonal. We omit the obvious modifications.
Although the above structural results only apply to eigenvectors with corresponding eigenvalues in the centered disk of radius K√n in the complex plane, it is known that, with high probability, this disk contains all the eigenvalues of A but one. In other words, with high probability our structural results apply to all eigenvectors but one. Theorem 4.5 (Follows from Theorem 2.8 in [66]). Let N be an iid random matrix whose entries are centered and have unit variance and finite fourth moment. Let Ñ be the matrix N with the diagonal entries replaced with zeros. Then for any p ∈ (0, 1) and any δ > 0, with probability 1 − o(1), all the eigenvalues of Ñ + pJ and N + pJ are contained in the disk {z ∈ C : |z| ≤ (1 + δ)√n}, with a single exception which takes the value pn + o(√n).
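The spectral picture of Theorem 4.5 can be illustrated numerically (arbitrary n and p; note that the centered entries of an adjacency matrix have variance p(1 − p) rather than 1, so the bulk here has radius about √(p(1 − p)n)):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 500, 0.4
A = (rng.random((n, n)) < p).astype(float)
np.fill_diagonal(A, 0.0)

eig = np.linalg.eigvals(A)
i = np.argmax(np.abs(eig))
outlier, bulk = eig[i], np.delete(eig, i)
# the lone outlier is real and sits near p*n, while the bulk stays in a
# disk of radius about sqrt(p*(1-p)*n)
```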
To deduce some structural properties for the eigenvector of the lone eigenvalue outside the disk, we use the Perron-Frobenius theorem. Recall the following definition.

Structure of Scaled Eigenvectors
5.1. Eigenvector structure. The structural results on eigenvectors from the previous section do not immediately apply to the Hadamard product of our eigenvectors with a fixed vector. There are two issues that need to be overcome in this section. The first is to deal with the possibly inhomogeneous values of the entries of b. In other words, although we have shown that any eigenvector u has no arithmetic structure, to handle the most general version we must show that, for our fixed vector b, b ⊙ u has no arithmetic structure. Here, we recall that b ⊙ u denotes the Hadamard product of b and u. The second difficulty is that there is a small set of uncontrolled coordinates in b. In this section, we demonstrate how to deduce our main theorem from the arguments in the eigenvector structure theorem; as this requires repeating most of the steps from the previous section, we only sketch the argument here.
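As a small sanity check of the objects involved (purely illustrative, with arbitrary random vectors), the Hadamard product links the bilinear form b^T u to the coordinate sums whose anti-concentration is studied below:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 10
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)

had = b * u                  # Hadamard product b ⊙ u (entrywise)
# the bilinear form b^T u equals the sum of the coordinates of b ⊙ u,
# so anti-concentration for b^T u reduces to the arithmetic structure
# of the single vector b ⊙ u
assert np.allclose(b @ u, had.sum())
```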
By absorbing the error probabilities of both Lemma 2.3 and Lemma 2.6 into our final error bound, we can assume that our approximate null vectors are incompressible and have incompressible real part.
We recall the following condition on our fixed complex vector b.
Definition 5.1. Let B ≥ 1 be a constant. We say the vector b is (B, m)-delocalized if all but at most m entries of b satisfy (5.1). This is a more general definition than that used in [69], as we do not require the entries to be rational.
Definition 5.2. Let n_0 = n − m. For u ∈ C^n, we let u ∈ C^{n_0} denote the vector formed by the first n_0 coordinates of u. We define the function F, which takes u ∈ C^{n_0} to the entrywise product (b_1 u_1, . . . , b_{n_0} u_{n_0}). We restrict our attention to those vectors in C^n that have no zero coordinates. This poses no difficulty, as we can infinitesimally shift any of our net points to avoid this measure-zero set. Therefore, on this slightly restricted domain, our mapping F is one-to-one, so we can meaningfully speak of the inverse map F^{−1}.
Without loss of generality, we assume that the first n_0 = n − m entries of b satisfy (5.1). Therefore, we can assume a lower bound on the norm of the restricted vector u, where the first inequality follows from the incompressibility of u and from assuming that ν is smaller than a/2, say; here we remind the reader that ν appears in the statement of Theorem 1.6 and a is the constant from Lemma 2.3. This assumption also guarantees a matching bound for ∥F(u)∥. Having established the notation, we briefly summarize the proof idea. We condition on the event that all our potential eigenvectors lie in the unstructured subset of the sphere, and consider the set U of such potential eigenvectors. The goal is to show that for any eigenvector u, F(u) has no arithmetic structure. This is done with a covering argument similar to that of Section 2. In fact, we have already constructed fine nets of the structured vectors on the unit sphere. We then show that F^{−1} maps such a net to a fine net of our potential eigenvectors. If there exists an eigenvector u ∈ U such that F(u) is structured, then there exists a vector v in our net such that v is structured (due to its proximity to F(u)) and F^{−1}(v) is an approximate eigenvector, since F^{−1}(v) ≈ u. This can be converted into a statement about approximate eigenvectors by discretizing the possible eigenvalues and tensorizing, as we have already seen. Finally, the probability that F^{−1}(v) is an approximate eigenvector is small enough to survive the union bound over all possible v in our net.
Several subtleties have been overlooked in this description of our proof. F(u) does not necessarily have norm 1, but accounting for this typically only adds a single dimension to our epsilon-nets. Additionally, our notion of structure actually encompasses several parameters (e.g. compressibility, LCD, real-imaginary correlation), so our argument needs to deal with these separately, as in Section 2. Fortunately, many of the calculations can be recycled. Due to the similarities, we only provide full details for a few representative lemmas.
For now we fix a complex number λ with |λ| ≤ K √ n.
Lemma 5.3. There exist constants a′, b′ ∈ (0, 1) such that the stated bound holds. Proof. We use I to denote Incomp_C(a, b), and consider the event that F(u)/∥F(u)∥_2 ∈ Comp_C(a′, b′). By the standard volume argument, there exists a b′/8B²-net N̂ of Comp(a′, b′) of the appropriate size. By (5.2), it suffices to consider a bounded range of norms, and we use a union of discrete scalings of N̂ to create a net N of the relevant set C. To see that this is a b′/4B-net of C, take w ∈ C and let v be a scaled net point closest to w. With a simple trick, we can modify N so that N ⊂ C at the cost of weakening N to a b′/2B-net. The procedure is as follows: for every v ∈ N, if there is an element of C within distance b′/8, replace v with that element, choosing one arbitrarily if there are multiple options; if there is no element of C within b′/8, remove v from N. It is easy to verify that this modified N is a b′/2B-net of C of at most the original size. We use F^{−1}(N) to approximate the first n_0 coordinates, and combine this with a simple volume net: there exists a b′/2-net N′ of K · B_m(0) (where B_m(0) is the unit ball in C^m) of size at most (CK/b′)^{2m}. We define our final net N′′ accordingly, which is of size at most exp(a′n_0 log(e/a′) + 3a′n_0 log(CB²/b′) + 2m log(CK/b′)).
By the triangle inequality, N′′ is a b′-net of the eigenvectors u; therefore, the corresponding net point v satisfies ∥(N − λ)v∥ ≤ Kb′√n. On the other hand, by a standard tensorization argument (cf. [78, Lemma 3.2]), for any v ∈ N′′ this event has exponentially small probability for small enough b′. Thus, the lemma follows by a union bound, where the last line follows from choosing a′, b′ small enough and noting that m = o(n).
The same approximation procedure yields analogues of all the lemmas in Section 2. We illustrate this with one more example.
Let D ∈ [√n/α, D_0] be such that m ≤ νn/log D. Recall the definition of S_{D,d,α} in Definition 2.14.
Proof. The proof follows the same strategy as the previous one. We generate a net of the set of vectors F(u) with u ∈ Incomp_C(a, b) and F(u)/∥F(u)∥ ∈ S_{D,d,α}. By Proposition 2.17, there exists a µ√n/D-net N′ of S_{D,d,α} of the appropriate size. We then define a net N composed of discrete scalings of N′. To estimate the remaining m coordinates we use a trivial net: there is a µ√n/2D-net of K · B_m(0) of size at most (CBD/µ√n)^{2m}. We combine this with F^{−1}(N) to create a 4µ√n/D-net of those approximate null vectors u with F(u)/∥F(u)∥_2 ∈ S_{D,d,α}; we call this net N̂. For any vector u such that F(u)/∥F(u)∥ ∈ S_{D,d,α} and ∥Mu∥ ≤ Kµn/2D, there exists a v ∈ N̂ with ∥Mv∥ ≤ Kµn/D.

By Theorem 2.19 and the proof therein, the probability bound for a fixed net point follows.
The small-ball probability estimate follows from Proposition 2.15 and Proposition 2.19. The third-to-last inequality is the crucial step that determines the trade-off between m and D.
Combining the analogous propositions and lemmas yields the analogous structure theorem for approximate null vectors. Finally, to conclude the same structure theorem for eigenvectors, we use the approximation argument from Section 3. Ultimately, this leads to the following structural theorem.
We provide one specific choice of m and t to demonstrate possible consequences of this theorem.
An identical series of theorems can be proved for the adjacency matrix case using the approximation techniques of Section 3.
Again, we would like to extend the range of effective bounds by combining our bounds at different scales, as we did at the end of Section 3. Due to the dependence of m on D, there is an extra complication.
Proof. Let d_k = c_{5.5}√n·2^k. By our choice of m, for any d_k with k ∈ N such that d_k ∈ [c_{5.5}√n, D′], we can apply Theorem 5.5 to conclude that, with probability at least 1 − e^{−c_{5.5}n}, for b a (B, m)-delocalized vector, every eigenvector v of N satisfies the stated bound for all t ≥ 0. On this event, the bound extends to any D ∈ [d_k, d_{k+1}). Taking a union bound over k ∈ N with d_k ∈ [c_{5.5}√n, D′] concludes the proof. Now, we allow D to vary with t to boost our result to all scales. Proof. This result follows from applying Corollary 5.7 with D′ = e^{νn/m} and D = √n/t, restricting t so that m ≤ νn/log D, as required in Corollary 5.7.

Completing the Proofs and Deducing Controllability
This section is devoted to the proofs of our main results and their corollaries. The key tool is the following proposition. Proposition 6.1. Let N be an iid matrix with symmetric atom variable ξ that satisfies Assumption 1.3. Fix constants B, K ≥ 1. Then there exist positive constants c⋆, ν, C_{6.1}, c_{6.1} depending on B, K, and ξ such that the following holds. Let m ≤ ν√n. For a (B, m)-delocalized vector b ∈ C^n and for any t ≥ e^{−νn/m}, P(∃ a unit eigenvector v of N such that |b^T v| ≤ t) ≤ C_{6.1}nt + P(E_K^c). Remark 6.2. The above proposition also applies when the matrix N is an iid matrix except with zeros along the diagonal. This is a simple consequence of Remark 4.4 and the proof of Proposition 6.1. We omit the details.
We prove Proposition 6.1 in Section 6.4 below. Theorem 1.6 follows immediately from Proposition 6.1. Theorem 1.5 is a consequence of Theorem 1.6 since the all-ones vector is (B, 0)-delocalized for any B ≥ 1.
6.1. Controllability. While Definition 1.12 gives Kalman's rank condition for the pair (A, b) to be controllable, it is not the most convenient criterion to check. Instead, in this section we focus on the Popov-Belevitch-Hautus (PBH) test. This test was developed independently by Popov [71], Belevitch [8], Hautus [42], Rosenbrock [75], Hahn [47, p. 27], Johnson [44], Ford and Johnson [35], and Gilbert [39]. The version presented below appears as Theorem 2.4-8 in [46]. In order to study the probability that (A, b) is controllable, the PBH test allows us to study the probability that a left eigenvector of A is orthogonal to b. In fact, if A is an iid matrix (or the adjacency matrix of a directed Erdős-Rényi random graph), A and A^T have the same distribution, and it suffices to study the probability that a (right) eigenvector of A is orthogonal to b. In order to do so, we will apply Proposition 6.1.
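The two criteria can be sketched side by side. This is an illustrative implementation, not taken from the paper; the `tol` threshold is an arbitrary numerical cutoff, and eigenvectors of A^T serve as the left eigenvectors of A.

```python
import numpy as np

def kalman_controllable(A, b):
    # Kalman rank condition: rank [b, Ab, ..., A^{n-1} b] = n
    n = A.shape[0]
    cols = [b]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    return np.linalg.matrix_rank(np.column_stack(cols)) == n

def pbh_controllable(A, b, tol=1e-8):
    # PBH test: (A, b) is controllable iff no left eigenvector of A is
    # orthogonal to b; the eigenvectors of A^T are left eigenvectors of A
    _, W = np.linalg.eig(A.T)
    return bool(np.min(np.abs(W.T @ b)) > tol)
```

For instance, a random Gaussian pair (A, 1) passes both tests, while the degenerate pair (I, e_1) fails both, since every standard basis vector other than e_1 is a left eigenvector orthogonal to e_1.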
In view of Theorem 6.3, by taking t as small as possible, Proposition 6.1 allows us to bound the probability that (A, b) is uncontrollable. Indeed, we immediately obtain the following corollary for an iid matrix. Corollary 6.4. Let N be an iid matrix with symmetric atom variable ξ that satisfies Assumption 1.3. Fix constants B, K ≥ 1. Then there exist positive constants ν, C_{6.4}, c_{6.4} depending on B, K, and ξ such that the following holds. Let m ≤ ν√n. For a (B, m)-delocalized vector b ∈ C^n, P((N, b) is uncontrollable) ≤ C_{6.4}e^{−c_{6.4}n} + P(E_K^c). Corollary 1.13 now follows immediately from Corollary 6.4 and (1.4). We finish this subsection with a proof of Corollary 1.17.
Proof of Corollary 1.17. Recall that 1 is the all-ones vector. Let B := A − 1 2 J, where J = 11 T is the all-ones matrix. If v is an eigenvector of A that is orthogonal to 1, then v must also be an eigenvector of B (since Jv = 0). We will work with the matrix N := 2(B + 1 2 I), where I is the identity matrix. The matrix N has the same eigenvectors as B (since shifting by a multiple of the identity matrix and scalar multiplication do not change the eigenvectors), and the entries of N are iid Rademacher random variables, except for the diagonal entries which are identically zero.
By Proposition 6.1 and Remark 6.2, (N, 1) is uncontrollable with probability at most Ce^{−cn} for some C, c > 0, since the entries of N are subgaussian. Hence, (B, 1) is uncontrollable with at most the same probability. From the controllability of (B, 1) we can conclude the controllability of (A, 1), via the chain of implications relating eigenvectors of A orthogonal to 1 to eigenvectors of B described above, which completes the proof.
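The construction above can be checked numerically on a small example (n is arbitrary): starting from a loop-free adjacency matrix with p = 1/2, the matrix N = 2(B + ½I) = 2A − J + I has independent Rademacher off-diagonal entries and zero diagonal.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A = (rng.random((n, n)) < 0.5).astype(float)   # Erdos-Renyi, p = 1/2
np.fill_diagonal(A, 0.0)                       # no loops

J = np.ones((n, n))
B = A - 0.5 * J
N = 2.0 * (B + 0.5 * np.eye(n))                # N = 2A - J + I

# off-diagonal entries of N are Rademacher signs, the diagonal is zero
off = N[~np.eye(n, dtype=bool)]
assert set(np.unique(off)) <= {-1.0, 1.0}
assert np.all(np.diag(N) == 0.0)
```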
6.2. Random Vectors: Proofs of Corollaries 1.14 and 1.18. In order to prove Corollaries 1.14 and 1.18, we will need the following lemma.
Lemma 6.5. Let ξ be a real-valued random variable with mean zero, unit variance, and finite fourth moment. Let N be the n × n iid random matrix with atom variable ξ. Let ψ be a real-valued random variable that satisfies Assumption 1.3, and assume b ∈ R^n is a random vector with entries that are iid copies of ψ. Then, with probability 1 − o(1), b^T v ≠ 0 for every unit eigenvector v of N. Proof. In view of (1.4), there exists a constant K > 1 so that E_K holds with probability 1 − o(1). We say the eigenvalues of N are simple if N has n distinct eigenvalues (each with multiplicity one). Let S denote the event that the eigenvalues of N are simple. It follows from Theorem 1.9 that S holds with probability 1 − o(1). Let E denote the event that there exists a unit eigenvector v of N with ρ(v, 0) > e^{−c_{3.4}n}.
It follows from Corollary 3.4 that P(E) ≤ P(E ∩ E_K) + P(E_K^c) = o(1). Therefore, it suffices to work on the event S ∩ E^c, which holds with probability 1 − o(1). On the event S, N has n distinct eigenvectors, determined uniquely up to sign. Let v_1, . . . , v_n denote the unit eigenvectors of N on the event S. Since the choice of sign for each eigenvector does not affect whether b^T v_i is zero or not, we adopt the convention that each eigenvector v_i is multiplied by a random sign, independent of all other sources of randomness. On the event E^c, ρ(v_i, 0) ≤ e^{−c_{3.4}n} for all i ∈ [n], so by the union bound, the probability that b^T v_i = 0 for some i ∈ [n] is o(1). The proof of the lemma is complete. Corollary 1.14 now follows from Lemma 6.5 and Theorem 6.3. Similarly, Corollary 1.18 follows from the following lemma. Lemma 6.6. Let A be the n × n adjacency matrix of an Erdős-Rényi directed graph with constant edge probability p ∈ (0, 1). Let ψ be a real-valued random variable that satisfies Assumption 1.3, and assume b ∈ R^n is a random vector with entries that are iid copies of ψ.
Proof. The argument is identical to the proof of Lemma 6.5 except for the following changes: • One must use Theorem 1.11 instead of Theorem 1.9.
• Instead of Corollary 3.4 one needs to apply Theorem 4.2.
• It only remains to address the eigenvector, v, associated to the largest eigenvalue. By Theorems 4.8 and 4.9, the eigenvector is entirely positive, which in particular implies that each entry is non-zero. We now appeal to an anti-concentration inequality which is a generalization of the classical result of Erdős-Littlewood-Offord.
Therefore, applying the lemma with r = min_i v_i > 0 yields the desired bound, where C only depends on p.
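The Erdős-Littlewood-Offord bound invoked here can be illustrated by a small simulation (the coefficient distribution and sample sizes below are arbitrary choices): when every coefficient has magnitude at least r, a Rademacher-weighted sum lands in an interval of length 2r with probability O(1/√n).

```python
import numpy as np

rng = np.random.default_rng(5)
n, trials = 100, 20000
a = rng.uniform(1.0, 2.0, size=n)      # coefficients bounded below by r = 1
eps = rng.choice([-1.0, 1.0], size=(trials, n))
S = eps @ a                            # trials independent signed sums

r = 1.0
p_hat = np.mean(np.abs(S) <= r)
# Erdos-Littlewood-Offord: if |a_i| >= r for all i, then
# P(|sum_i eps_i a_i| <= r) = O(1/sqrt(n))
```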
6.3. Minimal Controllability. Our eigenvector structure results can quickly lead to a result on minimal controllability.
Proof of Corollary 1.16. By symmetry, it suffices to bound the probability that (N, e_1) is uncontrollable. In view of (1.4), it suffices to upper bound P((N, e_1) is uncontrollable and E_K) for some sufficiently large constant K > 1. Let us decompose our matrix N in block form as N = ( N_{11} X^T ; Y N′ ), where N_{11} denotes the (1, 1)-entry of N, X, Y ∈ R^{n−1}, and N′ is an (n − 1) × (n − 1) matrix. Moreover, N_{11}, X, Y, and N′ are jointly independent. Using Theorem 6.3, we need to upper bound the probability that a unit eigenvector of N is orthogonal to e_1. The key observation is that if there exists a unit eigenvector v = (v_1, v′)^T that is orthogonal to e_1, then v′ is a unit eigenvector of N′ and X^T v′ = 0. Thus, it suffices to show (6.1). Since the entries of X are iid random variables, independent of N′, and satisfy Assumption 1.3, the bound in (6.1) follows from Lemma 6.5; the proof is complete.
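The key observation in this proof can be verified on a toy example. For simplicity of illustration the minor N′ is taken symmetric so that its eigenpairs are real (the actual proposition concerns iid matrices): once the first row X is forced orthogonal to an eigenvector v′ of N′, the padded vector (0, v′) is an eigenvector of the full matrix orthogonal to e_1.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
M = rng.standard_normal((n - 1, n - 1))
Nprime = M + M.T                      # symmetric minor, so eigenpairs are real
lam, V = np.linalg.eigh(Nprime)
vprime = V[:, 0]                      # unit eigenvector of N'

# force the first row X to be orthogonal to v': this is exactly the
# degenerate event whose probability is bounded in the proof
X = rng.standard_normal(n - 1)
X = X - (X @ vprime) * vprime         # v' is a unit vector

N = np.zeros((n, n))
N[0, 0] = rng.standard_normal()
N[0, 1:] = X
N[1:, 0] = rng.standard_normal(n - 1)
N[1:, 1:] = Nprime

v = np.concatenate(([0.0], vprime))   # v is orthogonal to e_1 ...
assert np.allclose(N @ v, lam[0] * v) # ... and is an eigenvector of N
```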
The proof of Corollary 1.19 follows from a nearly identical argument.
Proof of Corollary 1.19. The proof is identical to that of Corollary 1.16 except for the following points.
• We use Lemma 6.6 instead of Lemma 6.5 to handle all eigenvectors with eigenvalues within a disk of radius 2√n. • By Theorem 4.5, it only remains to address the eigenvector, v, associated to the largest eigenvalue. By Theorems 4.8 and 4.9, this eigenvector is entirely positive with high probability, which in particular implies that e_i^T v ≠ 0.
6.4. Proof of Proposition 6.1. This section is devoted to the proof of Proposition 6.1.
The main idea is to utilize the symmetry of the atom distribution of N to rewrite the dot product b T v as a small ball probability (in particular, conditioned on the matrix N , we rewrite the dot product as a sum of independent random variables). The same idea was exploited in [69] to study the controllability of real symmetric random matrices.
Proof of Proposition 6.1. Let ξ = (ε 1 , . . . , ε n ), where ε 1 , . . . , ε n are iid Rademacher random variables, independent of N , i.e., each ε i takes the values ±1 with probability 1/2. We say the eigenvalues of N are simple if N has n distinct eigenvalues (each with multiplicity one). Let S be the event that the eigenvalues of N are simple and that E K holds. We have P(∃ a unit eigenvector v of N such that |b T v| ≤ t) ≤ P(∃ a unit eigenvector v of N such that |b T v| ≤ t and S) + P(S c ), and by Theorem 1.9 P(S c ) ≤ Ce −cn + P(E c K ). We now turn our attention to bounding P(∃ a unit eigenvector v of N such that |b T v| ≤ t and S).
On the event S, N has n distinct unit eigenvectors v 1 , . . . , v n , which are determined uniquely up to sign. As the choice of sign does not change the value of |b T v i |, we will simply assume that each eigenvector is multiplied by a random sign, independent of all other sources of randomness. Then P(∃ a unit eigenvector v of N such that |b T v| ≤ t and S) ≤ P(∃i ∈ [n] such that |b T v i | ≤ t and S). We can now exploit the fact that the entries of N = (N ij ) n i,j=1 are symmetric random variables. Indeed, let N ′ = (ε i ε j N ij ) n i,j=1 . A simple calculation shows that the eigenvalues of N ′ are the same as the eigenvalues of N . In addition, when v 1 , . . . , v n are the eigenvectors of N , then v 1 ⊙ ξ, . . . , v n ⊙ ξ are the eigenvectors of N ′ . Here, u ⊙ v denotes the Hadamard product of the vectors u = (u i ) and v = (v i ) defined by u ⊙ v = (u i v i ). Since the atom variable of N is symmetric, it follows that N ′ is an iid matrix and that N ′ has the same distribution as N . This implies that the eigenvectors v 1 ⊙ ξ, . . . , v n ⊙ ξ have the same distribution as v 1 , . . . , v n . Hence, we conclude that P(∃i ∈ [n] such that |b T v i | ≤ t and S) = P(∃i ∈ [n] such that |b T (v i ⊙ ξ)| ≤ t and S).
The probability that |b^T (v_i ⊙ ξ)| ≤ t can be bounded above by the small ball probability ρ(b ⊙ v_i, t), and so we can now apply Theorem 5.8. Indeed, Theorem 5.8 guarantees the existence of an event E, which holds with probability at least 1 − O(e^{−c_{5.8}n}), such that, conditioned on this event, the eigenvectors v_1, . . . , v_n satisfy the small-ball estimate (6.3). Returning to (6.2), it suffices to bound P(∃i ∈ [n] such that |b^T (v_i ⊙ ξ)| ≤ t | S ∩ E).
Applying the union bound and (6.3) yields the desired conclusion.
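The symmetrization step in this proof — N′ = DND with D = diag(ε) and D² = I — can be checked numerically (the matrix size and atom distribution below are arbitrary): N′ is similar to N, so the two matrices share eigenvalues, and v is an eigenvector of N exactly when ε ⊙ v is an eigenvector of N′.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
N = rng.choice([-1.0, 1.0], size=(n, n))   # symmetric atom distribution
eps = rng.choice([-1.0, 1.0], size=n)      # independent Rademacher signs
Np = np.outer(eps, eps) * N                # N'_{ij} = eps_i * eps_j * N_{ij}

# N' = D N D with D = diag(eps), so N and N' have the same
# characteristic polynomial, hence the same eigenvalues
lam, V = np.linalg.eig(N)
assert np.allclose(np.poly(N), np.poly(Np))

v = V[:, 0]
w = eps * v                                # Hadamard product eps ⊙ v
assert np.allclose(Np @ w, lam[0] * w)     # eps ⊙ v is an eigenvector of N'
```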
As a consequence, we obtain the following. Proof. We begin with the case λ_i ≠ λ_j, and let v_i and v_j be two corresponding eigenvectors. Then we can choose v = v_i. We choose w to be orthogonal to v and in the span of v_i and v_j, and write w = a_i v_i + a_j v_j. Therefore, Mw = a_i(λ_i − z√n)v_i + a_j(λ_j − z√n)v_j, so that with α := a_i(λ_i − λ_j) we have Mw − αv = (λ_j − z√n)w. Furthermore, ∥Mw − αv∥ = ∥(λ_j − z√n)w∥ ≤ s. If λ_i = λ_j but the geometric multiplicity is at least two, then the above argument still applies, since we can find linearly independent eigenvectors v_i ≠ v_j. Thus, the only remaining case is when λ = λ_i = λ_j and the geometric multiplicity of λ is one. By the Jordan canonical form, there exist v_i ≠ v_j such that Mv_i = (λ − z√n)v_i and Mv_j = (λ − z√n)v_j + v_i. Using the notation w = a_i v_i + a_j v_j for a vector orthogonal to v_i, we have Mw = (λ − z√n)w + a_j v_i, so we can take α = a_j and complete the proof as above.
The next lemma allows us to bound the tails of the least singular value and the second smallest singular value. Proof. We begin with the first case, in which ∥Mv∥ ≤ s and ∥Mw∥ ≤ ∥Mw − αv∥ + ∥αv∥ ≤ 2s. As v and w are orthogonal, we have s_n(M) ≤ s and s_{n−1}(M) ≤ 2s. Now we assume that |α| > s. In this case we still have s_{n−1}(M) ≤ 2s. Also, s_n(M) ≤ s_2(M|_{span(v,w)}) ≤ dist(Mv, span(Mw)).
To evaluate the right-hand side of this inequality, we consider vectors X′ orthogonal to H_{n,n−1} with |⟨X′, X_{n−1}⟩| < t′. By Theorem 2.1 and Theorem 2.10, with probability at least 1 − e^{−cn}, any vector v orthogonal to H_n satisfies ρ(v, t) ≤ C_δ(t + e^{−cn})².
We denote this event by E. One can easily check that the proof of Theorem 2.1 applies equally well to vectors orthogonal to H_{n,n−1}, so with probability at least 1 − e^{−cn}, any vector x orthogonal to H_{n,n−1} satisfies ρ(x, t) ≤ C_δ(t + e^{−cn})².
We call this event E′. Therefore, P(dist(X_n, H_n) < t, dist(X_{n−1}, H_{n,n−1}) < t′ and E_K) ≤ P(|⟨X, X_n⟩| < t, E_K and E) · P(|⟨X′, X_{n−1}⟩| < t′, E_K and E′) + P(E^c) + P(E′^c), where the last line follows from the independence of X_n, X_{n−1} and H_{n,n−1}. The result follows after reducing c⋆ if necessary.
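The linear-algebra fact driving the singular-value lemma above — two orthogonal (approximate) null vectors force the two smallest singular values to be small — can be illustrated as follows. For simplicity the illustration plants exact null vectors; the matrix and vectors are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 8
A = rng.standard_normal((n, n))

# build an orthonormal pair v, w
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
w = rng.standard_normal(n)
w -= (w @ v) * v
w /= np.linalg.norm(w)

# M agrees with A off span(v, w) but annihilates that 2-dim subspace
M = A - np.outer(A @ v, v) - np.outer(A @ w, w)
s = np.linalg.svd(M, compute_uv=False)
# two orthogonal null vectors force both s_n(M) and s_{n-1}(M) to vanish
assert s[-1] < 1e-10 and s[-2] < 1e-10
```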
Finally, we recall a tail bound for real shifts.
An analogous argument using Proposition A.7 instead of Proposition A.6 yields the second result.