## Electronic Communications in Probability

### A note on the Pennington-Worah distribution

S. Péché

#### Abstract

This paper gives a new expression for the so-called Pennington-Worah distribution, which characterizes the asymptotic empirical eigenvalue distribution of some nonlinear random matrix ensembles. More precisely, consider $M = \frac{1}{m} YY^{*}$ with $Y = f(WX)$, where $W$ and $X$ are random rectangular matrices with i.i.d. centered entries. The function $f$ is applied pointwise and can be seen as an activation function in (random) neural networks. The asymptotic empirical eigenvalue distribution of this ensemble has been computed in [16] and [3]. Here it is related to the Marchenko-Pastur distribution and to information-plus-noise matrices.
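
As an illustration of the ensemble described in the abstract, the following is a minimal numerical sketch, not the construction used in the paper. The function name `nonlinear_gram_eigenvalues`, the dimensions `n0`, `n1`, `m`, the Gaussian entries, the choice $f = \tanh$, and the $1/\sqrt{n_0}$ scaling inside $f$ (added so that the entries of $WX$ stay of order one) are all assumptions made for this example; the abstract itself only specifies i.i.d. centered entries and a pointwise $f$.

```python
import numpy as np

def nonlinear_gram_eigenvalues(n0, n1, m, f=np.tanh, seed=0):
    """Eigenvalues of M = (1/m) Y Y^T with Y = f(W X / sqrt(n0)).

    Assumptions for illustration only: standard Gaussian entries,
    tanh activation, and the 1/sqrt(n0) normalization inside f.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n1, n0))   # i.i.d. centered entries
    X = rng.standard_normal((n0, m))    # i.i.d. centered entries
    Y = f(W @ X / np.sqrt(n0))          # f applied pointwise
    M = (Y @ Y.T) / m                   # real symmetric, positive semidefinite
    return np.linalg.eigvalsh(M)        # eigenvalues in ascending order

eigs = nonlinear_gram_eigenvalues(n0=200, n1=150, m=300)
print(eigs.min(), eigs.max())
```

A histogram of `eigs` for large dimensions with fixed ratios gives an empirical picture of the limiting distribution studied in [16] and [3].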

#### Article information

Source
Electron. Commun. Probab., Volume 24 (2019), paper no. 66, 7 pp.

Dates
Accepted: 21 August 2019
First available in Project Euclid: 31 October 2019

https://projecteuclid.org/euclid.ecp/1572509100

Digital Object Identifier
doi:10.1214/19-ECP262

Zentralblatt MATH identifier
07126981

Subjects
Primary: 60E05: Distributions: general theory

Keywords
random matrices, machine learning

#### Citation

Péché, S. A note on the Pennington-Worah distribution. Electron. Commun. Probab. 24 (2019), paper no. 66, 7 pp. doi:10.1214/19-ECP262. https://projecteuclid.org/euclid.ecp/1572509100

#### References

• [1] Akemann, G., Ipsen, J. R. and Kieburg, M. Products of rectangular random matrices: Singular values and progressive scattering. Phys. Rev. E, 88, (2013), 052118.
• [2] Benaych-Georges, F. On a surprising relation between the Marchenko-Pastur law, rectangular and square free convolutions. Ann. Inst. Henri Poincaré Probab. Stat., 46, no. 3, (2010), 644–652.
• [3] Benigni, L. and Péché, S. Eigenvalue distribution of non linear models of matrix ensembles. arXiv preprint, (2019).
• [4] Capitaine, M. Limiting eigenvectors of outliers for spiked information-plus-noise type matrices. Lecture Notes in Math. Springer, Séminaire de Probabilités, XLIX, (2018), 119–164.
• [5] Capitaine, M. Exact separation phenomenon for the eigenvalues of large information-plus-noise type matrices, and an application to spiked models. Indiana Univ. Math. J., 63, (2014), 1875–1910.
• [6] Capitaine, M. Deformed ensembles, polynomials in random matrices and free probability theory. Habilitation thesis, HAL Id: tel-01978065, version 1, (2017).
• [7] Cébron, G., Dahlqvist, A. and Male, C. Universal constructions for spaces of traffics. arXiv preprint, (2016).
• [8] Choromanska, A., Henaff, M., Mathieu, M., Ben Arous, G. and LeCun, Y. The loss surfaces of multilayer networks. Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, AISTATS 2015, (2015).
• [9] Cirac, C., Cranmer, K., Daudet, L., Schuld, M., Vogt-Maranto, L. and Zdeborová, L. Machine learning and the physical sciences. arXiv preprint arXiv:1903.10563, (2019).
• [10] Dozier, B. and Silverstein, J. On the empirical distribution of eigenvalues of large dimensional information-plus-noise-type matrices. Journal of Multivariate Analysis, 98, (2007), 678–694.
• [11] Dupic, T. and Castillo, I. P. Spectral density of products of Wishart dilute random matrices. Part I: the dense case. arXiv preprint, (2014).
• [12] El Karoui, N. The spectrum of kernel random matrices. Ann. Statist., 38, no. 1 (2010), 1–50.
• [13] Hanin, B. and Nica, M. Products of many large random matrices and gradients in deep neural networks. arXiv preprint, (2018).
• [14] Louart, C., Liao, Z. and Couillet, R. A random matrix approach to neural networks. Ann. Appl. Probab., 28, no. 2 (2018), 1190–1248.
• [15] Marčenko, V. A. and Pastur, L. A. Distribution of eigenvalues in certain sets of random matrices. Mat. Sb. (N.S.), 72, no. 114 (1967), 507–536.
• [16] Pennington, J. and Worah, P. Nonlinear random matrix theory for deep learning. Advances in Neural Information Processing Systems, (2017), 2637–2646.