Deep stable neural networks: Large-width asymptotics and convergence rates
Stefano Favaro, Sandra Fortini, Stefano Peluchetti
Bernoulli 29(3): 2574-2597 (August 2023). DOI: 10.3150/22-BEJ1553

Abstract

In modern deep learning, there is a recent and growing literature on the interplay between large-width asymptotic properties of deep Gaussian neural networks (NNs), i.e. deep NNs with Gaussian-distributed weights, and Gaussian stochastic processes (SPs). Motivated by empirical analyses showing the potential of replacing Gaussian distributions with Stable distributions for the NN’s weights, in this paper we present a rigorous analysis of the large-width asymptotic behaviour of (fully connected) feed-forward deep Stable NNs, i.e. deep NNs with Stable-distributed weights. We show that as the width goes to infinity jointly over the NN’s layers, i.e. in the “joint growth” setting, a rescaled deep Stable NN converges weakly to a Stable SP whose distribution is characterized recursively through the NN’s layers. Because of the non-triangular structure of the NN, this is a non-standard asymptotic problem, for which we propose an inductive approach of independent interest. We then establish sup-norm convergence rates of the rescaled deep Stable NN to the Stable SP, under both the “joint growth” and a “sequential growth” of the width over the NN’s layers. This result quantifies the difference between the “joint growth” and the “sequential growth” settings, showing that the former leads to a slower rate than the latter, depending on the depth of the layer and the number of inputs of the NN. Our work extends some recent results on infinitely wide limits for deep Gaussian NNs to the more general deep Stable NNs, providing the first result on convergence rates in the “joint growth” setting.
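To make the setting concrete, the following minimal Python sketch (not the paper's code; the tanh activation, the value α = 1.8, and the widths are illustrative assumptions) draws one realization of such a network. Weights are i.i.d. symmetric α-Stable, sampled via scipy.stats.levy_stable, and each hidden layer's sum over the width is rescaled by width^(−1/α), the Stable analogue of the 1/√width scaling used for Gaussian weights (the two scalings coincide at α = 2).

import numpy as np
from scipy.stats import levy_stable

def deep_stable_nn_draw(x, alpha=1.8, width=512, depth=3, seed=None):
    """One random draw of the last-layer pre-activations of a fully
    connected feed-forward NN with i.i.d. symmetric alpha-Stable weights.

    Hidden layers are rescaled by width**(-1/alpha); the first layer is
    not rescaled, since its fan-in (the input dimension) stays fixed.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # first layer: fan-in is the fixed input dimension
    W = levy_stable.rvs(alpha, 0.0, size=(width, x.shape[0]),
                        random_state=rng)
    f = W @ x
    # hidden layers: sums over the width, rescaled by width**(-1/alpha)
    for _ in range(depth - 1):
        W = levy_stable.rvs(alpha, 0.0, size=(width, width),
                            random_state=rng)
        f = width ** (-1.0 / alpha) * (W @ np.tanh(f))
    return f

Repeated draws of deep_stable_nn_draw at a fixed set of inputs, for increasing width, give an empirical view of the weak convergence to the limiting Stable SP described above.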

Acknowledgements

The authors are grateful to the Editor (Professor Mark Podolskij), the Associate Editor and three anonymous Referees for all their comments, corrections, and numerous suggestions that remarkably improved the paper. Stefano Favaro received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 817257. Stefano Favaro gratefully acknowledges the financial support from the Italian Ministry of Education, University and Research (MIUR), “Dipartimenti di Eccellenza” grant 2018-2022.

Citation


Stefano Favaro, Sandra Fortini, Stefano Peluchetti. "Deep stable neural networks: Large-width asymptotics and convergence rates." Bernoulli 29(3): 2574-2597, August 2023. https://doi.org/10.3150/22-BEJ1553

Information

Received: 1 December 2021; Published: August 2023
First available in Project Euclid: 27 April 2023

MathSciNet: MR4580928
zbMATH: 07691593
Digital Object Identifier: 10.3150/22-BEJ1553

Keywords: Bayesian inference, deep neural network, depth limit, exchangeable sequence, Gaussian stochastic process, infinitely wide limit, neural tangent kernel, spectral measure, Stable stochastic process, sup-norm convergence rate

JOURNAL ARTICLE
24 PAGES

