Estimating a regression function in exponential families by model selection

Juntong Chen

doi:10.3150/23-BEJ1649

Abstract

Let $(W_{1}, Y_{1}), \dots, (W_{n}, Y_{n})$ be $n$ pairs of independent random variables. We assume that, for each $i \in \{1, \dots, n\}$, the conditional distribution of $Y_{i}$ given $W_{i}$ belongs to a one-parameter exponential family with parameter $\gamma^{\star}(W_{i}) \in \mathbb{R}$. The statistical goal is to estimate these conditional distributions. We consider a model selection procedure that relies on the general assumption that each of the models is VC-subgraph. We establish a non-asymptotic risk bound for the resulting estimator with respect to a Hellinger-type distance. By leveraging this result, we extend several findings previously explored in Gaussian regression to regression in exponential families. Specifically, we address the curse of dimensionality by imposing structural assumptions, such as general additive and multiple index structures, on $\gamma^{\star}$. We also study model selection for ReLU neural networks, and provide a concrete example of how ReLU neural networks can achieve a significantly faster convergence rate than traditional models. When $\gamma^{\star}$ is close to a composition of several Hölder functions, we show that under a suitable parametrization of the exponential family, our estimator achieves the same rate of convergence as in the Gaussian case. Combined with a lower bound, this shows that the rate is minimax optimal up to a logarithmic term. Finally, we apply the model selection procedure to address adaptation and variable selection problems in exponential families.

Funding Statement

The author was supported by the European Union’s Horizon 2020 research and innovation program under grant agreement No 811017.

Acknowledgments

The author is grateful to her supervisor Prof. Yannick Baraud for helpful discussions and constructive suggestions. The author also thanks the referees and the editors for their suggestions and comments, which have contributed to the improvement of this paper.

Citation


Juntong Chen. "Estimating a regression function in exponential families by model selection." Bernoulli 30 (2) 1669 - 1693, May 2024. https://doi.org/10.3150/23-BEJ1649

Information

Received: 1 April 2022; Published: May 2024

First available in Project Euclid: 31 January 2024

MathSciNet: MR4699568

Digital Object Identifier: 10.3150/23-BEJ1649

Keywords: generalized additive structure, model selection, multiple index structure, regression in exponential family, ReLU neural networks, variable selection
