The Annals of Statistics
- Ann. Statist.
- Volume 40, Number 2 (2012), 639-665.
Kullback–Leibler aggregation and misspecified generalized linear models
In a regression setup with deterministic design, we study the pure aggregation problem and introduce a natural extension from the Gaussian distribution to distributions in the exponential family. While this extension bears strong connections with generalized linear models, it does not require identifiability of the parameter or even that the model on the systematic component is true. It is shown that this problem can be solved by constrained and/or penalized likelihood maximization, and we derive sharp oracle inequalities that hold both in expectation and with high probability. Finally, all the bounds are proved to be optimal in a minimax sense.
First available in Project Euclid: 17 May 2012
Primary: 62G08: Nonparametric regression
Secondary: 62J12: Generalized linear models; 68T05: Learning and adaptive systems [See also 68Q32, 91E40]; 62F11
Rigollet, Philippe. Kullback–Leibler aggregation and misspecified generalized linear models. Ann. Statist. 40 (2012), no. 2, 639--665. doi:10.1214/11-AOS961. https://projecteuclid.org/euclid.aos/1337268207
- Supplementary material: Minimax lower bounds. Under some convexity and tail conditions, we prove minimax lower bounds for the three problems of Kullback–Leibler aggregation: model selection, linear and convex. The proof consists of three steps: first, we identify a subset of admissible estimators; then, we reduce the problem to a usual problem of regression function estimation under the mean squared error criterion; and finally, we use standard minimax lower bounds to complete the proof.