The Annals of Statistics

Functional aggregation for nonparametric regression

Anatoli Juditsky and Arkadii Nemirovski



We consider the problem of estimating an unknown function $f$ from $N$ noisy observations on a random grid. In this paper we address the following aggregation problem: given $M$ functions $f_1,\dots,f_M$, find an “aggregated” estimator which approximates $f$ nearly as well as the best convex combination $f^*$ of $f_1,\dots,f_M$. We propose algorithms which provide approximations of $f^*$ with expected $L_2$ accuracy $O(N^{-1/4}\ln^{1/4} M)$. We show that this approximation rate cannot be significantly improved. We discuss two specific applications: nonparametric prediction for a dynamic system with output nonlinearity and reconstruction in the Jones–Barron class.
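The aggregation problem in the abstract can be illustrated with a small sketch. The code below is not the authors' algorithm. It assumes squared loss and uses a generic exponentiated-gradient stochastic approximation over the simplex of convex weights, returning the average of the iterates. The names `aggregate_convex` and `draw_observation`, the default step size $\sqrt{\ln M / N}$, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def aggregate_convex(candidates, draw_observation, n_obs, step=None):
    """Estimate convex weights for `candidates` from `n_obs` noisy observations.

    Illustrative sketch only: exponentiated-gradient stochastic approximation
    on the squared loss, with averaging of the iterates. The default step size
    is an assumption, not a value taken from the paper.
    """
    m = len(candidates)
    if step is None:
        step = math.sqrt(math.log(max(m, 2)) / n_obs)

    log_w = [0.0] * m   # unnormalized log-weights, start from uniform weights
    avg = [0.0] * m     # running average of the weight vectors

    for _ in range(n_obs):
        # Map log-weights to a point of the simplex (numerically stable softmax).
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        w = [v / total for v in w]

        for j in range(m):
            avg[j] += w[j] / n_obs

        # One fresh observation (x, y) with y = f(x) + noise.
        x, y = draw_observation()
        values = [f(x) for f in candidates]
        residual = sum(wj * vj for wj, vj in zip(w, values)) - y

        # Stochastic gradient of (prediction - y)^2 with respect to the weights.
        for j in range(m):
            log_w[j] -= step * 2.0 * residual * values[j]

    return avg


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical candidates f_1, f_2, f_3 on [0, 1].
    candidates = [lambda x: x, lambda x: x * x, lambda x: 1.0 - x]

    def draw_observation():
        x = rng.random()  # random design point
        y = 0.7 * x + 0.3 * x * x + rng.gauss(0.0, 0.1)  # toy target plus noise
        return x, y

    weights = aggregate_convex(candidates, draw_observation, n_obs=5000)
    print(weights)
```

The returned weights are nonnegative and sum to one, so the aggregated estimator $\sum_j w_j f_j$ is itself a convex combination of the candidates. The logarithmic dependence on $M$ in the stated rate is the kind of behavior entropy-type updates on the simplex are known for, which is why this style of update was chosen for the sketch.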

Article information

Ann. Statist. Volume 28, Number 3 (2000), 681-712.

First available in Project Euclid: 12 March 2002

Primary: 62G08: Nonparametric regression; 62L20: Stochastic approximation

Keywords: Functional aggregation; convex optimization; stochastic approximation


Juditsky, Anatoli; Nemirovski, Arkadii. Functional aggregation for nonparametric regression. Ann. Statist. 28 (2000), no. 3, 681--712. doi:10.1214/aos/1015951994.



LMC, 51 rue de Mathématiques, Domaine Universitaire, BPS3, Grenoble Cedex 9, France

Faculty of Industrial Engineering and Management at Technion, Israel Institute of Technology, Technion City, Haifa 32000, Israel