The Annals of Statistics

Spatial aggregation of local likelihood estimates with applications to classification

Denis Belomestny and Vladimir Spokoiny


This paper presents a new method for spatially adaptive local (constant) likelihood estimation which applies to a broad class of nonparametric models, including the Gaussian, Poisson and binary response models. The main idea of the method is, given a sequence of local likelihood estimates (“weak” estimates), to construct a new aggregated estimate whose pointwise risk is of the order of the smallest risk among all “weak” estimates. We also propose a new approach to selecting the parameters of the procedure: they are chosen so that the resulting estimate shows a prescribed behavior in the simple parametric situation. We establish several theoretical results concerning the optimality of the aggregated estimate. In particular, our “oracle” result states that its risk is, up to a logarithmic multiplier, equal to the smallest risk for the given family of estimates. The performance of the procedure is illustrated by an application to the classification problem. A numerical study demonstrates its reasonable performance on simulated and real-life examples.
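To make the aggregation idea concrete, here is a minimal sketch for the Gaussian case with known noise variance. The abstract does not state the aggregation rule, so everything specific below is an illustrative assumption rather than the authors' procedure: the function names `local_constant_estimates` and `aggregate`, the uniform window, the linear mixing weight, and the single threshold `z`. The sketch computes local constant estimates over growing windows and then mixes them stagewise, accepting a wider-window estimate only while it stays statistically close to the current aggregate.

```python
import numpy as np


def local_constant_estimates(x, y, x0, bandwidths):
    """Weighted-mean ("weak") estimates at x0, one per bandwidth.

    Assumes a uniform window and that every window contains at least
    one design point. Returns the estimates and the local sample sizes.
    """
    thetas, sizes = [], []
    for h in bandwidths:
        w = (np.abs(x - x0) <= h).astype(float)  # uniform kernel weights
        n_k = w.sum()                            # local sample size
        thetas.append((w * y).sum() / n_k)       # local Gaussian MLE
        sizes.append(n_k)
    return np.array(thetas), np.array(sizes)


def aggregate(thetas, sizes, sigma2, z):
    """Stagewise convex mixing of the weak estimates (hypothetical rule).

    thetas must be ordered from the smallest window to the largest.
    z is an assumed threshold; larger z accepts wider windows more readily.
    """
    est = thetas[0]  # start from the most local estimate
    for theta_k, n_k in zip(thetas[1:], sizes[1:]):
        # Scaled Gaussian Kullback-Leibler distance between the new weak
        # estimate and the current aggregate.
        m = n_k * (theta_k - est) ** 2 / (2.0 * sigma2)
        # Assumed linear mixing weight in [0, 1]: 1 when the estimates
        # agree, 0 when the distance exceeds the threshold z.
        gamma = max(0.0, 1.0 - m / z)
        est = gamma * theta_k + (1.0 - gamma) * est
    return est
```

In this sketch a constant signal lets the aggregate follow the widest window, while a window that crosses a large jump is rejected and the aggregate keeps the earlier, more local value.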

Article information

Ann. Statist. Volume 35, Number 5 (2007), 2287-2311.

First available in Project Euclid: 7 November 2007

Primary: 62G05: Estimation
Secondary:
  • 62G07: Density estimation
  • 62G08: Nonparametric regression
  • 62G32: Statistics of extreme values; tail inference
  • 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords: Adaptive weights; local likelihood; exponential family; classification


Belomestny, Denis; Spokoiny, Vladimir. Spatial aggregation of local likelihood estimates with applications to classification. Ann. Statist. 35 (2007), no. 5, 2287--2311. doi:10.1214/009053607000000271.
