The Annals of Statistics

Smooth discrimination analysis

Enno Mammen and Alexandre B. Tsybakov



Discriminant analysis for two data sets in $\mathbb{R}^d$ with probability densities $f$ and $g$ can be based on the estimation of the set $G = \{x: f(x) \geq g(x)\}$. We consider applications where it is appropriate to assume that the region $G$ has a smooth boundary or belongs to another nonparametric class of sets. In particular, this assumption makes sense if discrimination is used as a data analytic tool. Decision rules based on minimization of empirical risk over the whole class of sets and over sieves are considered. Their rates of convergence are obtained. We show that these rules achieve optimal rates for estimation of $G$ and optimal rates of convergence for Bayes risks. An interesting conclusion is that the optimal rates for Bayes risks can be very fast, in particular, faster than the “parametric” root-$n$ rate. These fast rates cannot be guaranteed for plug-in rules.
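As a rough illustration of the empirical risk minimization described in the abstract, the sketch below works in a one-dimensional toy setting that is not taken from the paper. It assumes candidate sets are half-lines $G_t = (-\infty, t]$, that the sieve is a finite grid of thresholds $t$, and that the empirical risk is the equally weighted average of the two sample misclassification fractions. The function names `empirical_risk` and `erm_over_sieve`, the Gaussian samples, and the grid spacing are all hypothetical choices made for this example.

```python
import random


def empirical_risk(threshold, xs, ys):
    """Empirical risk of the candidate set G = (-inf, threshold].

    xs are draws from density f (should fall inside G),
    ys are draws from density g (should fall outside G).
    The equal 1/2 weighting is an assumption of this sketch.
    """
    miss_f = sum(1 for x in xs if x > threshold) / len(xs)
    miss_g = sum(1 for y in ys if y <= threshold) / len(ys)
    return 0.5 * (miss_f + miss_g)


def erm_over_sieve(xs, ys, grid):
    """Return the threshold in the finite sieve `grid` with smallest empirical risk."""
    return min(grid, key=lambda t: empirical_risk(t, xs, ys))


# Toy data (assumed): f = N(-1, 1) and g = N(1, 1), so {f >= g} = (-inf, 0].
rng = random.Random(0)
xs = [rng.gauss(-1.0, 1.0) for _ in range(2000)]
ys = [rng.gauss(1.0, 1.0) for _ in range(2000)]

# Finite sieve: thresholds from -3 to 3 in steps of 0.05.
grid = [-3.0 + 0.05 * k for k in range(121)]

t_hat = erm_over_sieve(xs, ys, grid)
print("estimated boundary:", t_hat)
print("empirical risk:", empirical_risk(t_hat, xs, ys))
```

In this toy model the target set is $G = \{x : x \leq 0\}$, so the selected threshold should land near zero. A plug-in rule would instead estimate $f$ and $g$ separately and compare the two estimates pointwise.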

Article information

Ann. Statist., Volume 27, Number 6 (1999), 1808-1829.

First available in Project Euclid: 4 April 2002


Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties

Keywords: Discrimination analysis; optimal rates; empirical risk; Bayes risk; sieves


Mammen, Enno; Tsybakov, Alexandre B. Smooth discrimination analysis. Ann. Statist. 27 (1999), no. 6, 1808--1829. doi:10.1214/aos/1017939240.


