Taiwanese Journal of Mathematics

CLASSIFICATION WITH POLYNOMIAL KERNELS AND $l^1$-COEFFICIENT REGULARIZATION

Hongzhi Tong, Di-Rong Chen, and Fenghong Yang

Full-text: Open access

Abstract

In this paper we investigate a class of learning algorithms for classification generated by regularization schemes with polynomial kernels and an $l^1$-regularizer. The novelty of our analysis lies in the estimation of the hypothesis error. A Bernstein-Kantorovich polynomial is introduced as a regularizing function. Although the hypothesis spaces and the regularizers in these schemes are sample dependent, we prove that the hypothesis error can be removed from the error decomposition with high confidence. As a result, we derive explicit learning rates for the produced classifiers under suitable assumptions.
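To fix ideas, a coefficient-based regularization scheme of the kind the abstract describes (the notation below is illustrative, following the sample-dependent hypothesis space framework of Wu and Zhou cited in the references, not the paper's exact formulation) typically takes the following form:

```latex
% Given a sample z = \{(x_i, y_i)\}_{i=1}^m, a polynomial kernel
% K(x, u) = (1 + x \cdot u)^d, and a convex loss \phi (e.g. the hinge loss),
% the classifier is built from a kernel expansion over the sample points,
% with the empirical risk penalized by the l^1 norm of the coefficients:
f_\alpha(x) = \sum_{i=1}^{m} \alpha_i K(x, x_i), \qquad
\alpha^z = \operatorname*{arg\,min}_{\alpha \in \mathbb{R}^m}
  \left\{ \frac{1}{m} \sum_{i=1}^{m} \phi\bigl(y_i f_\alpha(x_i)\bigr)
    + \lambda \sum_{i=1}^{m} |\alpha_i| \right\},
\qquad f_z = f_{\alpha^z}.
```

Because the expansion centers $\{x_i\}$ and the penalty $\sum_i |\alpha_i|$ both depend on the sample $z$, both the hypothesis space and the regularizer are sample dependent, which is the source of the hypothesis error term the paper analyzes.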

Article information

Source
Taiwanese J. Math., Volume 18, Number 5 (2014), 1633-1651.

Dates
First available in Project Euclid: 10 July 2017

Permanent link to this document
https://projecteuclid.org/euclid.twjm/1499706530

Digital Object Identifier
doi:10.11650/tjm.18.2014.3929

Mathematical Reviews number (MathSciNet)
MR3265081

Zentralblatt MATH identifier
1359.62265

Subjects
Primary: 68T05: Learning and adaptive systems [See also 68Q32, 91E40]; 62J02: General nonlinear regression

Keywords
classification; coefficient regularization; polynomial kernels; Bernstein-Kantorovich polynomial; learning rates

Citation

Tong, Hongzhi; Chen, Di-Rong; Yang, Fenghong. CLASSIFICATION WITH POLYNOMIAL KERNELS AND $l^1$-COEFFICIENT REGULARIZATION. Taiwanese J. Math. 18 (2014), no. 5, 1633--1651. doi:10.11650/tjm.18.2014.3929. https://projecteuclid.org/euclid.twjm/1499706530



References

  • P. L. Bartlett, The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network, IEEE Trans. Inform. Theory, 44 (1998), 525-536.
  • E. J. Candès, J. Romberg and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory, 52 (2006), 489-509.
  • D. R. Chen, Q. Wu, Y. Ying and D. X. Zhou, Support vector machine soft margin classifiers: error analysis, J. Mach. Learning Res., 5 (2004), 1143-1175.
  • F. Cucker and S. Smale, On the mathematical foundations of learning theory, Bull. Amer. Math. Soc., 39 (2001), 1-49.
  • F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, 2007.
  • L. Devroye, L. Györfi and G. Lugosi, A Probabilistic Theory of Pattern Recognition, Springer-Verlag, New York, 1997.
  • Z. Ditzian and V. Totik, Moduli of Smoothness, Springer-Verlag, New York, 1987.
  • K. Jetter, J. Stöckler and J. D. Ward, Error estimates for scattered data interpolation on spheres, Math. Comput., 68 (1999), 733-747.
  • L. Kantorovich, Sur certains développements suivant les polynômes de la forme de S. Bernstein, I, II, C. R. Acad. Sci. URSS, (1930), 563-568, 595-600.
  • I. Steinwart and C. Scovel, Fast rates for support vector machines using Gaussian kernels, Ann. Statist., 35 (2007), 575-607.
  • H. W. Sun and Q. Wu, Least square regression with indefinite kernels and coefficient regularization, Appl. Comput. Harmonic Anal., 30 (2011), 96-109.
  • J. A. K. Suykens and J. Vandewalle, Least squares support vector machine classifiers, Neural Process. Lett., 9 (1999), 293-300.
  • R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B Stat. Methodol., 58 (1996), 267-288.
  • H. Z. Tong, D. R. Chen and F. H. Yang, Support vector machines regression with $l^1$-regularizer, J. Approx. Theory, 164 (2012), 1331-1344.
  • V. Vapnik, Statistical Learning Theory, John Wiley & Sons, New York, 1998.
  • H. Wendland, Local polynomial reproduction and moving least square approximation, IMA J. Numer. Anal., 21 (2001), 285-300.
  • Q. Wu, Y. Ying and D. X. Zhou, Multi-kernel regularized classifiers, J. Complexity, 23 (2007), 108-134.
  • Q. Wu and D. X. Zhou, SVM soft margin classifiers: linear programming versus quadratic programming, Neural Comput., 17 (2005), 1160-1187.
  • Q. Wu and D. X. Zhou, Learning with sample dependent hypothesis spaces, Comput. Math. Appl., 56 (2008), 2896-2907.
  • Q. W. Xiao and D. X. Zhou, Learning by nonsymmetric kernel with data dependent spaces and $l^1$-regularizer, Taiwanese J. Math., 14 (2010), 1821-1836.
  • T. Zhang, Statistical behavior and consistency of classification methods based on convex risk minimization, Ann. Statist., 32 (2004), 56-85.
  • D. X. Zhou and K. Jetter, Approximation with polynomial kernels and SVM classifiers, Adv. Comput. Math., 25 (2006), 323-344.