Journal of Applied Mathematics

An Interior Point Method for ${L}_{1/2}$-SVM and Application to Feature Selection in Classification

Lan Yao, Xiongji Zhang, Dong-Hui Li, Feng Zeng, and Haowen Chen

Abstract

This paper studies feature selection for the support vector machine (SVM). Using the ${L}_{1/2}$ regularization technique, we propose a new model, the ${L}_{1/2}$-SVM. To solve this nonconvex and non-Lipschitz optimization problem, we first transform it into an equivalent quadratically constrained optimization model with a linear objective function and then develop an interior point algorithm. We establish the convergence of the proposed algorithm. Our experiments with artificial and real data demonstrate that the ${L}_{1/2}$-SVM model works well and that the proposed algorithm is more effective than some popular methods at selecting relevant features and improving classification performance.
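
For concreteness, the following is a minimal sketch of the kind of objective the abstract describes, assuming the standard hinge-loss SVM with an ${L}_{1/2}$ penalty on the weight vector. The function name, the parameter lam, and the data shapes are illustrative assumptions; the paper's exact model, its reformulation, and the interior point iteration are not reproduced here.

    import numpy as np

    def l_half_svm_objective(w, b, X, y, lam):
        """Hinge loss plus an L_{1/2} penalty: a plausible (assumed) form
        of the L_{1/2}-SVM objective sketched in the abstract, not the
        paper's exact formulation.

        X: (n_samples, n_features) data matrix; y: labels in {-1, +1};
        lam > 0 trades data fit against sparsity of w.
        """
        margins = y * (X @ w + b)                      # signed classification margins
        hinge = np.maximum(0.0, 1.0 - margins).sum()   # hinge loss over the samples
        penalty = np.sqrt(np.abs(w)).sum()             # sum_j |w_j|^{1/2}; nonconvex
                                                       # and non-Lipschitz at w_j = 0
        return hinge + lam * penalty

The penalty term is what makes the problem nonconvex and non-Lipschitz at zero, which is why a direct gradient-based solver is unsuitable and the abstract's reformulation into a quadratically constrained problem with a linear objective is needed before an interior point method can be applied.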

Article information

Source
J. Appl. Math. Volume 2014 (2014), Article ID 942520, 16 pages.

Dates
First available in Project Euclid: 2 March 2015

Permanent link to this document
https://projecteuclid.org/euclid.jam/1425305675

Digital Object Identifier
doi:10.1155/2014/942520

Mathematical Reviews number (MathSciNet)
MR3198418

Citation

Yao, Lan; Zhang, Xiongji; Li, Dong-Hui; Zeng, Feng; Chen, Haowen. An Interior Point Method for ${L}_{1/2}$-SVM and Application to Feature Selection in Classification. J. Appl. Math. 2014 (2014), Article ID 942520, 16 pages. doi:10.1155/2014/942520. https://projecteuclid.org/euclid.jam/1425305675

