Abstract and Applied Analysis

Cost-Sensitive Support Vector Machine Using Randomized Dual Coordinate Descent Method for Big Class-Imbalanced Data Classification

Mingzhu Tang, Chunhua Yang, Kang Zhang, and Qiyue Xie

Full-text: Open access

Abstract

Cost-sensitive support vector machines are among the most popular tools for class-imbalanced problems such as fault diagnosis. However, such data often comprise a huge number of examples as well as features. To address class-imbalanced problems on big data, this paper proposes a cost-sensitive support vector machine using a randomized dual coordinate descent method (CSVM-RDCD). The solution of the subproblem at each iteration is derived in closed form, and the computational cost is reduced through an accelerating strategy and cheap computation. Four constraint conditions of CSVM-RDCD are derived. Experimental results show that the proposed method increases the recognition rate of the positive class and reduces the average misclassification cost on real big class-imbalanced data.
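
The abstract does not give the formulation itself, so the following is only a minimal sketch of the general idea, not the authors' CSVM-RDCD. It assumes a linear L1-loss SVM dual with box constraints 0 <= alpha_i <= C_i, where the upper bound C_i depends on the class (a higher cost for the positive class), in the style of the dual coordinate descent method of Hsieh et al. The names `csvm_rdcd_sketch`, `predict`, `c_pos`, and `c_neg` are hypothetical, and the bias is handled by appending a constant feature to each example.

```python
import random


def csvm_rdcd_sketch(X, y, c_pos, c_neg, epochs=200, seed=0):
    """Randomized dual coordinate descent for a linear L1-loss SVM with
    class-dependent upper bounds (an assumed formulation, for illustration)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n
    w = [0.0] * d
    # Diagonal of Q: Q_ii = x_i . x_i, since labels are +1/-1.
    q_diag = [sum(v * v for v in x) for x in X]
    # Cost sensitivity enters only through the per-example box constraint.
    upper = [c_pos if label > 0 else c_neg for label in y]
    for _ in range(epochs * n):
        i = rng.randrange(n)  # pick one dual coordinate uniformly at random
        if q_diag[i] == 0.0:
            continue
        # Gradient of the dual objective in alpha_i, using the maintained w.
        grad = y[i] * sum(wj * xj for wj, xj in zip(w, X[i])) - 1.0
        # Closed-form minimizer of the one-variable subproblem, clipped to [0, C_i].
        new_alpha = min(max(alpha[i] - grad / q_diag[i], 0.0), upper[i])
        delta = new_alpha - alpha[i]
        if delta != 0.0:
            alpha[i] = new_alpha
            # Cheap O(d) update of w instead of recomputing it from all alphas.
            for j in range(d):
                w[j] += delta * y[i] * X[i][j]
    return w, alpha


def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Toy imbalanced data: 2 positives, 4 negatives; the last feature is a constant bias term.
X = [[2.0, 1.0], [3.0, 1.0], [-1.0, 1.0], [-2.0, 1.0], [-3.0, 1.0], [-4.0, 1.0]]
y = [1, 1, -1, -1, -1, -1]
w, alpha = csvm_rdcd_sketch(X, y, c_pos=10.0, c_neg=1.0)
print([predict(w, x) for x in X])
```

In this sketch each step touches a single example, so the per-iteration cost is proportional to the number of features. That is the sense in which a closed-form coordinate update with a maintained weight vector can be called cheap.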

Article information

Source
Abstr. Appl. Anal., Volume 2014, Special Issue (2014), Article ID 416591, 9 pages.

Dates
First available in Project Euclid: 6 October 2014

Permanent link to this document
https://projecteuclid.org/euclid.aaa/1412606361

Digital Object Identifier
doi:10.1155/2014/416591

Mathematical Reviews number (MathSciNet)
MR3230521

Zentralblatt MATH identifier
07022349

Citation

Tang, Mingzhu; Yang, Chunhua; Zhang, Kang; Xie, Qiyue. Cost-Sensitive Support Vector Machine Using Randomized Dual Coordinate Descent Method for Big Class-Imbalanced Data Classification. Abstr. Appl. Anal. 2014, Special Issue (2014), Article ID 416591, 9 pages. doi:10.1155/2014/416591. https://projecteuclid.org/euclid.aaa/1412606361

