LP SVM with a Novel Similarity Function Outperforms Powerful LP-QP-Kernel-SVM Considering Efficient Classification
Abstract

While the core strength of the SVM comes from its ability to reach the global optimum, its classification performance also depends on the kernel computation. However, while this kernel complexity generates the power of the machine, it is also responsible for the computational load of evaluating the kernel. Moreover, insisting that a similarity function be a positive definite kernel requires properties that can seem unproductive, raising the question of which similarity measures should be used in a classifier. We model Vapnik's LPSVM, proposing a new similarity function that replaces the kernel function. Following the strategy of "accuracy first, speed second", we have modelled a similarity function that is mathematically well defined, grounded in both analysis and geometry, and complex enough to train the machine to a solid generalization ability. Consistent with the theory of learning with similarity functions of Balcan and Blum [1], our similarity function does not need to be a valid kernel function, and it demands less computational cost than counterparts such as the RBF or other kernels, while its optimal complexity provides sufficient power to the classifier. Benchmarking shows that our similarity-function-based LPSVM attains a test error 0.86 times that of the most powerful RBF-based QP SVM while demanding only 0.40 times its computational cost.
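Since the abstract summarizes the approach without pseudocode, the following is a minimal, hypothetical sketch of the general recipe it describes: a Vapnik-style 1-norm (LP) SVM whose kernel slot is filled by a plain similarity function that need not be positive definite. The similarity function shown (`1/(1 + ||x - z||)`) and all parameter choices are illustrative assumptions, not the function proposed in the paper; the linear program is solved with SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

def similarity(x, z):
    # Illustrative stand-in similarity; NOT the paper's proposed function.
    # It need not be a positive definite kernel: the LP below never requires
    # the similarity matrix to be PSD, unlike the standard QP dual.
    return 1.0 / (1.0 + np.linalg.norm(x - z))

def train_lp_svm(X, y, C=1.0):
    """1-norm (LP) SVM:  min  sum_j a_j + C * sum_i xi_i
    s.t.  y_i (sum_j a_j y_j S(x_i, x_j) + b) >= 1 - xi_i,  a >= 0, xi >= 0."""
    n = len(y)
    S = np.array([[similarity(xi, xj) for xj in X] for xi in X])
    # Decision variables: [a_1..a_n, xi_1..xi_n, b_plus, b_minus], all >= 0,
    # with the free bias written as b = b_plus - b_minus.
    c = np.concatenate([np.ones(n), C * np.ones(n), [0.0, 0.0]])
    A_ub = np.zeros((n, 2 * n + 2))
    for i in range(n):
        A_ub[i, :n] = -y[i] * y * S[i]   # -y_i y_j S_ij  (coefficients of a_j)
        A_ub[i, n + i] = -1.0            # -xi_i
        A_ub[i, 2 * n] = -y[i]           # -y_i b_plus
        A_ub[i, 2 * n + 1] = y[i]        # +y_i b_minus
    b_ub = -np.ones(n)                   # margin constraints written as <= -1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * n + 2), method="highs")
    a = res.x[:n]
    b = res.x[2 * n] - res.x[2 * n + 1]
    return a, b

def predict(X_train, y_train, a, b, x):
    s = sum(aj * yj * similarity(x, xj)
            for aj, yj, xj in zip(a, y_train, X_train))
    return np.sign(s + b)
```

The 1-norm objective tends to drive most coefficients a_j to zero, so only a few training points need similarity evaluations at test time; that sparsity is the general source of the classification-cost advantage the abstract reports.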
References

[1] M.-F. Balcan, A. Blum, and N. Srebro, "A theory of learning with similarity functions," Machine Learning, vol. 72, no. 1, pp. 89–112, 2008.
[2] N. Cristianini, J. Shawe-Taylor et al., An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, 2000.
[3] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Adaptive Computation and Machine Learning Series, MIT Press, 2002.
[4] V. Vapnik, Statistical Learning Theory. Wiley, 1998.
[5] R. Karim, M. Bergtholdt, J. H. Kappes, and C. Schnörr, "Greedy-based design of sparse two-stage SVMs for fast classification," in Proc. of the 29th DAGM Symposium on Pattern Recognition, 2007, pp. 395–404.
[6] R. Karim and A. K. Kundu, "Efficiency and performance analysis of a sparse and powerful second order SVM based on LP and QP," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 9, no. 2, pp. 311–318, 2018.
[7] ——, "Computational analysis to reduce classification cost keeping high accuracy of the sparser LPSVM," International Journal of Machine Learning and Computing, vol. 9, no. 6, pp. 728–733, 2019.
[8] T. Joachims and C. N. J. Yu, "Sparse kernel SVMs via cutting-plane training," in Proc. of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML), 2009, p. 8.
[9] S. S. Keerthi, O. Chapelle, and D. DeCoste, "Building support vector machines with reduced classifier complexity," Journal of Machine Learning Research (JMLR), vol. 7, no. Jul, pp. 1493–1515, 2006.
[10] C. J. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. –167, 1998.
[11] Y. Ying, C. Campbell, and M. Girolami, "Analysis of SVM with indefinite kernels," Advances in Neural Information Processing Systems, vol. 22, pp. 2205–2213, 2009.
[12] J. Chen and J. Ye, "Training SVM with indefinite kernels," in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 136–143.
[13] T. Graepel, R. Herbrich, P. Bollmann-Sdorra, and K. Obermayer, "Classification on pairwise proximity data," Advances in Neural Information Processing Systems, pp. –444, 1999.
[14] B. Haasdonk, "Feature space interpretation of SVMs with indefinite kernels," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 482–492, 2005.
[15] H.-T. Lin and C.-J. Lin, "A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods," submitted to Neural Computation, vol. 3, no. 1-32, p. 16.
[16] R. Luss and A. d'Aspremont, "Support vector machine classification with indefinite kernels," Mathematical Programming Computation, vol. 1, no. 2, pp. 97–118, 2009.
[17] C. S. Ong, X. Mary, S. Canu, and A. J. Smola, "Learning with non-positive kernels," in Proceedings of the Twenty-First International Conference on Machine Learning, 2004, p. 81.
[18] E. Pekalska, P. Paclik, and R. P. Duin, "A generalized kernel approach to dissimilarity-based classification," Journal of Machine Learning Research, vol. 2, no. Dec, pp. 175–, 2001.
[19] G. Wu, E. Y. Chang, and Z. Zhang, "An analysis of transformation on non-positive semidefinite similarity matrix for kernel machines," in Proceedings of the 22nd International Conference on Machine Learning, vol. 8. Citeseer, 2005.
[20] I. Alabdulmohsin, X. Gao, and X. Z. Zhang, "Support vector machines with indefinite kernels," in Asian Conference on Machine Learning. PMLR, 2015, pp. 32–47.
[21] L. Wang, C. Yang, and J. Feng, "On learning with dissimilarity functions," in Proceedings of the 24th International Conference on Machine Learning, 2007, pp. 991–998.
[22] M.-F. Balcan, A. Blum, and N. Srebro, "Improved guarantees for learning via similarity functions," 2008.
[23] A. Nefedov, J. Ye, C. Kulikowski, I. Muchnik, and K. Morgan, "Experimental study of support vector machines based on linear and quadratic optimization criteria," DIMACS Technical Report 2009-18, 2009.
[24] J. Kools. (2013) 6 functions for generating artificial datasets. https://www.mathworks.com/matlabcentral/fileexchange/-6-functions-for-generating-artificial-datasets. [Online; accessed 22-September-2020].
[25] H. A. Abu Alfeilat, A. B. Hassanat, O. Lasassmeh, A. S. Tarawneh, M. B. Alhasanat, H. S. Eyal Salman, and V. S. Prasath, "Effects of distance measure choice on k-nearest neighbor classifier performance: a review," Big Data, vol. 7, no. 4, pp. 221–, 2019.
[26] G. Rätsch. Benchmark data sets. [Online]. Available: http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm
[27] T. Diethe, "13 benchmark datasets derived from the UCI, DELVE and STATLOG repositories," https://github.com/tdiethe/gunnar_raetsch_benchmark_datasets/.
[28] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[29] A. S. Shirkhorshidi, S. Aghabozorgi, and T. Y. Wah, "A comparison study on similarity and dissimilarity measures in clustering continuous data," PLoS ONE, vol. 10, no. 12, p. e0144059, 2015.
[30] V. Prasath, H. A. A. Alfeilat, A. Hassanat, O. Lasassmeh, A. S. Tarawneh, M. B. Alhasanat, and H. S. E. Salman, "Distance and similarity measures effect on the performance of k-nearest neighbor classifier – a review," arXiv preprint arXiv:1708.04321, 2017.
[31] D. R. Wilson and T. R. Martinez, "Improved heterogeneous distance functions," Journal of Artificial Intelligence Research, vol. 6, pp. –34, 1997.
[32] C. C. Aggarwal, A. Hinneburg, and D. A. Keim, "On the surprising behavior of distance metrics in high dimensional space," in International Conference on Database Theory. Springer, 2001, pp. 420–434.
DOI: https://doi.org/10.31449/inf.v47i8.4767
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.