Enhanced Phishing Website Categorization Using Random Forest with Sea Horse and Jellyfish Search Optimization

Yin Chen; Yuan Ma; Juan Zhao; Yongqiang Zhang

doi:10.31449/inf.v49i10.8089

About The Authors

Yin Chen

Yuan Ma
Xi'an Siyuan University, Xi'an, Shaanxi 710038, China

Juan Zhao
Xi'an Siyuan University, Xi'an, Shaanxi 710038, China

Yongqiang Zhang
Xi'an Siyuan University, Xi'an, Shaanxi 710038, China

Abstract

In contemporary society, many global activities, ranging from financial transactions to information transfers, are conducted over the Internet through dedicated websites and applications. The prevalence of online platforms has, unfortunately, been accompanied by a proliferation of fake websites designed to harvest sensitive data such as bank card information and personal details. This paper addresses that cybersecurity problem by using machine learning to categorize a set of 1353 websites into three classes: phishing, suspicious, and legitimate URLs. The dataset was gathered from published papers and split 70-30 into training and testing sets. The base model is a Random Forest Classifier (RFC), whose performance is improved with two optimization schemes, the Sea Horse Optimizer (SHO) and the Jellyfish Search Optimization Algorithm (JSOA); the optimized models are denoted RFSH and RFJS, respectively. After extensive training and testing, the three models were compared on this dataset to identify the best one. RFSH gave the best predictions, achieving a score of 0.952 over all the data, ahead of RFJS with a precision of 0.932 and the standalone RFC with an accuracy of 0.9106. RFSH therefore emerged as the best-predicting model, and categorization of this kind can help keep users' banking and personal data safer online.
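
To make the evaluation setup in the abstract concrete, the following is a minimal sketch of a 70-30 split of 1353 samples and of the two metrics the abstract names (accuracy and precision) for a three-class problem. It is not the authors' code. The function names, the random shuffle, the seed, the use of macro-averaged precision, and the toy label lists are all assumptions made for illustration; the abstract does not say how the split was drawn or how precision was averaged.

```python
import random

# The three classes named in the abstract.
LABELS = ("phishing", "suspicious", "legitimate")


def split_70_30(n_samples, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train/test lists.

    Assumption: a plain random split; the paper may have split differently.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class precision.

    Assumption: macro averaging; a class that is never predicted scores 0.
    """
    per_class = []
    for label in labels:
        predicted = sum(p == label for p in y_pred)
        true_pos = sum(t == label and p == label
                       for t, p in zip(y_true, y_pred))
        per_class.append(true_pos / predicted if predicted else 0.0)
    return sum(per_class) / len(labels)


# 1353 websites split 70-30, as stated in the abstract.
train_idx, test_idx = split_70_30(1353)
print(len(train_idx), len(test_idx))  # 947 406

# Hypothetical toy labels, only to exercise the metric functions.
y_true = ["phishing", "phishing", "suspicious",
          "legitimate", "legitimate", "legitimate"]
y_pred = ["phishing", "suspicious", "suspicious",
          "legitimate", "legitimate", "phishing"]
print(accuracy(y_true, y_pred), macro_precision(y_true, y_pred))
```

With 1353 samples, a 70-30 split leaves 947 samples for training and 406 for testing. Accuracy and precision measure different things, so the three figures quoted in the abstract are comparable only if they refer to the same metric on the same data.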

This work is licensed under a Creative Commons Attribution 3.0 License.

Informatica is financially supported by the Slovenian Research Agency through the Call for co-financing of scientific periodical publications.
