Transformer-Based Fake News Classification: Evaluation of DistilBERT With CNN-LSTM and GloVe Embedding
Abstract
Social media networks have transformed communication in recent years, but they have also introduced challenges such as the dissemination of fake news and misinformation. NLP and machine-learning algorithms attempt to meet these challenges by structuring online information, although dataset bias remains a critical concern. Sentiment analysis (SA) has helped people gain insight into the context of news dissemination. Still, sham news, often spread by fake accounts, poses a serious hazard not only to users but also to the stability of society. Several researchers have tried to assess the credibility of information and reduce the flow of sham data. In this work, Kaggle datasets of 17,903 fake news articles and 20,826 real news articles were used. Preprocessing steps included text normalization and the removal of punctuation, links, usernames, and non-alphabetic characters to prepare the data for analysis. This study explored the categorization of fake and real news using advanced NLP techniques, transformer-based architectures, and deep learning models. Classification accuracy was improved and dataset bias addressed through models such as DistilBERT, CNN, and LSTM. DistilBERT demonstrated remarkable performance, achieving an accuracy of 99.65%, with precision, recall, F1-score, and ROC-AUC values of 0.992188, 1, 0.996078, and 0.996894, respectively, outperforming the other models. The study's novelty lies in its detailed evaluation of DistilBERT, which showed significant improvements in accuracy, recall, and AUC while mitigating dataset bias. The results highlight the potential of DistilBERT for robust and reliable fake news classification, addressing critical limitations in existing approaches.
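The preprocessing steps summarized above (text normalization plus removal of punctuation, links, usernames, and non-alphabetic characters) could be sketched as below; the paper does not give its exact pipeline, so the `clean_text` helper and the example string are hypothetical illustrations. The snippet also checks that the reported precision and recall are consistent with the reported F1-score via F1 = 2PR / (P + R).

```python
import re

def clean_text(text: str) -> str:
    """Hypothetical sketch of the preprocessing described in the abstract."""
    text = text.lower()                                 # normalize case
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)                   # remove usernames
    text = re.sub(r"[^a-z\s]", " ", text)               # drop punctuation and non-alphabetic chars
    return re.sub(r"\s+", " ", text).strip()            # collapse whitespace

# Consistency check of the reported DistilBERT metrics: F1 = 2PR / (P + R)
precision, recall = 0.992188, 1.0
f1 = 2 * precision * recall / (precision + recall)

print(clean_text("BREAKING: read this!! https://t.co/x via @user123"))
print(f1)  # ≈ 0.996078, agreeing with the reported F1-score
```

The harmonic-mean check confirms that the reported precision (0.992188) and perfect recall (1) do reproduce the reported F1-score of 0.996078 to six decimal places.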
Full Text: PDF
DOI: https://doi.org/10.31449/inf.v49i25.7710

This work is licensed under a Creative Commons Attribution 3.0 License.