Enhanced Hate Speech Detection in Indonesian-English Code-Mixed Texts Using XLM-RoBERTa
Abstract
DOI: https://doi.org/10.31449/inf.v49i21.7713
