Enhancing Sentiment Analysis in the Airline Sector: A Deep Learning Approach Using Twitter Data
Abstract
Airlines increasingly rely on Twitter data, whose real-time nature and large user base make it a valuable source of customer feedback, to understand customer perspectives and resolve complaints as quickly as possible. Opinion mining has therefore become a vital tool for airlines in competitive markets: it helps them understand customer opinions and adjust their business strategies accordingly. In this work, tweets are converted to numerical vectors using Word2Vec word embeddings and classified with a deep neural network. The performance of this model is compared with that of several traditional machine learning (ML) classifiers trained on TF-IDF features for sentiment classification: XGBClassifier, LGBMClassifier, ExtraTreeClassifier, AdaBoostClassifier, BernoulliNB, and NearestCentroid. The results show that the proposed framework, which combines a neural network with word embeddings, achieved an accuracy of 0.88 on sentiment classification of Twitter data.
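To make the TF-IDF baseline idea concrete, the following is a minimal sketch in pure Python of TF-IDF vectorization followed by a nearest-centroid classifier, one of the baseline families named in the abstract. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the helper names (`fit_idf`, `tfidf_vector`, `fit_centroids`, `predict`), and the four toy tweets are all assumptions made for illustration, and the output says nothing about the reported accuracy.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(text, idf):
    # Sparse, L2-normalized TF-IDF vector; terms unseen during fitting are ignored.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def fit_centroids(docs, labels, idf):
    # One centroid per class: the mean of that class's TF-IDF vectors.
    sums = {}
    class_sizes = Counter(labels)
    for doc, label in zip(docs, labels):
        acc = sums.setdefault(label, Counter())
        for t, v in tfidf_vector(doc, idf).items():
            acc[t] += v
    return {
        label: {t: v / class_sizes[label] for t, v in acc.items()}
        for label, acc in sums.items()
    }


def predict(text, idf, centroids):
    # Assign the label whose centroid is closest in squared Euclidean distance.
    vec = tfidf_vector(text, idf)

    def sq_dist(centroid):
        terms = set(vec) | set(centroid)
        return sum((vec.get(t, 0.0) - centroid.get(t, 0.0)) ** 2 for t in terms)

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical toy tweets, invented for this sketch (not taken from the dataset).
train_docs = [
    "flight delayed again terrible service",
    "lost my luggage terrible experience",
    "great crew and smooth flight",
    "great service friendly crew",
]
train_labels = ["negative", "negative", "positive", "positive"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)
print(predict("terrible delayed flight", idf, centroids))
print(predict("friendly crew great flight", idf, centroids))
```

In practice, a library vectorizer and classifier would replace these hand-written helpers. The sketch only shows how sparse term weights feed a simple distance-based classifier, in contrast to the dense Word2Vec vectors that the proposed neural network consumes.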
DOI: https://doi.org/10.31449/inf.v50i7.9539
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







