Crime Prediction Using Twitter Sentiments and Crime Data

Gbadegesin Adetayo Taiwo, Muhamad Saraee, Jimoh Fatai


Crime is a growing concern worldwide, and offenders regularly change their tactics. Crime affects individuals, groups, and governments to the extent that substantial budgets are allocated to preventive measures. This research aims to predict crime from hourly Twitter sentiment and crime data records. Existing crime prediction models that use Twitter data have been observed to fall short in predicting criminal incidents because they lack hourly sentiment polarity and demographic factors. The XGBoost algorithm was tuned to obtain an optimal model, and the SHAP framework was used for interpretability, ranking the features by their importance. The model achieved an accuracy of 0.81 (81%) and an Area Under the Receiver Operating Characteristic Curve (ROC AUC) score of 0.7079. The results indicate that, in contrast to earlier studies on this subject, crime could be predicted in real time. It is therefore recommended that this work be applied to real-world situations.
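The abstract reports both an accuracy and a ROC AUC score, and the two measure different things: accuracy depends on a decision threshold, while ROC AUC measures how well the scores rank positive cases above negative ones. The following is a minimal sketch of how each metric can be computed, not the authors' evaluation code. The function names, the 0.5 threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (an assumption, not taken from the study):
# label 1 = a crime occurred in that hour, label 0 = no crime.
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
toy_scores = [0.10, 0.20, 0.15, 0.30, 0.05, 0.40, 0.35, 0.45, 0.25, 0.60]

toy_accuracy = accuracy(toy_labels, toy_scores)  # 9 of 10 correct
toy_auc = roc_auc(toy_labels, toy_scores)        # 12 of 16 pairs ranked correctly
```

On this toy data the accuracy is 0.9 while the ROC AUC is 0.75, which illustrates that a model's accuracy can sit above its ROC AUC when one class is much more common than the other. Whether that applies to the reported figures depends on the class balance of the study's data, which the abstract does not state.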

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.