An Illustration of Rheumatoid Arthritis Disease Using Decision Tree Algorithm

Uma Ramasamy, Santhoshkumar Sundar


Data mining draws on several branches of computer science and analytics. It focuses on mining data from repositories of datasets to identify patterns, discover knowledge, and predict probable outcomes. Rheumatoid Arthritis (RA) is an inflammatory disease associated with specific autoantibodies and the destruction of synovial joints. Most RA patients experience severe pain in the joints of the hands, legs, hip, spine, and shoulder. ID3 is a widely used algorithm for constructing a decision tree. It uses information gain to identify the dominant attributes in the dataset and builds the tree from them. C4.5 is another decision tree algorithm. It is the successor of ID3 and can handle datasets that contain numerical attributes. C4.5 selects the attributes for the tree using the gain ratio. The decision tree is a classification technique and a well-known method suited to medical diagnosis. Decision tree algorithms such as ID3 and C4.5 are popular, efficient classifiers for predicting RA from an RA dataset. The objective of this paper is to identify the prominent features for predicting RA from the RA dataset.
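The two attribute-selection measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy records and the attribute names (`rf_positive`, `morning_stiffness`, `ra`) are hypothetical and are not taken from the paper's RA dataset. Information gain is computed as the entropy of the class label minus the weighted entropy after splitting on an attribute (the ID3 criterion), and the gain ratio divides that gain by the entropy of the attribute's own value distribution (the C4.5 criterion).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def partition(rows, attribute):
    """Group rows by the value they take on the given attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row)
    return groups


def information_gain(rows, attribute, target):
    """ID3 criterion: class entropy minus weighted entropy after the split."""
    base = entropy([row[target] for row in rows])
    remainder = sum(
        (len(group) / len(rows)) * entropy([row[target] for row in group])
        for group in partition(rows, attribute).values()
    )
    return base - remainder


def gain_ratio(rows, attribute, target):
    """C4.5 criterion: information gain divided by the split information."""
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain(rows, attribute, target) / split_info


# Hypothetical toy records; attribute names and values are illustrative only.
records = [
    {"rf_positive": "yes", "morning_stiffness": "yes", "ra": "yes"},
    {"rf_positive": "yes", "morning_stiffness": "no", "ra": "yes"},
    {"rf_positive": "no", "morning_stiffness": "yes", "ra": "no"},
    {"rf_positive": "no", "morning_stiffness": "no", "ra": "no"},
]
attributes = ["rf_positive", "morning_stiffness"]

# The attribute with the highest score becomes the root of the tree.
best_by_gain = max(attributes, key=lambda a: information_gain(records, a, "ra"))
best_by_ratio = max(attributes, key=lambda a: gain_ratio(records, a, "ra"))
print(best_by_gain, best_by_ratio)
```

In this toy data the first attribute separates the classes perfectly while the second carries no information, so both criteria choose the same root. The two measures tend to differ when an attribute has many distinct values, because the split information in the denominator penalizes such attributes.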



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.