Parkinson’s Disease Classification Using SHAP, LIME, and Fisher Score with XGBoost and White-Box Machine Learning Models: An Explainable AI Perspective
Abstract
Parkinson’s disease (PD) is a chronic neurological disorder that progressively impairs motor functions, often characterized by tremors, rigidity, and bradykinesia. Early diagnosis plays a vital role in improving patient outcomes. This study applies machine learning (ML) classifiers for the accurate detection of PD using a publicly available dataset comprising 195 voice recordings with 24 real-valued speech attributes. Preprocessing involved normalization and labeling, with the status attribute encoding healthy (0) and PD (1) subjects. The models were evaluated using 10-fold cross-validation to obtain robust performance estimates. The evaluated classifiers span interpretable white-box models and complex black-box models. Explainable Artificial Intelligence (XAI) techniques, including Fisher Score, SHAP, and LIME, were employed to interpret and rank feature importance, enhancing model transparency. Among all classifiers, XGBoost achieved the best results, with an accuracy of 94.87%, an F1-score of 96.97%, and a ROC-AUC of 0.91. The findings highlight that integrating XAI methods with ML not only yields high classification accuracy but also provides interpretable insights essential for clinical decision support in PD diagnosis.
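Of the three ranking techniques named in the abstract, the Fisher Score is the simplest to state: for each feature, it is the ratio of between-class scatter to within-class scatter, so features whose class means lie far apart relative to their spread rank highest. The abstract does not give the exact formulation used in the study, so the following is only a minimal sketch of one common definition. The function name `fisher_scores` and the toy data are hypothetical and are not taken from the paper or its dataset.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column.

    Assumed formulation (not confirmed by the source):
        score_j = sum_k n_k * (mean_jk - mean_j)^2 / sum_k n_k * var_jk
    where k runs over classes, n_k is the class size, and var_jk is the
    population variance of feature j within class k.
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            n_c = len(values)
            between += n_c * (fmean(values) - overall_mean) ** 2
            within += n_c * pvariance(values)
        # A feature that is constant within every class has zero within-class scatter.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


# Toy data, made up for illustration: feature 0 separates the two
# classes, feature 1 has identical class means and carries no signal.
X = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.0], [1.0, 1.0]]
y = [0, 0, 1, 1]  # 0 = healthy, 1 = PD, mirroring the status encoding

scores = fisher_scores(X, y)
# Feature indices ordered from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Unlike SHAP and LIME, this score is computed from the data alone and does not depend on any trained classifier, which is why it is usually treated as a filter-style ranking rather than a model explanation.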
DOI: https://doi.org/10.31449/inf.v50i7.9607
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







