Omodunbi Bolaji A, Olawade David B, Awe Omosigho F, Soladoye Afeez A, Aderinto Nicholas, Ovsepian Saak V, Boussios Stergios
Department of Computer Engineering, Federal University Oye-Ekiti, Oye-Ekiti 371104, Nigeria.
Department of Allied and Public Health, School of Health, Sport and Bioscience, University of East London, London E16 2RD, UK.
Diagnostics (Basel). 2025 Jun 9;15(12):1467. doi: 10.3390/diagnostics15121467.
Parkinson's disease (PD) is a progressive neurodegenerative condition that impairs motor and non-motor functions. Early and accurate diagnosis is critical for effective management and care. This study aimed to develop a robust machine learning (ML) prediction system for PD using a stacked ensemble learning approach, addressing challenges such as imbalanced datasets and feature optimization. An open-access PD dataset comprising 22 vocal attributes and 195 instances from 31 subjects was used. To prevent data leakage, subjects were divided into training (22 subjects) and testing (9 subjects) groups, ensuring that no subject appeared in both sets. Preprocessing included data cleaning and normalization via min-max scaling. The synthetic minority oversampling technique (SMOTE) was applied exclusively to the training set to address class imbalance. Three feature selection techniques (forward search, gain ratio, and the Kruskal-Wallis test) were applied with subject-wise cross-validation to identify significant attributes. The developed system combined support vector machine (SVM), random forest (RF), K-nearest neighbor (KNN), and decision tree (DT) as base classifiers, with logistic regression (LR) as the meta-classifier in a stacked ensemble learning framework. Performance was evaluated using both recording-wise and subject-wise metrics to ensure clinical relevance. On completely unseen subjects, the stacked ensemble learning model achieved a recording-wise accuracy of 84.7% and a subject-wise accuracy of 77.8%, outperforming the individual classifiers, including KNN (81.4%), RF (79.7%), and SVM (76.3%). Cross-validation within the training set gave 89.2% accuracy; the gap between this figure and the held-out results highlights the importance of proper validation methodology. Feature selection results showed that using the top 10 features ranked by gain ratio provided the best balance between performance and clinical interpretability.
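The pipeline described above (subject-wise split, min-max scaling, oversampling of the training set only, and a stacked ensemble) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic stand-ins for the voice recordings, the class ratio and recordings per subject are assumptions, `smote_oversample` is a simplified hand-written version of SMOTE rather than a library call, all hyperparameters are scikit-learn defaults, and feature selection is omitted. The function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupShuffleSplit
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_voice_data(n_subjects=31, n_features=22, seed=0):
    """Synthetic stand-in data: several recordings per subject (not the real dataset)."""
    rng = np.random.default_rng(seed)
    X, y, groups = [], [], []
    for subject in range(n_subjects):
        label = int(subject < 23)        # assumption: PD (1) is the majority class
        n_rec = rng.integers(6, 8)       # assumption: 6 or 7 recordings per subject
        centre = rng.normal(loc=label, scale=1.0, size=n_features)
        X.append(centre + rng.normal(scale=0.5, size=(n_rec, n_features)))
        y += [label] * n_rec
        groups += [subject] * n_rec
    return np.vstack(X), np.array(y), np.array(groups)


def smote_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority samples and their neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_new = counts.max() - counts.min()
    if n_new == 0:
        return X, y
    minority = classes[np.argmin(counts)]
    X_min = X[y == minority]
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]  # drop self
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


def run_pipeline(seed=0):
    X, y, groups = make_synthetic_voice_data(seed=seed)

    # Subject-wise split: 9 whole subjects are held out, the other 22 train the model.
    splitter = GroupShuffleSplit(n_splits=1, test_size=9, random_state=seed)
    train_idx, test_idx = next(splitter.split(X, y, groups))

    # Fit the scaler on training data only, then oversample the training set only.
    scaler = MinMaxScaler().fit(X[train_idx])
    X_train, X_test = scaler.transform(X[train_idx]), scaler.transform(X[test_idx])
    X_bal, y_bal = smote_oversample(X_train, y[train_idx], seed=seed)

    # Four base classifiers with logistic regression as the meta-classifier.
    stack = StackingClassifier(
        estimators=[
            ("svm", SVC(random_state=seed)),
            ("rf", RandomForestClassifier(random_state=seed)),
            ("knn", KNeighborsClassifier()),
            ("dt", DecisionTreeClassifier(random_state=seed)),
        ],
        final_estimator=LogisticRegression(),
    )
    stack.fit(X_bal, y_bal)

    train_subjects, test_subjects = set(groups[train_idx]), set(groups[test_idx])
    return {
        "n_train_subjects": len(train_subjects),
        "n_test_subjects": len(test_subjects),
        "overlap": len(train_subjects & test_subjects),
        "balanced": len(set(np.bincount(y_bal))) == 1,
        "accuracy": float(stack.score(X_test, y[test_idx])),
    }


if __name__ == "__main__":
    print(run_pipeline())
```

Note that `StackingClassifier` builds its meta-features with an internal cross-validation that is not subject-aware by default, so a faithful reproduction would likely need a group-aware scheme at that step as well.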
The system's methodological robustness was validated through rigorous subject-wise evaluation. By implementing subject-wise validation and preventing data leakage, this study demonstrates that proper validation yields substantially different (and more realistic) results than recording-wise approaches, which allow recordings from the same subject to appear in both training and test data. The findings underscore the critical importance of validation methodology in healthcare ML applications and provide a template for methodologically sound PD classification research. Future research should focus on validating the model with larger, multi-center datasets and implementing standardized validation protocols to enhance clinical applicability.
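The effect of validation methodology can be illustrated with a small synthetic experiment. The sketch below is an assumption-laden toy, not a reproduction of the study: the features encode subject identity but carry no information about the label, so any accuracy above chance under recording-wise cross-validation comes purely from leakage. The majority-vote rule in `subject_level_accuracy` is one plausible way to derive a subject-wise metric; the abstract does not state how the authors aggregated recordings. All function names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, KFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier


def make_leaky_data(n_subjects=30, n_rec=6, n_features=10, seed=0):
    """Recordings cluster tightly by subject; labels are unrelated to the features."""
    rng = np.random.default_rng(seed)
    subject_labels = rng.permutation(np.arange(n_subjects) % 2)
    centres = rng.normal(size=(n_subjects, n_features))
    groups = np.repeat(np.arange(n_subjects), n_rec)
    X = centres[groups] + rng.normal(scale=0.1, size=(len(groups), n_features))
    y = subject_labels[groups]
    return X, y, groups


def subject_level_accuracy(y_true, y_pred, groups):
    """Majority vote over each subject's recordings (ties go to class 1)."""
    correct = []
    for subject in np.unique(groups):
        mask = groups == subject
        vote = int(y_pred[mask].mean() >= 0.5)
        correct.append(vote == y_true[mask][0])
    return float(np.mean(correct))


def compare_validation(seed=0):
    X, y, groups = make_leaky_data(seed=seed)
    model = KNeighborsClassifier(n_neighbors=1)

    # Recording-wise CV: a subject's recordings can land in both train and test folds.
    rec_cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    rec_pred = cross_val_predict(model, X, y, cv=rec_cv)

    # Subject-wise CV: all recordings of a subject stay in the same fold.
    subj_pred = cross_val_predict(model, X, y, groups=groups, cv=GroupKFold(n_splits=5))

    return {
        "recording_wise_cv_accuracy": float(np.mean(rec_pred == y)),
        "subject_wise_cv_accuracy": float(np.mean(subj_pred == y)),
        "subject_level_accuracy": subject_level_accuracy(y, subj_pred, groups),
    }


if __name__ == "__main__":
    print(compare_validation())
```

In this toy setting the recording-wise estimate is close to perfect because the nearest neighbour of a test recording is almost always another recording from the same person, while the subject-wise estimate hovers around chance.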