Hybrid preprocessing and ensemble classification for enhanced detection of Parkinson's disease using multiple speech signal databases.

Author information

Sun Ling-Chun, Tseng Chun-Wei, Lin Ke-Feng, Chen Ping-Nan

Affiliations

School of Medicine, National Defense Medical Center, Taipei, Taiwan.

School of Public Health, National Defense Medical Center, Taipei, Taiwan.

Publication information

Digit Health. 2025 Jun 26;11:20552076251352941. doi: 10.1177/20552076251352941. eCollection 2025 Jan-Dec.

DOI:10.1177/20552076251352941

PMID:40585053

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12202955/

Abstract

OBJECTIVE

With the increasing prevalence of Parkinson's disease (PD) and the development of PD-based acoustic recording databases, this study aims to evaluate the feasibility of using an ensemble-based machine learning (ML) approach to detect PD across diverse acoustic datasets.

METHODS

We used three publicly available PD speech datasets, MIU (Sakar), UEX (Carrón), and UCI (Little), to build ML models incorporating a hybrid preprocessing framework. This framework includes a scaling phase (using RobustScaler), a sampling phase (employing random oversampling (ROS), synthetic minority oversampling technique (SMOTE), and random undersampling (RUS)), and an ML classifier selection phase (featuring eXtreme gradient boosting (XGBoost) and adaptive boosting (AdaBoost)). Performance was evaluated using accuracy, precision, recall, and F1-score metrics. Additionally, we conducted SHAP (SHapley Additive exPlanations) analysis to identify the most significant PD-related acoustic features.
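As a rough illustration of the first two phases, the sketch below uses only the Python standard library. It applies the default RobustScaler transformation (subtract the median, divide by the interquartile range) to one feature column and then balances the classes by random oversampling. It is not the authors' code: the function names `robust_scale` and `random_oversample`, the toy values, and the seed are assumptions made for this example, and SMOTE, RUS, and the XGBoost/AdaBoost classifiers are not shown.

```python
import random
import statistics
from collections import Counter


def robust_scale(column):
    """Center a feature on its median and divide by its interquartile range."""
    median = statistics.median(column)
    # Inclusive quartiles use linear interpolation between sorted data points.
    q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # avoid division by zero for constant columns
    return [(x - median) / iqr for x in column]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy feature column with one outlier (not study data).
feature = [0.2, 0.3, 0.4, 0.5, 9.0]
scaled = robust_scale(feature)

# Hypothetical imbalanced labels: 1 = PD, 0 = healthy control.
rows = [[v] for v in scaled]
labels = [1, 1, 1, 1, 0]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Because the median and interquartile range are barely affected by the single outlier, the other values keep a comparable scale after the transformation, which is the usual motivation for robust scaling. In practice, resampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.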

RESULTS

The optimal combination of preprocessing and classification techniques varied across datasets. However, the highest classification performance was generally achieved using RobustScaler for scaling, a combination of ROS, SMOTE, and RUS for sampling, and XGBoost or AdaBoost for classification. The best-performing model on the MIU dataset achieved an accuracy of 97.37%, a precision of 96.07%, and an F1-score of 96.57%. The best-performing models on the UEX and UCI datasets achieved perfect classification, with 100% accuracy, precision, and recall. SHAP analysis revealed that Mel-frequency cepstral coefficients were consistently among the most influential PD-related acoustic features.
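For readers less familiar with these metrics, the following minimal sketch shows how accuracy, precision, recall, and F1-score follow from binary confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: share of predicted-positive cases that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly positive cases that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (not study data): 45 true positives, 5 false positives,
# 3 false negatives, 47 true negatives.
metrics = classification_metrics(tp=45, fp=5, fn=3, tn=47)
```

Precision and recall can diverge on imbalanced data, which is why the F1-score is commonly reported alongside accuracy.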

CONCLUSIONS

Our findings confirm the feasibility of an ensemble-based approach for PD detection using acoustic recordings, highlighting the importance of dataset-specific preprocessing strategies. This study ranks impactful PD-related acoustic features, offering guidance for future voice-based PD screening tools.
