Neto Osmar Pinto
Center of Innovation, Technology and Education (CITE) at Anhembi Morumbi University - Anima Institute, São José dos Campos, São Paulo, Brazil; Arena235 Research Lab, São José dos Campos, São Paulo, Brazil.
J Voice. 2024 May 12. doi: 10.1016/j.jvoice.2024.04.020.
This study evaluates the efficacy of voice analysis combined with machine learning (ML) techniques in enabling the diagnosis of Parkinson's disease (PD).
Voice data (phonation of the vowel "a") from three distinct datasets (two from the University of California Irvine ML Repository and one from figshare), covering 432 participants (278 PD patients), were analyzed. We employed four ML models (Artificial Neural Networks, Random Forest, Gradient Boosting [GB], and Support Vector Machine [SVM]) alongside two ensemble methods: a soft voting classifier (Ensemble Voting Classifier) and a stacking method (Ensemble Stacking Model [ESM]). The models underwent 50 iterations of evaluation, involving various data splits and 10-fold cross-validation. Comparative analysis was done using one-way Analysis of Variance followed by Bonferroni post hoc corrections.
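As a rough illustration of the model lineup described above, the following sketch builds the four base classifiers plus a soft voting ensemble and a stacking ensemble with scikit-learn, and scores them with stratified k-fold cross-validation. It is not the study's code: the hyperparameters, the logistic-regression meta-learner, the feature scaling, and the helper names `build_models` and `evaluate` are all assumptions made for illustration.

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(seed=0):
    """Return the four base models and two ensembles (hypothetical settings)."""
    base = [
        # Scaling is an assumption; the abstract does not describe preprocessing.
        ("ann", make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=seed),
        )),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=seed)),
        # probability=True lets the SVM contribute class probabilities to soft voting.
        ("svm", make_pipeline(
            StandardScaler(),
            SVC(probability=True, random_state=seed),
        )),
    ]
    models = dict(base)
    # Soft voting averages the predicted class probabilities of the base models.
    models["evc"] = VotingClassifier(estimators=base, voting="soft")
    # Stacking trains a meta-learner on out-of-fold base-model predictions.
    models["esm"] = StackingClassifier(
        estimators=base,
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=5,
    )
    return models


def evaluate(models, X, y, n_splits=10, n_repeats=1, seed=0):
    """Mean ROC AUC per model under repeated stratified k-fold cross-validation."""
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean())
        for name, model in models.items()
    }
```

Calling `evaluate(build_models(), X, y)` on a feature matrix `X` and binary labels `y` returns one mean ROC AUC per model. Raising `n_repeats` gives repeated evaluations over different splits, which is only one possible reading of the 50 iterations mentioned in the abstract.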
The ESM, SVM, and GB models emerged as the top performers, demonstrating superior performance across metrics, including accuracy, sensitivity, specificity, precision, F1 score, and area under the receiver operating characteristic curve (ROC AUC). Despite data heterogeneity and variable selection limitations, the models showed high values for all metrics.
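For readers less familiar with the threshold-based metrics listed above, here is a minimal sketch of how accuracy, sensitivity, specificity, precision, and F1 score follow from a binary confusion matrix. The function name `binary_metrics`, the label convention (1 = PD, 0 = control), and returning 0.0 when a denominator is zero are assumptions for illustration, not details taken from the study.

```python
def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from label lists.

    Assumes labels are 1 (positive, e.g. PD) and 0 (negative, e.g. control).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Assumed convention: report 0.0 when the metric is undefined.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall: share of positives detected
    specificity = ratio(tn, tn + fp)  # share of negatives correctly rejected
    precision = ratio(tp, tp + fp)    # share of positive calls that are correct
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

ROC AUC is not included here because it depends on predicted scores or probabilities rather than on hard labels.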
ML integration with voice analysis, mainly through ESM, SVM, and GB, is promising for early PD diagnosis. Using multi-source data and a large sample size enhances our findings' validity, reliability, and generalizability.
Integrating advanced ML techniques with voice analysis demonstrates substantial potential for improving early PD detection, offering valuable tools for speech-language pathologists (SLPs). These findings provide clinically relevant insights that can be applied within the scope of SLP practice to refine diagnostic processes and facilitate early intervention.