Leveraging Deep Learning for Fine-Grained Categorization of Parkinson's Disease Progression Levels through Analysis of Vocal Acoustic Patterns.

Author information

Malekroodi Hadi Sedigh, Madusanka Nuwan, Lee Byeong-Il, Yi Myunggi

Affiliations

Industry 4.0 Convergence Bionics Engineering, Pukyong National University, Busan 48513, Republic of Korea.

Digital of Healthcare Research Center, Institute of Information Technology and Convergence, Pukyong National University, Busan 48513, Republic of Korea.

Publication information

Bioengineering (Basel). 2024 Mar 21;11(3):295. doi: 10.3390/bioengineering11030295.

PMID:38534569

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10968564/

Abstract

Speech impairments often emerge as one of the primary indicators of Parkinson's disease (PD), albeit not readily apparent in its early stages. While previous studies focused predominantly on binary PD detection, this research explored the use of deep learning models to automatically classify sustained vowel recordings into healthy controls, mild PD, or severe PD based on motor symptom severity scores. Popular convolutional neural network (CNN) architectures, VGG and ResNet, as well as vision transformers, Swin, were fine-tuned on log mel spectrogram image representations of the segmented voice data. Furthermore, the research investigated the effects of audio segment lengths and specific vowel sounds on the performance of these models. The findings indicated that implementing longer segments yielded better performance. The models showed strong capability in distinguishing PD from healthy subjects, achieving over 95% precision. However, reliably discriminating between mild and severe PD cases remained challenging. The VGG16 achieved the best overall classification performance with 91.8% accuracy and the largest area under the ROC curve. Furthermore, focusing analysis on the vowel /u/ could further improve accuracy to 96%. Applying visualization techniques like Grad-CAM also highlighted how CNN models focused on localized spectrogram regions while transformers attended to more widespread patterns. Overall, this work showed the potential of deep learning for non-invasive screening and monitoring of PD progression from voice recordings, but larger multi-class labeled datasets are needed to further improve severity classification.
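
The preprocessing described in the abstract (cutting each sustained-vowel recording into fixed-length segments and turning each segment into a log mel spectrogram) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 16 kHz sampling rate, 2-second segment length, 1024-point FFT, hop of 256 samples, and 64 mel bands are all assumed values, the helper names are hypothetical, and a synthetic two-tone signal stands in for a real voice recording.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale conversion.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft//2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def split_segments(signal, sr, seg_seconds):
    """Cut a recording into non-overlapping segments, dropping the short tail."""
    seg_len = int(round(seg_seconds * sr))
    n = len(signal) // seg_len
    return [signal[k * seg_len:(k + 1) * seg_len] for k in range(n)]

def log_mel_spectrogram(segment, sr, n_fft=1024, hop=256, n_mels=64):
    """Return a (n_mels, n_frames) log mel spectrogram in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    frames = np.stack([segment[t * hop:t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T  # (n_frames, n_mels)
    return 10.0 * np.log10(np.maximum(mel, 1e-10)).T

# Synthetic stand-in for a sustained vowel: 5 s of a 150 Hz tone plus one harmonic.
sr = 16000  # assumed sampling rate
t = np.arange(5 * sr) / sr
voice = 0.5 * np.sin(2 * np.pi * 150 * t) + 0.2 * np.sin(2 * np.pi * 300 * t)

segments = split_segments(voice, sr, seg_seconds=2.0)  # assumed segment length
specs = [log_mel_spectrogram(s, sr) for s in segments]
print(len(segments), specs[0].shape)
```

Each resulting array could then be rendered or resized into an image and passed to a pretrained image model for fine-tuning, as the abstract describes for VGG, ResNet, and Swin. Changing `seg_seconds` is one way to explore the segment-length effect the study reports.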

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/deda/10968564/b1ef6ee7c567/bioengineering-11-00295-g001.jpg
