Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson's Disease Using Deep Learning Methods.

Author information

Ríos-Urrego Cristian David, Escobar-Grisales Daniel, Orozco-Arroyave Juan Rafael

Affiliations

GITA Lab., Faculty of Engineering, University of Antioquia, Medellín 050010, Colombia.

LME Lab., University of Erlangen, 91054 Erlangen, Germany.

Publication information

Diagnostics (Basel). 2024 Dec 31;15(1):73. doi: 10.3390/diagnostics15010073.

DOI:10.3390/diagnostics15010073

PMID:39795601

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11720596/

Abstract

BACKGROUND/OBJECTIVES

Parkinson's disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal patterns in PD patients.

METHODS

This paper introduces, for the first time, a new methodology that performs the synchronous fusion of information extracted from speech recordings and their corresponding videos of lip movement, namely the bimodal approach.

RESULTS

Our results indicate that the introduced method is more accurate and suitable than unimodal approaches or classical asynchronous approaches that combine both sources of information but do not incorporate the underlying temporal information.

CONCLUSIONS

This study demonstrates that using a synchronous fusion strategy with concatenated projections based on attention mechanisms, i.e., speech-to-lips and lips-to-speech, exceeds previous results reported in the literature. Complementary information between lip movement and speech production is confirmed when advanced fusion strategies are employed. Finally, multimodal approaches, combining visual and speech signals, showed great potential to improve PD classification, generating more confident and robust models for clinical diagnostic support.
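The abstract names the fusion strategy but gives no architectural details, so the following is only a minimal sketch of what concatenating speech-to-lips and lips-to-speech attention projections could look like. It assumes two time-aligned feature sequences of equal dimension, omits the learned projection matrices a trained model would have, and summarizes each direction by mean pooling over time. The names `cross_attention` and `bimodal_fusion`, the random features, and all sizes are hypothetical and not taken from the paper.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention without learned weights (assumption).

    Each frame of `queries` attends over every frame of `keys_values`
    and is replaced by a weighted average of those frames.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the other modality's frames, column by column.
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)])
    return out


def mean_pool(seq):
    """Average a sequence of frame vectors over time."""
    n = len(seq)
    return [sum(col) / n for col in zip(*seq)]


def bimodal_fusion(speech, lips):
    """Concatenate the pooled speech-to-lips and lips-to-speech projections."""
    speech_to_lips = cross_attention(speech, lips)   # speech queries, lip keys/values
    lips_to_speech = cross_attention(lips, speech)   # lip queries, speech keys/values
    return mean_pool(speech_to_lips) + mean_pool(lips_to_speech)


# Toy data: T frames of D-dimensional features per modality (made up).
random.seed(0)
T, D = 5, 4
speech = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
lips = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]

fused = bimodal_fusion(speech, lips)
print(len(fused))  # 8 values: D per attention direction
```

In this sketch the fused vector would be passed to a downstream classifier. An asynchronous baseline, by contrast, would pool each modality separately before combining them, so no frame of one modality would ever be compared with the frames of the other.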

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb42/11720596/586879973fa2/diagnostics-15-00073-g001.jpg

Similar articles

A multimodal Parkinson quantification by fusing eye and gait motion patterns, using covariance descriptors, from non-invasive computer vision.

Comput Methods Programs Biomed. 2022 Mar;215:106607. doi: 10.1016/j.cmpb.2021.106607. Epub 2021 Dec 30.

The effect of two speech and language approaches on speech problems in people with Parkinson's disease: the PD COMM RCT.

Health Technol Assess. 2024 Oct;28(58):1-141. doi: 10.3310/ADWP8001.

FLP: Factor lattice pattern-based automated detection of Parkinson's disease and specific language impairment using recorded speech.

Comput Biol Med. 2024 May;173:108280. doi: 10.1016/j.compbiomed.2024.108280. Epub 2024 Mar 20.

Towards an automatic evaluation of the dysarthria level of patients with Parkinson's disease.

J Commun Disord. 2018 Nov-Dec;76:21-36. doi: 10.1016/j.jcomdis.2018.08.002. Epub 2018 Aug 20.

Early detection of Parkinson's disease using a multi area graph convolutional network.

Sci Rep. 2025 Feb 14;15(1):5561. doi: 10.1038/s41598-024-82027-0.

Computerized analysis of hypomimia and hypokinetic dysarthria for improved diagnosis of Parkinson's disease.

Heliyon. 2023 Oct 23;9(11):e21175. doi: 10.1016/j.heliyon.2023.e21175. eCollection 2023 Nov.

The effects of self-generated synchronous and asynchronous visual speech feedback on overt stuttering frequency.

J Commun Disord. 2009 May-Jun;42(3):235-44. doi: 10.1016/j.jcomdis.2009.02.002. Epub 2009 Feb 28.

A multimodal cross-transformer-based model to predict mild cognitive impairment using speech, language and vision.

Comput Biol Med. 2024 Nov;182:109199. doi: 10.1016/j.compbiomed.2024.109199. Epub 2024 Sep 26.

PIDGN: An explainable multimodal deep learning framework for early prediction of Parkinson's disease.

J Neurosci Methods. 2025 Mar;415:110363. doi: 10.1016/j.jneumeth.2025.110363. Epub 2025 Jan 18.
