Deep learning-based classification of speech disorder in stroke and hearing impairment.

Authors

Park Joo Kyung, Mun Sae Byeol, Kim Young Jae, Kim Kwang Gi

Affiliations

Department of Biomedical Engineering, College of Medicine, Gachon University, Gil Medical Center, Incheon, Republic of Korea.

Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences & Technology, Gachon University, Incheon, Republic of Korea.

Publication information

PLoS One. 2025 May 28;20(5):e0315286. doi: 10.1371/journal.pone.0315286. eCollection 2025.

DOI:10.1371/journal.pone.0315286

PMID:40435156

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12118888/

Abstract

BACKGROUND AND OBJECTIVE

Speech disorders can arise from various causes, including congenital conditions, neurological damage, diseases, and other disorders. Traditionally, medical professionals have used changes in voice to diagnose the underlying causes of these disorders. With the advancement of artificial intelligence (AI), new possibilities have emerged in this field. However, most existing studies primarily focus on comparing voice data between normal individuals and those with speech disorders. Research that classifies the causes of these disorders within the abnormal voice data, attributing them to specific etiologies, remains limited. Therefore, our objective was to classify the specific causes of speech disorders from voice data resulting from various conditions, such as stroke and hearing impairments (HI).

METHODS

We experimentally developed a deep learning model to analyze Korean speech disorder voice data caused by stroke and HI. Our goal was to classify the disorders caused by these specific conditions. To achieve effective classification, we employed the ResNet-18, Inception V3, and SEResNeXt-18 models for feature extraction and training processes.

RESULTS

The models demonstrated promising results, with area under the curve (AUC) values of 0.839 for ResNet-18, 0.913 for Inception V3, and 0.906 for SEResNeXt-18, respectively.
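The abstract reports AUC values but does not say how they were computed. As a minimal sketch of what the metric measures, the snippet below computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc`, the binary labeling (1 = stroke, 0 = HI), and the scores are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 class labels (assumed here: 1 = stroke, 0 = HI).
    scores: iterable of model scores for class 1, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six recordings (not taken from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.839 to 0.913 should be read.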

CONCLUSIONS

These outcomes suggest the feasibility of using AI to efficiently classify the origins of speech disorders through the analysis of voice data.

Figure (pone.0315286.g001): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03cf/12118888/1503dde85815/pone.0315286.g001.jpg
