Park Joo Kyung, Mun Sae Byeol, Kim Young Jae, Kim Kwang Gi
Department of Biomedical Engineering, College of Medicine, Gachon University, Gil Medical Center, Incheon, Republic of Korea.
Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences & Technology, Gachon University, Incheon, Republic of Korea.
PLoS One. 2025 May 28;20(5):e0315286. doi: 10.1371/journal.pone.0315286. eCollection 2025.
Speech disorders can arise from various causes, including congenital conditions, neurological damage, diseases, and other disorders. Traditionally, medical professionals have used changes in the voice to diagnose the underlying causes of these disorders. With the advancement of artificial intelligence (AI), new possibilities have emerged in this field. However, most existing studies focus on distinguishing the voice data of normal individuals from that of individuals with speech disorders; research that classifies abnormal voice data by the specific etiology of the disorder remains limited. Therefore, our objective was to classify the specific causes of speech disorders from voice data associated with different conditions, such as stroke and hearing impairment (HI).
We developed deep learning models to analyze Korean voice data from speech disorders caused by stroke and HI, with the goal of classifying which of these conditions caused each disorder. For effective classification, we employed the ResNet-18, Inception V3, and SEResNeXt-18 models for feature extraction and training.
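A minimal sketch of this type of setup is shown below; it is not the authors' code. It assumes the voice recordings are converted to log-mel spectrograms and fed to an ImageNet-pretrained ResNet-18 whose final layer is replaced with a two-class head (stroke vs. HI). File names, sampling rate, and hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn
import torchaudio
import torchvision

def wav_to_melspec(path, sample_rate=16000, n_mels=128):
    """Load a voice clip and convert it to a log-mel spectrogram 'image'."""
    waveform, sr = torchaudio.load(path)
    if sr != sample_rate:
        waveform = torchaudio.functional.resample(waveform, sr, sample_rate)
    mel = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate, n_mels=n_mels)(waveform)
    log_mel = torchaudio.transforms.AmplitudeToDB()(mel)
    # Repeat the single channel three times so the tensor matches the
    # 3-channel input expected by ImageNet-pretrained CNNs.
    return log_mel.repeat(3, 1, 1) if log_mel.size(0) == 1 else log_mel

# Replace the ImageNet classification head with a 2-class head.
model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # assumed labels: 0 = stroke, 1 = HI

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a single (hypothetical) labeled clip.
spec = wav_to_melspec("speech_sample.wav").unsqueeze(0)  # shape: (1, 3, n_mels, time)
label = torch.tensor([0])
optimizer.zero_grad()
loss = criterion(model(spec), label)
loss.backward()
optimizer.step()

The same pipeline would apply to Inception V3 and SEResNeXt-18 by swapping the backbone and its classification head; the source does not specify the preprocessing or training details.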
The models demonstrated promising results, with area under the curve (AUC) values of 0.839 for ResNet-18, 0.913 for Inception V3, and 0.906 for SEResNeXt-18.
These outcomes suggest the feasibility of using AI to efficiently classify the origins of speech disorders through the analysis of voice data.