Deep Learning Application for Vocal Fold Disease Prediction Through Voice Recognition: Preliminary Development Study.

Affiliations

Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.

Department of Otorhinolaryngology-Head and Neck Surgery, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City, Taiwan.

Publication information

J Med Internet Res. 2021 Jun 8;23(6):e25247. doi: 10.2196/25247.

Abstract

BACKGROUND

Dysphonia influences the quality of life by interfering with communication. However, a laryngoscopic examination is expensive and not readily accessible in primary care units. Experienced laryngologists are required to achieve an accurate diagnosis.

OBJECTIVE

This study sought to detect various vocal fold diseases through pathological voice recognition using artificial intelligence.

METHODS

We collected 189 normal voice samples and 552 samples of individuals with voice disorders, including vocal atrophy (n=224), unilateral vocal paralysis (n=50), organic vocal fold lesions (n=248), and adductor spasmodic dysphonia (n=30). The 741 samples were divided into 2 sets: 593 samples as the training set and 148 samples as the testing set. A convolutional neural network approach was applied to train the model, and findings were compared with those of human specialists.
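As a quick consistency check on the figures above, the following minimal sketch tallies the class counts and the split sizes. The dictionary keys and variable names are illustrative assumptions; only the numbers come from the abstract. The majority-class share is computed over the whole data set, since the abstract does not report the per-class composition of the testing set or whether the split was stratified.

```python
# Class counts as reported in the abstract (names are illustrative labels).
class_counts = {
    "normal": 189,
    "vocal atrophy": 224,
    "unilateral vocal paralysis": 50,
    "organic vocal fold lesions": 248,
    "adductor spasmodic dysphonia": 30,
}
n_train, n_test = 593, 148  # split sizes as reported

total = sum(class_counts.values())
disordered = total - class_counts["normal"]

# The reported totals should be internally consistent.
assert total == 741
assert disordered == 552
assert n_train + n_test == total

# Fraction of samples used for training (roughly an 80/20 split).
train_fraction = n_train / total

# Share of the largest class over the full data set: the accuracy a
# classifier would reach by always predicting that class, assuming the
# testing set has a similar class mix (an assumption, not a reported fact).
majority_class = max(class_counts, key=class_counts.get)
majority_share = class_counts[majority_class] / total

print(f"train fraction: {train_fraction:.3f}")
print(f"majority class: {majority_class} ({majority_share:.3f})")
```

The class imbalance is worth noting when reading the results: the two smallest classes together make up 80 of the 741 samples.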

RESULTS

The convolutional neural network model achieved a sensitivity of 0.66, a specificity of 0.91, and an overall accuracy of 66.9% for distinguishing normal voice, vocal atrophy, unilateral vocal paralysis, organic vocal fold lesions, and adductor spasmodic dysphonia. By comparison, the overall accuracy rates of the human specialists were 60.1% and 56.1% for the 2 laryngologists and 51.4% and 43.2% for the 2 general ear, nose, and throat doctors.
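For readers unfamiliar with how sensitivity and specificity are defined when there are more than 2 classes, the sketch below shows one common convention: one-vs-rest metrics per class, averaged across classes (macro averaging). The abstract does not state which averaging the authors used, so this is an assumption, and the 3-class confusion matrix is a made-up toy example, not data from the study.

```python
def per_class_metrics(cm):
    """One-vs-rest sensitivity and specificity for each class.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sensitivities, specificities = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # class k missed
        fp = sum(cm[i][k] for i in range(n)) - tp     # others called k
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))
    return sensitivities, specificities


def summarize(cm):
    """Return (macro sensitivity, macro specificity, overall accuracy)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sens, spec = per_class_metrics(cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return sum(sens) / n, sum(spec) / n, accuracy


# Toy 3-class confusion matrix (hypothetical values for illustration only).
toy_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [1, 1, 8],
]
macro_sens, macro_spec, acc = summarize(toy_cm)
print(f"sensitivity={macro_sens:.3f} specificity={macro_spec:.3f} accuracy={acc:.3f}")
```

In a one-vs-rest setting, the true negatives for each class pool the samples of all other classes, which is one reason specificity in a multiclass task often comes out higher than sensitivity.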

CONCLUSIONS

Voice alone could be used for common vocal fold disease recognition through a deep learning approach after training with our Mandarin pathological voice database. This artificial intelligence approach could be clinically useful for voice-based screening of general vocal fold disease, including in quick surveys and general health examinations. It can be applied during telemedicine in areas where primary care units lack laryngoscopic capabilities. It could support physicians in prescreening by reserving invasive examinations for cases in which automatic recognition or listening is problematic, or in which professional analysis of other clinical examination results raises doubts about the presence of pathology.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a89f/8241431/54c6cfcce025/jmir_v23i6e25247_fig1.jpg
