An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks.

Affiliations

Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 57168, Riyadh 21574, Saudi Arabia.

Division of Electronics Engineering, School of Engineering, Cochin University of Science and Technology, India.

Publication information

Comput Math Methods Med. 2022 Apr 4;2022:7814952. doi: 10.1155/2022/7814952. eCollection 2022.

DOI:10.1155/2022/7814952
PMID:35529259
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9071878/
Abstract

Diseases of internal organs other than the vocal folds can also affect a person's voice. As a result, voice problems are on the rise, even though they are frequently overlooked. According to a recent study, voice pathology detection systems can successfully help the assessment of voice abnormalities and enable the early diagnosis of voice pathology. For instance, in the early identification and diagnosis of voice problems, the automatic system for distinguishing healthy and diseased voices has gotten much attention. As a result, artificial intelligence-assisted voice analysis brings up new possibilities in healthcare. The work was aimed at assessing the utility of several automatic speech signal analysis methods for diagnosing voice disorders and suggesting a strategy for classifying healthy and diseased voices. The proposed framework integrates the efficacy of three voice characteristics: chroma, mel spectrogram, and mel frequency cepstral coefficient (MFCC). We also designed a deep neural network (DNN) capable of learning from the retrieved data and producing a highly accurate voice-based disease prediction model. The study describes a series of studies using the Saarbruecken Voice Database (SVD) to detect abnormal voices. The model was developed and tested using the vowels /a/, /i/, and /u/ pronounced in high, low, and average pitches. We also maintained the "continuous sentence" audio files collected from SVD to select how well the developed model generalizes to completely new data. The highest accuracy achieved was 77.49%, superior to prior attempts in the same domain. Additionally, the model attains an accuracy of 88.01% by integrating speaker gender information. The designed model trained on selected diseases can also obtain a maximum accuracy of 96.77% (cordectomy × healthy). As a result, the suggested framework is the best fit for the healthcare industry.
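
The abstract names chroma, mel spectrogram, and MFCC features but gives no extraction settings. As a rough illustration of how mel-spaced filter energies relate to cepstral coefficients, the sketch below assumes the common HTK-style mel formula, filter centers spaced evenly on the mel scale, and an unnormalized DCT-II. The function names, the 13-coefficient default, and the 1e-10 energy floor are illustrative assumptions, not the authors' configuration.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters bands spaced evenly in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]


def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II of a sequence, keeping the first n_coeffs terms."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def mfcc_from_mel_energies(mel_energies, n_coeffs=13):
    """Cepstral coefficients for one frame: log of mel energies, then DCT-II."""
    log_energies = [math.log(max(e, 1e-10)) for e in mel_energies]  # floor avoids log(0)
    return dct_ii(log_energies, n_coeffs)


if __name__ == "__main__":
    # Hypothetical frame: 8 mel-band energies between 0 and 8000 Hz.
    centers = mel_center_frequencies(8, 0.0, 8000.0)
    frame = [1.0, 2.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.25]
    print([round(c, 1) for c in centers])
    print(mfcc_from_mel_energies(frame, n_coeffs=4))
```

In practice these features would usually come from an audio library applied to the recorded vowels; the sketch only shows the arithmetic linking mel-band energies to MFCCs for a single frame.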

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/6b227ce3cb34/CMMM2022-7814952.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/9e9468549218/CMMM2022-7814952.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/8a3ecb429cbf/CMMM2022-7814952.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/42d43e5183de/CMMM2022-7814952.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/fb350549e244/CMMM2022-7814952.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/df1c75ab3b94/CMMM2022-7814952.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/991319d3c499/CMMM2022-7814952.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/78c33e911b3b/CMMM2022-7814952.008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/07c38769f27c/CMMM2022-7814952.009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/1bf53d7b25cf/CMMM2022-7814952.010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/4a8912d9b04d/CMMM2022-7814952.011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/52aca42dd53d/CMMM2022-7814952.012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/029f/9071878/45e8c5de1536/CMMM2022-7814952.013.jpg

Similar articles

1
An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks.
Comput Math Methods Med. 2022 Apr 4;2022:7814952. doi: 10.1155/2022/7814952. eCollection 2022.
2
Deep learning in automatic detection of dysphonia: Comparing acoustic features and developing a generalizable framework.
Int J Lang Commun Disord. 2023 Mar;58(2):279-294. doi: 10.1111/1460-6984.12783. Epub 2022 Sep 18.
3
Neurogenerative Disease Diagnosis in Cepstral Domain Using MFCC with Deep Learning.
Comput Math Methods Med. 2022 Apr 4;2022:4364186. doi: 10.1155/2022/4364186. eCollection 2022.
4
Intra- and Inter-database Study for Arabic, English, and German Databases: Do Conventional Speech Features Detect Voice Pathology?
J Voice. 2017 May;31(3):386.e1-386.e8. doi: 10.1016/j.jvoice.2016.09.009. Epub 2016 Oct 10.
5
Unraveling the complexities of pathological voice through saliency analysis.
Comput Biol Med. 2023 Nov;166:107566. doi: 10.1016/j.compbiomed.2023.107566. Epub 2023 Oct 14.
6
The Effect of the MFCC Frame Length in Automatic Voice Pathology Detection.
J Voice. 2024 Sep;38(5):975-982. doi: 10.1016/j.jvoice.2022.03.021. Epub 2022 Apr 27.
7
Investigation of Voice Pathology Detection and Classification on Different Frequency Regions Using Correlation Functions.
J Voice. 2017 Jan;31(1):3-15. doi: 10.1016/j.jvoice.2016.01.014. Epub 2016 Mar 15.
8
Discrimination between pathological and normal voices using GMM-SVM approach.
J Voice. 2011 Jan;25(1):38-43. doi: 10.1016/j.jvoice.2009.08.002. Epub 2010 Feb 4.
9
Voice pathology detection using optimized convolutional neural networks and explainable artificial intelligence-based analysis.
Comput Methods Biomech Biomed Engin. 2024 Nov;27(14):2041-2057. doi: 10.1080/10255842.2023.2270102. Epub 2023 Oct 18.
10
Voice pathology detection and classification from speech signals and EGG signals based on a multimodal fusion method.
Biomed Tech (Berl). 2021 Nov 29;66(6):613-625. doi: 10.1515/bmt-2021-0112. Print 2021 Dec 20.

Cited by

1
A safe and effective protocol for postdilution hemofiltration with regional citrate anticoagulation.
BMC Nephrol. 2024 Jul 9;25(1):218. doi: 10.1186/s12882-024-03659-y.
2
Retracted: An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks.
Comput Math Methods Med. 2023 Dec 13;2023:9829813. doi: 10.1155/2023/9829813. eCollection 2023.
3
A novel hybrid model integrating MFCC and acoustic parameters for voice disorder detection.
Sci Rep. 2023 Dec 20;13(1):22719. doi: 10.1038/s41598-023-49869-6.
4
An Experimental Analysis on Multicepstral Projection Representation Strategies for Dysphonia Detection.
Sensors (Basel). 2023 May 30;23(11):5196. doi: 10.3390/s23115196.

References

1
Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix.
Sensors (Basel). 2017 Jan 29;17(2):267. doi: 10.3390/s17020267.
2
Speech disorders in Parkinson's disease: early diagnostics and effects of medication and brain stimulation.
J Neural Transm (Vienna). 2017 Mar;124(3):303-334. doi: 10.1007/s00702-017-1676-0. Epub 2017 Jan 18.
3
Mobile Communication Devices, Ambient Noise, and Acoustic Voice Measures.
J Voice. 2017 Mar;31(2):248.e11-248.e23. doi: 10.1016/j.jvoice.2016.07.023. Epub 2016 Sep 29.
4
An Investigation of Multidimensional Voice Program Parameters in Three Different Databases for Voice Pathology Detection and Classification.
J Voice. 2017 Jan;31(1):113.e9-113.e18. doi: 10.1016/j.jvoice.2016.03.019. Epub 2016 Apr 19.
5
Investigation of Voice Pathology Detection and Classification on Different Frequency Regions Using Correlation Functions.
J Voice. 2017 Jan;31(1):3-15. doi: 10.1016/j.jvoice.2016.01.014. Epub 2016 Mar 15.
6
Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features.
Comput Math Methods Med. 2015;2015:956249. doi: 10.1155/2015/956249. Epub 2015 Nov 22.
7
Voice data mining for laryngeal pathology assessment.
Comput Biol Med. 2016 Feb 1;69:270-6. doi: 10.1016/j.compbiomed.2015.07.026. Epub 2015 Aug 10.
8
On combining information from modulation spectra and mel-frequency cepstral coefficients for automatic detection of pathological voices.
Logoped Phoniatr Vocol. 2011 Jul;36(2):60-9. doi: 10.3109/14015439.2010.528788. Epub 2010 Nov 12.
9
Using modulation spectra for voice pathology detection and classification.
Annu Int Conf IEEE Eng Med Biol Soc. 2009;2009:2514-7. doi: 10.1109/IEMBS.2009.5334850.
10
Voice assessment: updates on perceptual, acoustic, aerodynamic, and endoscopic imaging methods.
Curr Opin Otolaryngol Head Neck Surg. 2008 Jun;16(3):211-5. doi: 10.1097/MOO.0b013e3282fe96ce.