Convolutional Neural Networks for Pathological Voice Detection.

Authors

Wu Huiyi, Soraghan John, Lowit Anja, Di Caterina Gaetano

Publication information

Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1-4. doi: 10.1109/EMBC.2018.8513222.

DOI:10.1109/EMBC.2018.8513222
PMID:30440307
Abstract

Acoustic analysis using signal processing tools can extract voice features that distinguish pathological from healthy voices. The proposed work uses spectrograms of voice recordings from a voice database as the input to a Convolutional Neural Network (CNN) for automatic feature extraction and classification of disordered and normal voice. The novel classifier achieved 88.5%, 66.2% and 77.0% accuracy on the training, validation and testing data sets, respectively, on 482 normal and 482 organic dysphonia speech files. The results indicate that the proposed algorithm, evaluated on the Saarbruecken Voice Database, can be used effectively for screening pathological voice recordings.
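
The pipeline described in the abstract (spectrogram in, CNN classification out) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic test tone, the 16 kHz sample rate, the frame length and hop, the single convolution layer, and the random, untrained weights. It shows only the data flow, not the authors' architecture or results.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram with shape (freq_bins, frames).
    frame_len and hop are illustrative values, not the paper's settings."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (frames, freq_bins)
    return np.log1p(magnitude).T

def conv2d_valid(image, kernel):
    """Single-channel 'valid' 2-D cross-correlation, the core CNN operation."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def tiny_cnn_forward(spec, kernels, weights, bias):
    """Conv -> ReLU -> global average pooling -> sigmoid.
    Returns a score in (0, 1); with untrained weights it is meaningless."""
    feats = np.array([np.maximum(conv2d_valid(spec, k), 0.0).mean()
                      for k in kernels])
    logit = feats @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
sample_rate = 16000                      # assumed sample rate
t = np.arange(sample_rate) / sample_rate
# Synthetic stand-in for a one-second sustained vowel recording.
voice = np.sin(2 * np.pi * 150 * t) + 0.1 * rng.standard_normal(sample_rate)

spec = spectrogram(voice)                # shape (129, 124) for these settings
kernels = 0.1 * rng.standard_normal((4, 3, 3))   # four hypothetical 3x3 filters
weights = rng.standard_normal(4)
prob = float(tiny_cnn_forward(spec, kernels, weights, bias=0.0))
print(spec.shape, prob)
```

In a real system the kernels and output weights would be learned from labelled recordings, and the gap between the reported training (88.5%) and validation (66.2%) accuracy suggests that regularisation and careful data splitting matter for this task.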


Similar articles

1
Convolutional Neural Networks for Pathological Voice Detection.
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1-4. doi: 10.1109/EMBC.2018.8513222.
2
Design and Validation of a New Diagnostic Tool for the Differentiation of Pathological Voices in Parkinsonian Patients.
Adv Exp Med Biol. 2021;1339:77-83. doi: 10.1007/978-3-030-78787-5_11.
3
Deep learning in automatic detection of dysphonia: Comparing acoustic features and developing a generalizable framework.
Int J Lang Commun Disord. 2023 Mar;58(2):279-294. doi: 10.1111/1460-6984.12783. Epub 2022 Sep 18.
4
Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.
J Voice. 2019 Sep;33(5):634-641. doi: 10.1016/j.jvoice.2018.02.003. Epub 2018 Mar 19.
5
Investigation of Voice Pathology Detection and Classification on Different Frequency Regions Using Correlation Functions.
J Voice. 2017 Jan;31(1):3-15. doi: 10.1016/j.jvoice.2016.01.014. Epub 2016 Mar 15.
6
Convolutional neural network ensemble for Parkinson's disease detection from voice recordings.
Comput Biol Med. 2022 Feb;141:105021. doi: 10.1016/j.compbiomed.2021.105021. Epub 2021 Nov 9.
7
Validation of the Cepstral Spectral Index of Dysphonia (CSID) as a Screening Tool for Voice Disorders: Development of Clinical Cutoff Scores.
J Voice. 2016 Mar;30(2):130-44. doi: 10.1016/j.jvoice.2015.04.009. Epub 2015 Sep 8.
8
Voice pathology detection and classification from speech signals and EGG signals based on a multimodal fusion method.
Biomed Tech (Berl). 2021 Nov 29;66(6):613-625. doi: 10.1515/bmt-2021-0112. Print 2021 Dec 20.
9
The Effect of the MFCC Frame Length in Automatic Voice Pathology Detection.
J Voice. 2024 Sep;38(5):975-982. doi: 10.1016/j.jvoice.2022.03.021. Epub 2022 Apr 27.
10
Voice pathology detection using optimized convolutional neural networks and explainable artificial intelligence-based analysis.
Comput Methods Biomech Biomed Engin. 2024 Nov;27(14):2041-2057. doi: 10.1080/10255842.2023.2270102. Epub 2023 Oct 18.

Cited by

1
Research on automatic assessment of the severity of unilateral vocal cord paralysis based on Mel-spectrogram and convolutional neural networks.
Biomed Eng Online. 2025 Jun 21;24(1):76. doi: 10.1186/s12938-025-01401-9.
2
A hybrid approach for binary and multi-class classification of voice disorders using a pre-trained model and ensemble classifiers.
BMC Med Inform Decis Mak. 2025 May 1;25(1):177. doi: 10.1186/s12911-025-02978-w.
3
A Deep-Learning Model for Multi-class Audio Classification of Vocal Fold Pathologies in Office Stroboscopy.
Laryngoscope. 2025 Jul;135(7):2428-2436. doi: 10.1002/lary.32036. Epub 2025 Feb 5.
4
Multitask and Transfer Learning Approach for Joint Classification and Severity Estimation of Dysphonia.
IEEE J Transl Eng Health Med. 2023 Dec 7;12:233-244. doi: 10.1109/JTEHM.2023.3340345. eCollection 2024.
5
Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection.
IEEE/ACM Trans Audio Speech Lang Process. 2023;31:1348-1359. doi: 10.1109/taslp.2023.3261753. Epub 2023 Mar 28.
6
Using SincNet for Learning Pathological Voice Disorders.
Sensors (Basel). 2022 Sep 2;22(17):6634. doi: 10.3390/s22176634.
7
State-of-the-Art Deep Learning Methods on Electrocardiogram Data: Systematic Review.
JMIR Med Inform. 2022 Aug 15;10(8):e38454. doi: 10.2196/38454.
8
Lightweight Deep Learning Model for Assessment of Substitution Voicing and Speech after Laryngeal Carcinoma Surgery.
Cancers (Basel). 2022 May 11;14(10):2366. doi: 10.3390/cancers14102366.
9
Neurogenerative Disease Diagnosis in Cepstral Domain Using MFCC with Deep Learning.
Comput Math Methods Med. 2022 Apr 4;2022:4364186. doi: 10.1155/2022/4364186. eCollection 2022.
10
Continuous Speech for Improved Learning Pathological Voice Disorders.
IEEE Open J Eng Med Biol. 2022 Feb 14;3:25-33. doi: 10.1109/OJEMB.2022.3151233. eCollection 2022.