
Using SincNet for Learning Pathological Voice Disorders.

Affiliations

Department of Electrical Engineering, Yuan Ze University, Taoyuan 320, Taiwan.

Department of Otolaryngology Head and Neck Surgery, Far Eastern Memorial Hospital, New Taipei City 220, Taiwan.

Publication information

Sensors (Basel). 2022 Sep 2;22(17):6634. doi: 10.3390/s22176634.

DOI:10.3390/s22176634
PMID:36081092
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9460101/
Abstract

Deep learning techniques such as convolutional neural networks (CNNs) have been successfully applied to identify pathological voices. However, the major disadvantage of these advanced models is their lack of interpretability: they do not explain the predicted outcomes. This drawback is a bottleneck for promoting voice-disorder classification or detection systems, especially during the pandemic. In this paper, we propose replacing the very first layer of a commonly used CNN with a series of learnable sinc functions to develop an explainable SincNet system for classifying or detecting pathological voices. The sinc filters, which act as a front-end signal processor in SincNet, are critical for constructing a meaningful layer and directly extract the acoustic features from which the following networks generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. In our evaluations, the proposed approach achieves improvements of up to 7% in accuracy and 9% in sensitivity over conventional methods, demonstrating the superior performance of the SincNet system in predicting pathological input waveforms. More importantly, we offer possible explanations linking the system output to the speech features extracted by the first layer, based on our evaluation results.
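
As a rough illustration of the sinc front end described in the abstract, the sketch below builds one band-pass kernel from just two cutoff frequencies and convolves it with a raw waveform. It is a NumPy sketch under stated assumptions, not the authors' implementation: the function names `sinc_bandpass` and `gain_at`, the 300 Hz and 3000 Hz cutoffs, the 251-tap kernel, the Hamming window, and the 16 kHz sample rate are all illustrative choices, and the cutoffs are fixed here, whereas in a SincNet-style layer they would be parameters updated during training.

```python
import numpy as np

def sinc_bandpass(f_low_hz, f_high_hz, kernel_size=251, sample_rate=16000):
    """Windowed band-pass FIR kernel defined only by two cutoff frequencies."""
    # Sample index centred on zero so the kernel is symmetric.
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # Cutoffs normalised to cycles per sample.
    f1 = f_low_hz / sample_rate
    f2 = f_high_hz / sample_rate
    # Band-pass = difference of two low-pass sinc filters.
    # np.sinc(x) is the normalised sinc, sin(pi*x) / (pi*x).
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # A Hamming window smooths the truncation of the ideal filter.
    return kernel * np.hamming(kernel_size)

def gain_at(kernel, freq_hz, sample_rate=16000):
    """Magnitude of the kernel's frequency response at one frequency."""
    n = np.arange(len(kernel))
    return np.abs(np.sum(kernel * np.exp(-2j * np.pi * freq_hz / sample_rate * n)))

# Hypothetical cutoffs; a trained layer would hold one learned pair per filter.
kernel = sinc_bandpass(300.0, 3000.0)

# Stand-in for one second of recorded voice (random noise, not real data).
waveform = np.random.default_rng(0).standard_normal(16000)
features = np.convolve(waveform, kernel, mode="valid")
```

Because each filter is fully described by its low and high cutoff, the learned first layer can be read directly as a set of frequency bands, which is the kind of interpretability the abstract refers to.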

Similar articles

1. Using SincNet for Learning Pathological Voice Disorders.
Sensors (Basel). 2022 Sep 2;22(17):6634. doi: 10.3390/s22176634.
2. Classification of Voice Disorders Using a One-Dimensional Convolutional Neural Network.
J Voice. 2022 Jan;36(1):15-20. doi: 10.1016/j.jvoice.2020.02.009. Epub 2020 Mar 13.
3. Interpretable SincNet-based Deep Learning for Emotion Recognition from EEG brain activity.
Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:412-415. doi: 10.1109/EMBC46164.2021.9630427.
4. Convolutional Neural Networks for Pathological Voice Detection.
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1-4. doi: 10.1109/EMBC.2018.8513222.
5. EEG Emotion Classification Using an Improved SincNet-Based Deep Learning Model.
Brain Sci. 2019 Nov 14;9(11):326. doi: 10.3390/brainsci9110326.
6. Bioacoustic classification of avian calls from raw sound waveforms with an open-source deep learning architecture.
Sci Rep. 2021 Aug 3;11(1):15733. doi: 10.1038/s41598-021-95076-6.
7. Design and Validation of a New Diagnostic Tool for the Differentiation of Pathological Voices in Parkinsonian Patients.
Adv Exp Med Biol. 2021;1339:77-83. doi: 10.1007/978-3-030-78787-5_11.
8. Voice pathology detection using optimized convolutional neural networks and explainable artificial intelligence-based analysis.
Comput Methods Biomech Biomed Engin. 2024 Nov;27(14):2041-2057. doi: 10.1080/10255842.2023.2270102. Epub 2023 Oct 18.
9. Investigation of Voice Pathology Detection and Classification on Different Frequency Regions Using Correlation Functions.
J Voice. 2017 Jan;31(1):3-15. doi: 10.1016/j.jvoice.2016.01.014. Epub 2016 Mar 15.
10. Neurogenerative Disease Diagnosis in Cepstral Domain Using MFCC with Deep Learning.
Comput Math Methods Med. 2022 Apr 4;2022:4364186. doi: 10.1155/2022/4364186. eCollection 2022.

Cited by

1. Deep learning-based classification of speech disorder in stroke and hearing impairment.
PLoS One. 2025 May 28;20(5):e0315286. doi: 10.1371/journal.pone.0315286. eCollection 2025.
2. A Deep-Learning Model for Multi-class Audio Classification of Vocal Fold Pathologies in Office Stroboscopy.
Laryngoscope. 2025 Jul;135(7):2428-2436. doi: 10.1002/lary.32036. Epub 2025 Feb 5.
3. Laryngeal disease classification using voice data: Octave-band vs. mel-frequency filters.
Heliyon. 2024 Nov 30;10(24):e40748. doi: 10.1016/j.heliyon.2024.e40748. eCollection 2024 Dec 30.
4. New developments in the application of artificial intelligence to laryngology.
Curr Opin Otolaryngol Head Neck Surg. 2024 Dec 1;32(6):391-397. doi: 10.1097/MOO.0000000000000999. Epub 2024 Jul 24.
5. Classification of laryngeal diseases including laryngeal cancer, benign mucosal disease, and vocal cord paralysis by artificial intelligence using voice analysis.
Sci Rep. 2024 Apr 23;14(1):9297. doi: 10.1038/s41598-024-58817-x.

References

1. Continuous Speech for Improved Learning Pathological Voice Disorders.
IEEE Open J Eng Med Biol. 2022 Feb 14;3:25-33. doi: 10.1109/OJEMB.2022.3151233. eCollection 2022.
2. From Local Explanations to Global Understanding with Explainable AI for Trees.
Nat Mach Intell. 2020 Jan;2(1):56-67. doi: 10.1038/s42256-019-0138-9. Epub 2020 Jan 17.
3. Convolutional Neural Networks for Pathological Voice Detection.
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1-4. doi: 10.1109/EMBC.2018.8513222.
4. A Survey on Machine Learning Approaches for Automatic Detection of Voice Disorders.
J Voice. 2019 Nov;33(6):947.e11-947.e33. doi: 10.1016/j.jvoice.2018.07.014. Epub 2018 Oct 11.
5. Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.
J Voice. 2019 Sep;33(5):634-641. doi: 10.1016/j.jvoice.2018.02.003. Epub 2018 Mar 19.
6. Approximated and User Steerable tSNE for Progressive Visual Analytics.
IEEE Trans Vis Comput Graph. 2017 Jul;23(7):1739-1752. doi: 10.1109/TVCG.2016.2570755. Epub 2016 May 19.
7. Using Ambulatory Voice Monitoring to Investigate Common Voice Disorders: Research Update.
Front Bioeng Biotechnol. 2015 Oct 16;3:155. doi: 10.3389/fbioe.2015.00155. eCollection 2015.
8. Prevalence of voice disorders in teachers and the general population.
J Speech Lang Hear Res. 2004 Apr;47(2):281-93. doi: 10.1044/1092-4388(2004/023).