Voice pathology detection and classification from speech signals and EGG signals based on a multimodal fusion method.

Affiliations

School of Life Sciences, Tiangong University, Tianjin, China.

Tianjin Key Laboratory of Optoelectronic Detection Technology and System, Tianjin, China.

Publication information

Biomed Tech (Berl). 2021 Nov 29;66(6):613-625. doi: 10.1515/bmt-2021-0112. Print 2021 Dec 20.

DOI: 10.1515/bmt-2021-0112
PMID: 34845886

Abstract

Automatic voice pathology detection and classification play an important role in the diagnosis and prevention of voice disorders. To describe the pronunciation characteristics of patients with dysarthria more accurately and to improve pathological voice detection, this study proposes a detection method based on a multimodal network structure. First, speech signals and electroglottography (EGG) signals are converted from the time domain to frequency-domain spectrograms via a short-time Fourier transform (STFT). A Mel filter bank is then applied to the spectrograms to enhance the signals' harmonics and reduce noise. Second, a pre-trained convolutional neural network (CNN) serves as the backbone network to extract sound state features and vocal cord vibration features from the two signals. To improve classification, the fused features are fed into a long short-term memory (LSTM) network for voice feature selection and enhancement. On the Saarbrucken Voice Database (SVD), the proposed system achieves 95.73% accuracy, a 96.10% F1-score, and 96.73% recall, offering a new method for pathological speech detection.
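The first stage described in the abstract (STFT followed by a Mel filter bank, applied to both the speech and the EGG channel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sampling rate, FFT size, hop length, number of Mel bands, the common 2595·log10(1 + f/700) Mel formula, and the synthetic sine-wave stand-ins for real recordings are all assumptions chosen for illustration, since the abstract does not state these settings. The CNN backbone, feature fusion, and LSTM stages are not reproduced here.

```python
import numpy as np


def stft_power(signal, n_fft=512, hop=128):
    """Power spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Result shape: (n_frames, n_fft // 2 + 1)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(sr, n_fft, n_mels=40):
    """Triangular filters evenly spaced on the Mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        left, center, right = hz_edges[m:m + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel(signal, sr, n_fft=512, hop=128, n_mels=40):
    """Log-Mel spectrogram of shape (n_frames, n_mels)."""
    power = stft_power(signal, n_fft, hop)
    mel = power @ mel_filter_bank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)  # small offset avoids log(0)


# Synthetic one-second stand-ins (assumed, not real SVD recordings).
sr = 16000
t = np.arange(sr) / sr
speech = np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 440 * t)
egg = np.sin(2 * np.pi * 220 * t)

speech_mel = log_mel(speech, sr)
egg_mel = log_mel(egg, sr)
print(speech_mel.shape, egg_mel.shape)
```

In a pipeline like the one the abstract outlines, each of these two time-frequency maps would be passed to a CNN feature extractor before fusion; how the maps are resized or normalized for the pre-trained backbone is not specified in the abstract.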

