Dual-path transformer-based network with equalization-generation components prediction for flexible vibrational sensor speech enhancement in the time domain.

Affiliations

High-tech Institute, Fan Gong-ting South Street on the 12th, Weifang 261000, China.

Command and Control Engineering College, Army Engineering University, Nanjing 210007, China.

Publication information

J Acoust Soc Am. 2022 May;151(5):2814. doi: 10.1121/10.0010316.

DOI: 10.1121/10.0010316
PMID: 35649897

Abstract

The flexible vibrational sensor (FVS) has the potential to become a popular wearable communication device because of its natural noise-shielding characteristics and soft materials. However, FVS speech suffers from a severe loss of frequency components. To improve speech quality, a time-domain neural network model based on the dual-path transformer combined with equalization-generation components prediction (DPT-EGNet) is proposed. The DPT-EGNet consists of five modules: the pre-processing, dual-path transformer, equalization, generation, and post-processing modules. The dual-path transformer module extracts the local and global contextual relationships of long-term speech sequences, which is highly beneficial for inferring the missing components. The equalization and generation modules are designed according to the characteristics of FVS speech and further improve speech quality by simulating the inverse of the speech distortion process. The experimental results demonstrate that the proposed model effectively improves the quality of FVS speech: the average perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI), and composite measure for overall speech quality (COVL) scores across three male and three female speakers show relative increases of 64.19%, 29.63%, and 101.37%, respectively, outperforming baseline models developed in different domains. The proposed model also has significantly lower complexity than the other models.
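The abstract does not spell out how the dual-path transformer module sees "local" and "global" context. A common arrangement in dual-path models is to cut the long sequence into overlapping chunks, process each chunk on its own (local path), and then process the same position across all chunks (global path). The sketch below illustrates only that index bookkeeping in plain Python. The function names, chunk length, and hop size are assumptions made for illustration and are not taken from the paper; in a real model each view would be passed to a transformer layer rather than returned as a list.

```python
def segment(samples, chunk_len, hop):
    """Split a 1-D sequence into overlapping chunks, zero-padding the tail.

    chunk_len and hop are illustrative values, not the paper's settings.
    """
    if chunk_len <= 0 or hop <= 0:
        raise ValueError("chunk_len and hop must be positive")
    chunks = []
    start = 0
    while True:
        chunk = list(samples[start:start + chunk_len])
        chunk += [0.0] * (chunk_len - len(chunk))  # pad the last chunk if short
        chunks.append(chunk)
        if start + chunk_len >= len(samples):
            break
        start += hop
    return chunks


def intra_chunk_views(chunks):
    """Local path: every chunk is one short sequence (short-range context)."""
    return [list(chunk) for chunk in chunks]


def inter_chunk_views(chunks):
    """Global path: position i of every chunk forms one sequence (long-range context)."""
    return [list(column) for column in zip(*chunks)]


# Toy "signal" of ten samples, chunks of four samples with 50% overlap.
signal = [float(n) for n in range(10)]
chunks = segment(signal, chunk_len=4, hop=2)
local_views = intra_chunk_views(chunks)
global_views = inter_chunk_views(chunks)
```

Here `local_views` holds four chunks of four neighbouring samples each, while `global_views[0]` collects the first sample of every chunk, so a layer applied to it can relate samples that are far apart in the original sequence.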

Similar articles

1
Dual-path transformer-based network with equalization-generation components prediction for flexible vibrational sensor speech enhancement in the time domain.
J Acoust Soc Am. 2022 May;151(5):2814. doi: 10.1121/10.0010316.
2
Smartphone-based real-time speech enhancement for improving hearing aids speech perception.
Annu Int Conf IEEE Eng Med Biol Soc. 2016 Aug;2016:5885-5888. doi: 10.1109/EMBC.2016.7592067.
3
Improved Transformer-Based Dual-Path Network with Amplitude and Complex Domain Feature Fusion for Speech Enhancement.
Entropy (Basel). 2023 Jan 26;25(2):228. doi: 10.3390/e25020228.
4
Objective measures of perceptual quality for predicting speech intelligibility in sensorineural hearing loss.
Annu Int Conf IEEE Eng Med Biol Soc. 2015 Aug;2015:5577-80. doi: 10.1109/EMBC.2015.7319656.
5
Towards real-world objective speech quality and intelligibility assessment using speech-enhancement residuals and convolutional long short-term memory networks.
J Acoust Soc Am. 2020 Nov;148(5):3348. doi: 10.1121/10.0002702.
6
Speech Enhancement by Multiple Propagation through the Same Neural Network.
Sensors (Basel). 2022 Mar 22;22(7):2440. doi: 10.3390/s22072440.
7
Comparison of different forms of compression using wearable digital hearing aids.
J Acoust Soc Am. 1999 Dec;106(6):3603-19. doi: 10.1121/1.428213.
8
Speech quality evaluation of a sparse coding shrinkage noise reduction algorithm with normal hearing and hearing impaired listeners.
Hear Res. 2015 Sep;327:175-85. doi: 10.1016/j.heares.2015.07.019. Epub 2015 Jul 29.
9
Effect of Energy Equalization on the Intelligibility of Speech in Fluctuating Background Interference for Listeners With Hearing Impairment.
Trends Hear. 2017 Jan-Dec;21:2331216517710354. doi: 10.1177/2331216517710354.
10
Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners.
J Acoust Soc Am. 2017 Mar;141(3):1985. doi: 10.1121/1.4977197.

Cited by

1
Deep learning-enhanced anti-noise triboelectric acoustic sensor for human-machine collaboration in noisy environments.
Nat Commun. 2025 May 8;16(1):4276. doi: 10.1038/s41467-025-59523-6.