CNN-based noise reduction for multi-channel speech enhancement system with discrete wavelet transform (DWT) preprocessing.

Authors

Cherukuru Pavani, Mustafa Mumtaz Begum

Affiliations

Department of Software Engineering, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia.

Department of Information Science, Dayananda Sagar Academy of Technology and Management, Bangalore, Karnataka, India.

Publication information

PeerJ Comput Sci. 2024 Feb 28;10:e1901. doi: 10.7717/peerj-cs.1901. eCollection 2024.

DOI:10.7717/peerj-cs.1901

PMID:38435554

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10909157/

Abstract

Multi-channel speech enhancement (MCSE) systems apply speech enhancement algorithms at multiple levels to improve the quality of speech signals in noisy environments. Many existing algorithms are used to filter noise in such systems, typically as a pre-processor that reduces noise and improves speech quality. However, they may perform poorly at low signal-to-noise ratios (SNRs). Speech devices are exposed to all kinds of environmental noise, which can reach high levels. The objective of this research is to conduct a noise reduction experiment for an MCSE system under stationary and non-stationary environmental noise at varying speech-signal SNR levels. The experiments examined how well the existing and the proposed MCSE systems filter environmental noise from low to high SNRs (-10 dB to 20 dB). They were conducted using the AURORA and LibriSpeech datasets, which contain different types of environmental noise. The existing MCSE (BAV-MCSE) uses beamforming, adaptive noise reduction and voice activity detection (BAV) algorithms to filter noise from speech signals. The proposed MCSE (DWT-CNN-MCSE) system is based on discrete wavelet transform (DWT) preprocessing and a convolutional neural network (CNN) that denoises the noisy input speech signals to improve accuracy. The performance of the existing BAV-MCSE and the proposed DWT-CNN-MCSE was measured using spectrogram analysis and word recognition rate (WRR). The existing BAV-MCSE achieved its highest WRR, 93.77%, at a high SNR (20 dB), but averaged only 5.64% across the different noises at a low SNR (-10 dB). The proposed DWT-CNN-MCSE system performed well at low SNR, reaching a WRR of 70.55% at -10 dB SNR, which is also where it showed its largest improvement (64.91 percentage points of WRR over BAV-MCSE).
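
To make the SNR conditions and the DWT preprocessing step more concrete, the following is a minimal Python sketch, not the authors' implementation. It mixes noise into a clean signal at a target SNR and applies one level of a Haar DWT. The Haar wavelet, the synthetic sine "speech", the Gaussian noise, and all function names (`power`, `mix_at_snr`, `haar_dwt`, `haar_idwt`) are assumptions chosen for illustration; the abstract does not state which wavelet or decomposition depth was used, and the CNN denoiser itself is not shown.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`,
    then add it to `speech`. Returns (noisy signal, scaled noise)."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(speech, scaled)], scaled


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT.
    Returns (approximation, detail) coefficients, each half as long."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    r = math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / r for a, b in pairs]
    detail = [(a - b) / r for a, b in pairs]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of `haar_dwt`: rebuilds the original samples."""
    r = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / r, (a - d) / r))
    return out


# Illustrative data only: a sine wave stands in for clean speech.
random.seed(0)
n = 1024
speech = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

# Lowest SNR condition mentioned in the abstract (-10 dB).
noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=-10.0)
measured_snr_db = 10 * math.log10(power(speech) / power(scaled_noise))

# DWT preprocessing: split into low-frequency and high-frequency bands.
approx, detail = haar_dwt(noisy)
rebuilt = haar_idwt(approx, detail)
max_error = max(abs(a - b) for a, b in zip(noisy, rebuilt))

print(f"measured SNR: {measured_snr_db:.2f} dB")
print(f"coefficients per band: {len(approx)}")
print(f"max reconstruction error: {max_error:.2e}")
```

In a pipeline of the kind the abstract describes, the approximation and detail coefficients would presumably be the input to the CNN denoiser; how the coefficients are fed to the network is not specified in the abstract.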

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c154/10909157/56caf5f134f8/peerj-cs-10-1901-g001.jpg

Similar articles

CNN-based noise reduction for multi-channel speech enhancement system with discrete wavelet transform (DWT) preprocessing.

PeerJ Comput Sci. 2024 Feb 28;10:e1901. doi: 10.7717/peerj-cs.1901. eCollection 2024.

WaveCNet: Wavelet Integrated CNNs to Suppress Aliasing Effect for Noise-Robust Image Classification.

IEEE Trans Image Process. 2021;30:7074-7089. doi: 10.1109/TIP.2021.3101395. Epub 2021 Aug 10.

Fetal phonocardiogram signals denoising using improved complete ensemble (EMD) with adaptive noise and optimal thresholding of wavelet coefficients.

Biomed Tech (Berl). 2022 Jun 1;67(4):237-247. doi: 10.1515/bmt-2022-0006. Print 2022 Aug 26.

Wearable Hearing Device Spectral Enhancement Driven by Non-Negative Sparse Coding-Based Residual Noise Reduction.

Sensors (Basel). 2020 Oct 10;20(20):5751. doi: 10.3390/s20205751.

Denoising Brain Images with the Aid of Discrete Wavelet Transform and Monarch Butterfly Optimization with Different Noises.

J Med Syst. 2018 Sep 22;42(11):207. doi: 10.1007/s10916-018-1069-4.

A wavelet-based noise reduction algorithm and its clinical evaluation in cochlear implants.

PLoS One. 2013 Sep 26;8(9):e75662. doi: 10.1371/journal.pone.0075662. eCollection 2013.

Variational mode decomposition based ECG denoising using non-local means and wavelet domain filtering.

Australas Phys Eng Sci Med. 2018 Dec;41(4):891-904. doi: 10.1007/s13246-018-0685-0. Epub 2018 Sep 6.

Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients.

Interspeech. 2019 Sep;2019:4265-4269. doi: 10.21437/interspeech.2019-1850.

Noise-robust speech triage.

J Acoust Soc Am. 2018 Apr;143(4):2313. doi: 10.1121/1.5031029.

Wavelet speech enhancement algorithm using exponential semi-soft mask filtering.

Bioengineered. 2016 Sep 2;7(5):352-356. doi: 10.1080/21655979.2016.1197617. Epub 2016 Jul 19.

Cited by

Multichannel speech enhancement for automatic speech recognition: a literature review.

PeerJ Comput Sci. 2025 Mar 27;11:e2772. doi: 10.7717/peerj-cs.2772. eCollection 2025.

References

A New Framework for CNN-Based Speech Enhancement in the Time Domain.

IEEE/ACM Trans Audio Speech Lang Process. 2019 Jul;27(7):1179-1188. doi: 10.1109/taslp.2019.2913512. Epub 2019 Apr 29.

Wearable Hearing Device Spectral Enhancement Driven by Non-Negative Sparse Coding-Based Residual Noise Reduction.

Sensors (Basel). 2020 Oct 10;20(20):5751. doi: 10.3390/s20205751.

Supervised Speech Separation Based on Deep Learning: An Overview.

IEEE/ACM Trans Audio Speech Lang Process. 2018 Oct;26(10):1702-1726. doi: 10.1109/TASLP.2018.2842159. Epub 2018 May 30.
