Lee Gihyoun, Na Sung Dae, Cho Jin-Ho, Kim Myoung Nam
Department of Medical & Biological Engineering, Graduate School, Kyungpook National University, 680, Gukchaebosang-ro, Jung-gu, Daegu 700-842, Korea.
School of Electronics Engineering, College of IT Engineering, Kyungpook National University, 680, Gukchaebosang-ro, Jung-gu, Daegu 700-842, Korea.
Biomed Mater Eng. 2014;24(6):3295-301. doi: 10.3233/BME-141152.
This paper presents a voice activity detection (VAD) approach that uses a perceptual wavelet entropy neighbor slope (PWENS) and targets low signal-to-noise ratio (SNR) environments with a variety of noise types. The study is based on acoustic features that show a large entropy variance in each wavelet critical band. The speech signal is decomposed by the proposed perceptual wavelet packet decomposition (PWPD), and the VAD function is extracted by PWENS. The final VAD decision is made by the proposed decision rule, which uses two memory buffers. To evaluate the performance of the VAD decision, many speech samples and a variety of SNR conditions were used in the experiment. The performance is confirmed using objective indexes such as a graph of the VAD decision and the relative error rate.
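The abstract does not give the PWPD tree or the exact PWENS formula, so the following is only a rough sketch of the general idea (sub-band decomposition, per-band entropy, slope between neighboring bands), not the authors' method. It assumes a uniform Haar wavelet packet tree in place of the perceptual one, Shannon entropy of normalised coefficient energies, and a hypothetical summary feature (the sum of absolute entropy differences between adjacent bands). All function names and parameters are illustrative, and the two-buffer decision rule is not modelled.

```python
import numpy as np


def haar_packet_bands(frame, levels=3):
    """Split a frame into 2**levels sub-bands with a full Haar packet tree.

    Assumption: a uniform tree stands in for the paper's perceptual
    (critical-band) tree. Bands are returned in tree order, which is not
    strictly ascending frequency order. len(frame) must be divisible by
    2**levels.
    """
    bands = [np.asarray(frame, dtype=float)]
    for _ in range(levels):
        next_bands = []
        for b in bands:
            even, odd = b[0::2], b[1::2]
            next_bands.append((even + odd) / np.sqrt(2.0))  # low-pass half
            next_bands.append((even - odd) / np.sqrt(2.0))  # high-pass half
        bands = next_bands
    return bands


def band_entropy(coeffs, eps=1e-12):
    """Shannon entropy of the normalised coefficient energies of one band."""
    energy = np.asarray(coeffs, dtype=float) ** 2
    p = energy / (energy.sum() + eps)
    return float(-np.sum(p * np.log(p + eps)))


def neighbor_slope_feature(frame, levels=3):
    """Hypothetical frame-level feature: sum of |entropy differences|
    between adjacent bands (an assumed reading of 'neighbor slope')."""
    entropies = np.array(
        [band_entropy(b) for b in haar_packet_bands(frame, levels)]
    )
    return float(np.sum(np.abs(np.diff(entropies))))
```

A frame-by-frame detector would compute `neighbor_slope_feature` on successive frames (for example 256 samples each) and compare the result with a threshold; how that threshold is set and smoothed is exactly the part the paper's decision rule covers and this sketch leaves open.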