Department of Computer Science and Technology, Anhui University of Finance and Economics, Bengbu 233030, China.
School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China.
Sensors (Basel). 2023 Mar 10;23(6):3015. doi: 10.3390/s23063015.
Singing-voice separation is the task of separating a singing voice from its musical accompaniment. In this paper, we propose a novel unsupervised method for extracting the singing voice from the background of a musical mixture. The method modifies robust principal component analysis (RPCA) by applying weighting based on a gammatone filterbank and vocal activity detection. Although RPCA is useful for separating voices from a music mixture, it fails when one singular value, such as that associated with drums, is much larger than the others (e.g., those of the accompanying instruments). To address this, the proposed approach exploits the differing values between the low-rank (background) and sparse (singing voice) matrices. Additionally, we propose an extended RPCA that operates on the cochleagram by applying coalescent masking on the gammatone. Finally, we use vocal activity detection to improve the separation by eliminating the residual music signal. Evaluation results show that the proposed approach achieves better separation than RPCA on the ccMixter and DSD100 datasets.
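For orientation, the baseline that the abstract modifies can be sketched as follows. This is a minimal sketch of plain RPCA (principal component pursuit solved with an inexact augmented Lagrangian scheme), not the weighted, gammatone-based variant proposed in the paper. The function names `soft_threshold` and `rpca`, the default `lam = 1/sqrt(max(m, n))`, and the step-size constants are assumptions chosen for illustration. Applied to a magnitude spectrogram, the low-rank part `L` would approximate the repetitive accompaniment and the sparse part `S` the singing voice.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise shrinkage: move every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low-rank L plus sparse S by minimizing
    ||L||_* + lam * ||S||_1 subject to L + S = M.

    Plain (unweighted) RPCA; all parameter defaults are assumptions.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))

    norm_fro = np.linalg.norm(M, "fro")
    norm_two = np.linalg.norm(M, 2)  # largest singular value of M

    # Dual variable and penalty parameter initialization.
    Y = M / max(norm_two, np.abs(M).max() / lam)
    mu = 1.25 / norm_two
    mu_max = mu * 1e7
    rho = 1.5  # growth factor for the penalty parameter

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values by 1/mu.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: shrink the entries by lam/mu.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # Dual ascent on the constraint residual.
        R = M - L - S
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_fro:
            break
    return L, S
```

Note that the singular-value shrinkage step uses the same threshold `1/mu` for every singular value. That uniform treatment is the point at which a weighted scheme, such as the one the abstract describes, would differ.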