Towards online maximum-likelihood-based speech clustering and separation.

Affiliation

NTT Communication Science Laboratories, 2-4 Hikaridai Seika-cho, 619-0237 Kyoto, Japan.

Publication information

J Acoust Soc Am. 2013 May;133(5):EL339-45. doi: 10.1121/1.4795851.

Abstract

This paper introduces an approach to online speech source clustering and separation that uses multichannel location information within a recursive expectation-maximization (EM) algorithm. Specifically, the normalized multichannel speech-recording vector serves as the feature vector and is modeled with a Watson mixture model. The model parameters are estimated online by maximizing the data likelihood at every time-frequency slot, so the approach can continuously adjust the speech clusters. Promising results are obtained for two speakers with speaker movement, showing the advantage of the proposed approach over the batch EM algorithm.
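The recursive update described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a complex Watson mixture with a fixed concentration `kappa` shared by all clusters (so the normalizing constant cancels in the posteriors), a constant step size `step`, and a principal-eigenvector update for each cluster direction. The class name `OnlineWatsonMixture` and all parameter values are hypothetical.

```python
import numpy as np


def normalize(x):
    """Scale a (nonzero) multichannel observation to unit norm."""
    return x / np.linalg.norm(x)


class OnlineWatsonMixture:
    """Illustrative recursive EM for a Watson mixture (assumed, simplified form)."""

    def __init__(self, init_dirs, kappa=20.0, step=0.05):
        # Cluster directions a_k (unit-norm complex vectors).
        self.a = [normalize(np.asarray(d, dtype=complex)) for d in init_dirs]
        n_clusters = len(self.a)
        # Mixture weights, initialized uniformly.
        self.alpha = np.full(n_clusters, 1.0 / n_clusters)
        # Running weighted scatter matrices, one per cluster.
        self.R = [np.outer(d, d.conj()) for d in self.a]
        self.kappa = kappa  # assumed fixed and shared by all clusters
        self.step = step    # assumed constant forgetting/step size

    def update(self, x):
        """Process one time-frequency observation and return its posteriors."""
        z = normalize(np.asarray(x, dtype=complex))

        # E-step: posterior of each cluster, proportional to
        # alpha_k * exp(kappa * |a_k^H z|^2).
        proj = np.array([abs(np.vdot(d, z)) ** 2 for d in self.a])
        logp = np.log(self.alpha) + self.kappa * proj
        gamma = np.exp(logp - logp.max())
        gamma /= gamma.sum()

        # Recursive update of the sufficient statistics.
        zz = np.outer(z, z.conj())
        self.alpha = (1 - self.step) * self.alpha + self.step * gamma
        for k, g in enumerate(gamma):
            self.R[k] = (1 - self.step) * self.R[k] + self.step * g * zz
            # M-step: direction = principal eigenvector of the scatter matrix
            # (eigh returns eigenvalues in ascending order).
            _, vecs = np.linalg.eigh(self.R[k])
            self.a[k] = vecs[:, -1]
        return gamma
```

Calling `update` once per time-frequency slot returns soft cluster posteriors that could serve as separation masks. Because the statistics are exponentially forgotten, the cluster directions can drift as a speaker moves, which is the behavior a batch EM fit lacks.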

