GFM-based methods for speaker identification.

Affiliation

Netaji Subhas Institute of Technology, University of Delhi, New Delhi 110 078, India.

Publication information

IEEE Trans Cybern. 2013 Jun;43(3):1047-58. doi: 10.1109/TSMCB.2012.2223461. Epub 2012 Oct 26.

Abstract

This paper presents three novel methods for speaker identification, two of which use both the continuous-density hidden Markov model (HMM) and the generalized fuzzy model (GFM), which combines the advantages of the Mamdani and Takagi-Sugeno models. In the first method, the HMM extracts a shape-based batch feature vector, which is then fitted with the GFM to identify the speaker. The second method uses the Gaussian mixture model (GMM) together with the GFM to identify speakers. The third method is inspired by the way humans draw on mutual acquaintances when identifying a speaker. To assess the validity of the proposed models [HMM-GFM, GMM-GFM, and HMM-GFM (fusion)] in a real-life scenario, they are tested on the VoxForge speech corpus and on a subset of the 2003 National Institute of Standards and Technology evaluation data set. The models are also evaluated on the VoxForge speech corpus corrupted by mixing in different types of noise at different signal-to-noise ratios, and their performance is found to be superior to that of well-known models.
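
The noise-corruption step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the noisy signals were mixed, so the code below assumes the common convention of scaling the noise so that the ratio of average speech power to average noise power matches a target signal-to-noise ratio in decibels. The function names `mix_at_snr` and `measured_snr_db` and the toy signals are hypothetical and are not taken from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so the mixture has the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Compute the SNR of a mixture, given the clean speech it was built from."""
    residual = [m - s for m, s in zip(mixture, speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / residual_power)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins for real recordings (assumption): a 440 Hz tone and white Gaussian noise
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for snr_db in (0, 10, 20):
        noisy = mix_at_snr(speech, noise, snr_db)
        print(snr_db, round(measured_snr_db(speech, noisy), 6))
```

In a real evaluation, the speech would come from corpus recordings and the noise from recorded noise types, with the noise trimmed or looped to the length of the utterance before mixing.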
