GFM-based methods for speaker identification.

Affiliation

Netaji Subhas Institute of Technology, University of Delhi, New Delhi 110 078, India.

Publication information

IEEE Trans Cybern. 2013 Jun;43(3):1047-58. doi: 10.1109/TSMCB.2012.2223461. Epub 2012 Oct 26.

DOI:10.1109/TSMCB.2012.2223461

PMID:23193244

Abstract

This paper presents three novel methods for speaker identification of which two methods utilize both the continuous density hidden Markov model (HMM) and the generalized fuzzy model (GFM), which has the advantages of both Mamdani and Takagi-Sugeno models. In the first method, the HMM is utilized for the extraction of shape-based batch feature vector that is fitted with the GFM to identify the speaker. On the other hand, the second method makes use of the Gaussian mixture model (GMM) and the GFM for the identification of speakers. Finally, the third method has been inspired by the way humans cash in on the mutual acquaintances while identifying a speaker. To see the validity of the proposed models [HMM-GFM, GMM-GFM, and HMM-GFM (fusion)] in a real-life scenario, they are tested on VoxForge speech corpus and on the subset of the 2003 National Institute of Standards and Technology evaluation data set. These models are also evaluated on the corrupted VoxForge speech corpus by mixing with different types of noisy signals at different values of signal-to-noise ratios, and their performance is found superior to that of the well-known models.
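The noise-corruption step described in the abstract (mixing speech with a noise signal at a chosen signal-to-noise ratio) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names `mean_power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, signals are assumed to be plain lists of float samples, and noise shorter than the speech is assumed to be repeated cyclically. It only shows the standard power-ratio definition of SNR in decibels.

```python
import math


def mean_power(signal):
    """Average power of a signal given as a list of float samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the result has the requested SNR in dB.

    Assumption: noise shorter than the speech is repeated cyclically.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Match the noise length to the speech length.
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # SNR_dB = 10 * log10(p_speech / (scale^2 * p_noise)), solved for scale.
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixed):
    """SNR in dB of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixed)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))
```

Calling `mix_at_snr(speech, noise, 5)` on a clean utterance and a noise recording would give a mixture whose measured SNR, relative to the clean signal, is 5 dB up to floating-point error. Repeating this over several noise types and SNR values would produce a corrupted test set of the kind the abstract mentions.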

Similar articles

Adaptation of hidden Markov models for recognizing speech of reduced frame rate.

IEEE Trans Cybern. 2013 Dec;43(6):2114-21. doi: 10.1109/TCYB.2013.2240450.

Maximum confidence hidden markov modeling for face recognition.

IEEE Trans Pattern Anal Mach Intell. 2008 Apr;30(4):606-16. doi: 10.1109/TPAMI.2007.70715.

Comments on "a separable low complexity 2D HMM with application to face recognition".

IEEE Trans Pattern Anal Mach Intell. 2007 Feb;29(2):368. doi: 10.1109/TPAMI.2007.27.

Fuzzy Markov random fields versus chains for multispectral image segmentation.

IEEE Trans Pattern Anal Mach Intell. 2006 Nov;28(11):1753-67. doi: 10.1109/TPAMI.2006.228.

Detecting objects of variable shape structure with hidden state shape models.

IEEE Trans Pattern Anal Mach Intell. 2008 Mar;30(3):477-92. doi: 10.1109/TPAMI.2007.1178.

Performance enhancement for audio-visual speaker identification using dynamic facial muscle model.

Med Biol Eng Comput. 2006 Oct;44(10):919-30. doi: 10.1007/s11517-006-0106-5. Epub 2006 Sep 26.

EMG-based speech recognition using hidden markov models with global control variables.

IEEE Trans Biomed Eng. 2008 Mar;55(3):930-40. doi: 10.1109/TBME.2008.915658.

Sign language recognition by combining statistical DTW and independent classification.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):2040-6. doi: 10.1109/TPAMI.2008.123.

A model-based sequence similarity with application to handwritten word spotting.

IEEE Trans Pattern Anal Mach Intell. 2012 Nov;34(11):2108-20. doi: 10.1109/TPAMI.2012.25.
