Self-Supervised Open-Set Speaker Recognition with Laguerre-Voronoi Descriptors.

Author information

Ohi Abu Quwsar, Gavrilova Marina L

Affiliation

Department of Computer Science, University of Calgary, Calgary, AB T2N1N4, Canada.

Publication information

Sensors (Basel). 2024 Mar 21;24(6):1996. doi: 10.3390/s24061996.

Abstract

Speaker recognition is a challenging problem in behavioral biometrics that has been rigorously investigated over the last decade. Although numerous supervised closed-set systems harness the power of deep neural networks, few studies have addressed open-set speaker recognition. This paper proposes a self-supervised open-set speaker recognition framework that leverages the geometric properties of the speaker distribution for accurate and robust speaker verification. The framework consists of a deep neural network that incorporates a wider view of temporal speech features, together with Laguerre-Voronoi diagram-based speech feature extraction. The network is trained with a specialized clustering criterion that requires only positive pairs during training. Experiments showed that the proposed system outperformed current state-of-the-art methods in open-set speaker recognition and cluster representation.
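For readers unfamiliar with the geometry mentioned in the abstract, the following is a minimal sketch of how a Laguerre-Voronoi (power) diagram assigns points to cells. It is not the authors' feature extraction pipeline, which the abstract does not detail. It only illustrates the standard definition: each site has a center c and a radius r, and a point x belongs to the site with the smallest power distance ||x - c||^2 - r^2. The function name laguerre_assign and the toy 2-D "embeddings" are hypothetical, chosen for illustration.

```python
import numpy as np

def laguerre_assign(points, centers, radii):
    """Assign each point to the Laguerre-Voronoi (power diagram) cell
    with the smallest power distance ||x - c||^2 - r^2."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    radii = np.asarray(radii, dtype=float)
    # Squared Euclidean distance from every point to every center: shape (n, k)
    sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Subtracting r^2 lets larger clusters claim more of the space
    power = sq_dist - radii[None, :] ** 2
    return power.argmin(axis=1)

# Hypothetical toy data: two cluster centers with different spreads
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
radii = np.array([3.0, 1.0])
points = np.array([[2.0, 0.0], [2.5, 0.0], [3.5, 0.0]])

labels = laguerre_assign(points, centers, radii)
# With all radii set to zero, the rule reduces to an ordinary Voronoi diagram
plain = laguerre_assign(points, centers, np.zeros(2))
print(labels, plain)
```

In this toy setup the point (2.5, 0) is closer in Euclidean terms to the second center, yet it falls in the first cell because that site has the larger radius. This is the kind of distribution-aware boundary that an ordinary Voronoi diagram cannot express.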

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e354/10975617/0c38d8d77142/sensors-24-01996-g001.jpg
