Institute of Optoelectronics, Military University of Technology, 2 Kaliski Street, 00-908 Warsaw, Poland.
BITRES Sp. z o.o., 9/2 Chałubiński Street, 02-004 Warsaw, Poland.
Sensors (Basel). 2022 Dec 1;22(23):9370. doi: 10.3390/s22239370.
This article presents the Automatic Speaker Recognition System (ASR System), which successfully handles problems such as identification within an open set of speakers and speaker verification under difficult recording conditions similar to those of telephone transmission. The article describes in full the architecture of the ASR System's internal processing modules. The proposed system was compared in detail with competing systems and achieved better speaker identification and verification results on a known, certified voice dataset. The ASR System owes this to its dual use of genetic algorithms, both in the feature selection process and in the optimization of the system's internal parameters. The proprietary feature generation and the corresponding classification process based on Gaussian mixture models also contributed. The result is a system that makes an important contribution to the state of the art in speaker recognition for telephone transmission applications with known speech coding standards.
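To make the idea of open-set identification with Gaussian mixture models concrete, here is a minimal sketch, not the ASR System's actual method. It assumes diagonal-covariance mixtures, a hypothetical background model used to normalize scores, and a hand-picked decision threshold; none of these details are given in the abstract. All function and parameter names are illustrative.

```python
import math

# A model is a tuple (weights, means, variances) describing a
# diagonal-covariance Gaussian mixture. This layout is an assumption
# made for illustration only.

def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components, for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def average_loglik(frames, model):
    """Mean per-frame log-likelihood of an utterance under one model."""
    weights, means, variances = model
    total = sum(diag_gmm_loglik(f, weights, means, variances) for f in frames)
    return total / len(frames)

def identify_open_set(frames, speaker_models, background_model, threshold):
    """Return (speaker, score), or (None, score) if no enrolled speaker fits.

    The score is a log-likelihood ratio against the (hypothetical)
    background model. Rejecting when the best score falls below the
    threshold is what makes the decision open-set.
    """
    background = average_loglik(frames, background_model)
    best_name, best_score = None, -math.inf
    for name, model in speaker_models.items():
        score = average_loglik(frames, model) - background
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Toy two-dimensional models with a single mixture component each.
speakers = {
    "A": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "B": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
background = ([1.0], [[2.5, 2.5]], [[10.0, 10.0]])
```

With these toy models, an utterance whose frames lie near the origin is attributed to speaker "A", while frames far from both enrolled speakers are rejected as unknown. In a real system the features would be selected from speech (the abstract mentions genetic algorithms for this step) and the threshold would be tuned on held-out data.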