Evaluating the Performance of Speaker Recognition Solutions in E-Commerce Applications.

Affiliation

Department for Information Technology, Faculty of Organizational Sciences, University of Belgrade, 11000 Belgrade, Serbia.

Publication information

Sensors (Basel). 2021 Sep 17;21(18):6231. doi: 10.3390/s21186231.

DOI:10.3390/s21186231

PMID:34577440

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8473232/

Abstract

Two important tasks in many e-commerce applications are verifying the identity of the user accessing the system and determining the level of rights that the user has to access and manipulate the system's resources. The performance of these tasks depends directly on how reliably the user's identity can be established. The main research focus of this paper is a user identity verification approach based on voice recognition techniques. The paper presents research results on the use of open-source speaker recognition technologies in e-commerce applications, with an emphasis on evaluating the performance of the algorithms they use. Four open-source speaker recognition solutions (SPEAR, MARF, ALIZE, and HTK) were evaluated under mismatched conditions between the training and recognition phases. In practice, mismatched conditions arise from varying lengths of spoken sentences, different types of recording devices, and the use of different languages in the training and recognition phases. All tests in this research were performed under laboratory conditions using a specially designed framework for multimodal biometrics. The results are consistent with recent research findings indicating that i-vectors and solutions based on probabilistic linear discriminant analysis (PLDA) remain the dominant speaker recognition approaches for text-independent tasks.
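The abstract does not state which metrics were used to compare the four solutions. As an illustration only, speaker verification systems are commonly compared by their false acceptance rate (FAR), false rejection rate (FRR), and the equal error rate (EER) at which the two coincide. The sketch below assumes each system outputs a similarity score per trial; the function names and the scores are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def error_rates(genuine: List[float], impostor: List[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when a trial is accepted if score >= threshold."""
    # FAR: fraction of impostor trials that are wrongly accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # FRR: fraction of genuine trials that are wrongly rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: List[float],
                     impostor: List[float]) -> Tuple[float, float]:
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that minimizes |FAR - FRR|.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores: higher means "more likely the claimed speaker".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

Under mismatched conditions of the kind described above (sentence length, recording device, language), one would expect the genuine and impostor score distributions to overlap more, which appears as a higher EER for the same system.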
