School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China.
Neural Netw. 2021 Jul;139:237-245. doi: 10.1016/j.neunet.2021.03.012. Epub 2021 Mar 18.
Existing keyword spotting (KWS) techniques recognize pre-defined keywords well but achieve poor accuracy on user-defined keywords. In practice, users often need to define their own keywords. To address this problem, this work proposes three techniques: incremental training with a revised loss function, data augmentation, and fine-grained training. Together they improve accuracy on user-defined keywords while maintaining high accuracy on pre-defined keywords. The proposed techniques are applied to a classical KWS model (cnn-trad-fpool3) and a state-of-the-art KWS model (res15). Experimental results show that the proposed techniques achieve higher recognition accuracy on user-defined keywords than several existing methods. With the proposed techniques, the recognition accuracy for user-defined keywords on cnn-trad-fpool3 and res15 improves significantly, by 21.78% and 24.42%, respectively.