SAIF: A Correction-Detection Deep-Learning Architecture for Personal Assistants.

Affiliation

Data Science Center, Ariel University, Ariel 40700, Israel.

Publication information

Sensors (Basel). 2020 Sep 29;20(19):5577. doi: 10.3390/s20195577.

DOI:10.3390/s20195577

PMID:33003380

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7582502/

Abstract

Intelligent agents that can interact with users using natural language are becoming increasingly common. Sometimes an intelligent agent may not correctly understand a user command or may not perform it properly. In such cases, the user might try a second time by giving the agent another, slightly different command. Giving an agent the ability to detect such user corrections might help it fix its own mistakes and avoid making them in the future. In this work, we consider the problem of automatically detecting user corrections using deep learning. We develop a multimodal architecture called SAIF, which detects such user corrections, taking as inputs the user's voice commands as well as their transcripts. Voice inputs allow SAIF to take advantage of sound cues, such as tone, speed, and word emphasis. In addition to sound cues, our model uses transcripts to determine whether a command is a correction to the previous command. Our model also obtains internal input from the agent, indicating whether the previous command was executed successfully or not. Finally, we release a unique dataset in which users interacted with an intelligent agent assistant, by giving it commands. This dataset includes labels on pairs of consecutive commands, which indicate whether the latter command is in fact a correction of the former command. We show that SAIF outperforms current state-of-the-art methods on this dataset.
