Irfan Bahar, Hellou Mehdi, Belpaeme Tony
Centre for Robotics and Neural Systems, University of Plymouth, Plymouth, United Kingdom.
Polytech Sorbonne, Paris, France.
Front Robot AI. 2021 Sep 28;8:676814. doi: 10.3389/frobt.2021.676814. eCollection 2021.
While earlier research in human-robot interaction predominantly uses rule-based architectures for natural language interaction, these approaches are not flexible enough for long-term interactions in the real world due to the large variation in user utterances. In contrast, data-driven approaches map the user input to the agent output directly and hence provide more flexibility with these variations without requiring any set of rules. However, data-driven approaches are generally applied to single dialogue exchanges with a user and do not build up a memory over long-term conversation with different users, whereas long-term interactions require remembering users and their preferences incrementally and continuously and recalling previous interactions with users to adapt and personalise the interactions, known as the lifelong learning problem. In addition, it is desirable to learn user preferences from a few samples of interactions (i.e., few-shot learning). These are known to be challenging problems in machine learning, while they are trivial for rule-based approaches, creating a trade-off between flexibility and robustness. Correspondingly, in this work, we present the text-based Barista Datasets generated to evaluate the potential of data-driven approaches in generic and personalised long-term human-robot interactions with simulated real-world problems, such as recognition errors, incorrect recalls and changes to the user preferences. Based on these datasets, we explore the performance and the underlying inaccuracies of the state-of-the-art data-driven dialogue models that are strong baselines in other domains of personalisation in single interactions, namely Supervised Embeddings, Sequence-to-Sequence, End-to-End Memory Network, Key-Value Memory Network, and Generative Profile Memory Network.
The experiments show that while data-driven approaches are suitable for generic task-oriented dialogue and real-time interactions, no model performs sufficiently well to be deployed in personalised long-term interactions in the real world, because of their inability to learn and use new identities, and their poor performance in recalling user-related data.