Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihta, Bihar, India.
PLoS One. 2023 Jan 6;18(1):e0278323. doi: 10.1371/journal.pone.0278323. eCollection 2023.
In a task-oriented dialogue setting, a user's mood and demands can change during an ongoing dialogue, which may lead to a non-informative conversation or result in conversation drop-off. To handle such scenarios, a conversational agent should be able to learn the user's behaviour online and form informative, empathetic and interactive responses. To incorporate these three aspects, we propose a novel end-to-end dialogue system, GenPADS. First, we build and train two models, viz. a politeness classifier to extract the politeness information present in the user's and agent's utterances, and a generation model (G) to generate varied but semantically correct responses. We then incorporate both models in a reinforcement learning (RL) setting, using two different politeness-oriented reward algorithms to adapt and generate polite responses. To train our politeness classifier, we annotate the recently released Taskmaster dataset with four fine-grained classes depicting politeness and impoliteness. Further, to train our generator model, we prepare a GenDD dataset from the same Taskmaster dataset. Lastly, we train GenPADS and perform automatic and human evaluation by building seven different user simulators. Detailed analysis reveals that GenPADS performs better than the two considered baselines, viz. a transformer-based seq2seq generator model for the user's and agent's utterances and a retrieval-based politeness adaptive dialogue system (PADS).
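As a rough illustration of how a politeness classifier's output could serve as a reward signal over generated responses, the sketch below scores candidate agent replies against the user's utterance. Everything in it is an assumption made for illustration: the 0 to 3 politeness levels, the keyword-based `classify_politeness` stub, the reward values and the `politeness_reward` and `pick_response` functions are hypothetical, and they are not the classifier, label set or reward algorithms used in GenPADS.

```python
from typing import List

# Hypothetical cue lists; a trained classifier would replace this stub.
POLITE_CUES = ("please", "thank", "sorry", "happy to")
IMPOLITE_CUES = ("shut up", "stupid", "whatever")


def classify_politeness(utterance: str) -> int:
    """Toy stand-in for a four-class politeness classifier.

    Returns an assumed level from 0 (most impolite) to 3 (most polite).
    """
    text = utterance.lower()
    polite_hits = sum(cue in text for cue in POLITE_CUES)
    impolite_hits = sum(cue in text for cue in IMPOLITE_CUES)
    # Shift so that a cue-free utterance lands on level 1, then clamp to 0..3.
    return max(0, min(3, polite_hits - impolite_hits + 1))


def politeness_reward(user_utterance: str, agent_utterance: str) -> float:
    """Assumed reward: favour polite replies, with a bonus when the agent
    is at least as polite as the user and a penalty otherwise."""
    user_level = classify_politeness(user_utterance)
    agent_level = classify_politeness(agent_utterance)
    adaptation_term = 0.5 if agent_level >= user_level else -0.5
    return agent_level / 3.0 + adaptation_term


def pick_response(user_utterance: str, candidates: List[str]) -> str:
    """Greedy selection of the candidate with the highest reward."""
    return max(candidates, key=lambda c: politeness_reward(user_utterance, c))


if __name__ == "__main__":
    user = "This is stupid, just book the table."
    candidates = [
        "Booked.",
        "Sorry about the trouble, I am happy to book that for you. Thank you!",
    ]
    print(pick_response(user, candidates))
```

In an RL setting the reward would typically be used to update the generator's policy rather than to pick greedily among fixed candidates; the greedy selection here only keeps the sketch short.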