Roca Surya, Rosset Sophie, García José, Alesanco Álvaro
Aragón Institute of Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain.
Laboratoire Interdisciplinaire des Sciences du Numérique, CNRS, Université Paris-Saclay, 91405 Orsay, France.
Sensors (Basel). 2022 Mar 18;22(6):2364. doi: 10.3390/s22062364.
This study evaluates the impact of slot tagging and training-data length on joint natural language understanding (NLU) models for medication management scenarios using chatbots in Spanish. We define the intents (the purposes of the sentences) for medication management scenarios and two types of slot tags. For training, we generated four datasets combining long/short sentences with long/short slots; for testing, we collected data from real user interactions with a chatbot. For the comparative analysis, we chose six joint NLU models from the literature (SlotRefine, the stack-propagation framework, the SF-ID network, capsule-NLU, slot-gated modeling, and a joint SLU-LM model). The results show that the best performance (a sentence-level semantic accuracy of 68.6%, an F1-score of 76.4% for slot filling, and an accuracy of 79.3% for intent detection) is achieved using short sentences and short slots. Our results suggest that joint NLU models trained with short slots outperform those trained with long slots on the slot filling task. They also indicate that short slots may be the better choice for a dialog system because of their simplicity. Importantly, the work demonstrates that the performance of joint NLU models can be improved by selecting the slot configuration appropriate to the usage scenario.
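As a hedged illustration of the evaluation setup (not code from the paper), joint NLU models emit one intent per sentence plus a BIO tag per token for slot filling, and sentence-level semantic accuracy counts a prediction as correct only when both the intent and every slot tag match. A minimal sketch in Python, with a hypothetical Spanish medication sentence and made-up intent and slot names:

```python
# Sketch of joint NLU outputs and sentence-level semantic accuracy.
# The intent name, slot labels, and example sentence are hypothetical,
# not taken from the paper's datasets.

def semantic_accuracy(preds, golds):
    """Fraction of sentences where intent AND all slot tags match exactly."""
    hits = sum(
        1 for p, g in zip(preds, golds)
        if p["intent"] == g["intent"] and p["slots"] == g["slots"]
    )
    return hits / len(golds)

# Tokens: "tomo paracetamol cada ocho horas"
gold = [{"intent": "add_medication",
         "slots": ["O", "B-med", "B-freq", "I-freq", "I-freq"]}]
pred = [{"intent": "add_medication",
         "slots": ["O", "B-med", "B-freq", "I-freq", "I-freq"]}]

print(semantic_accuracy(pred, gold))  # 1.0
```

A "short slot" configuration in the paper's sense would tag only the minimal span (e.g., just the medication name), while a "long slot" would cover a longer phrase with one label; the metric itself is the same either way.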