Survey on evaluation methods for dialogue systems.

Authors

Deriu Jan, Rodrigo Alvaro, Otegi Arantxa, Echegoyen Guillermo, Rosset Sophie, Agirre Eneko, Cieliebak Mark

Affiliations

Zurich University of Applied Sciences (ZHAW), Steinberggasse 13, 8400 Winterthur, Switzerland.

NLP & IR Group, UNED, C/Juan del Rosal 16, 28040 Madrid, Spain.

Publication information

Artif Intell Rev. 2021;54(1):755-810. doi: 10.1007/s10462-020-09866-x. Epub 2020 Jun 25.

DOI: 10.1007/s10462-020-09866-x
PMID: 33505103
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7817575/

Abstract

In this paper, we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation, in and of itself, is a crucial part during the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods which allow a reduction in involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented, conversational, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then present the evaluation methods regarding that class.


Similar articles

1
Survey on evaluation methods for dialogue systems.
Artif Intell Rev. 2021;54(1):755-810. doi: 10.1007/s10462-020-09866-x. Epub 2020 Jun 25.
2
Conversational ontology operator: patient-centric vaccine dialogue management engine for spoken conversational agents.
BMC Med Inform Decis Mak. 2020 Dec 14;20(Suppl 4):259. doi: 10.1186/s12911-020-01267-y.
3
Dialogue Systems and Conversational Agents for Patients with Dementia: The Human-Robot Interaction.
Rejuvenation Res. 2019 Apr;22(2):109-120. doi: 10.1089/rej.2018.2075. Epub 2018 Sep 20.
4
An Ontology-Powered Dialogue Engine For Patient Communication of Vaccines.
CEUR Workshop Proc. 2019 Oct;2427:24-30.
5
Socio-conversational systems: Three challenges at the crossroads of fields.
Front Robot AI. 2022 Dec 15;9:937825. doi: 10.3389/frobt.2022.937825. eCollection 2022.
6
Editorial: Conversational AI.
Front Artif Intell. 2023 May 10;6:1203910. doi: 10.3389/frai.2023.1203910. eCollection 2023.
7
How to Evaluate Health Applications with Conversational User Interface?
Stud Health Technol Inform. 2020 Jun 16;270:976-980. doi: 10.3233/SHTI200307.
8
Conversational evidence in therapeutic dialogue.
J Marital Fam Ther. 2008 Jul;34(3):388-405. doi: 10.1111/j.1752-0606.2008.00079.x.
9
Technical Aspects of Developing Chatbots for Medical Applications: Scoping Review.
J Med Internet Res. 2020 Dec 18;22(12):e19127. doi: 10.2196/19127.
10
A study of interactive robot architecture through the practical implementation of conversational android.
Front Robot AI. 2022 Oct 11;9:905030. doi: 10.3389/frobt.2022.905030. eCollection 2022.

Cited by

1
Virtual Patients Using Large Language Models: Scalable, Contextualized Simulation of Clinician-Patient Dialogue With Feedback.
J Med Internet Res. 2025 Apr 4;27:e68486. doi: 10.2196/68486.
2
A Case Study on Assessing AI Assistant Competence in Narrative Interviews.
F1000Res. 2024 Oct 4;13:601. doi: 10.12688/f1000research.151952.2. eCollection 2024.
3
The implementation of chatbot-mediated immediacy for synchronous communication in an online chemistry course.
Educ Inf Technol (Dordr). 2023 Feb 3:1-26. doi: 10.1007/s10639-023-11602-1.
4
A systematic literature review on persuasive technology at the workplace.
Patterns (N Y). 2022 Aug 12;3(8):100545. doi: 10.1016/j.patter.2022.100545.
5
Understanding Is a Process.
Front Syst Neurosci. 2022 Mar 31;16:800280. doi: 10.3389/fnsys.2022.800280. eCollection 2022.
6
Conversational Agents: Goals, Technologies, Vision and Challenges.
Sensors (Basel). 2021 Dec 17;21(24):8448. doi: 10.3390/s21248448.
7
Assessing Open-Ended Human-Computer Collaboration Systems: Applying a Hallmarks Approach.
Front Artif Intell. 2021 Oct 18;4:670009. doi: 10.3389/frai.2021.670009. eCollection 2021.
8
A dynamic goal adapted task oriented dialogue agent.
PLoS One. 2021 Apr 1;16(4):e0249030. doi: 10.1371/journal.pone.0249030. eCollection 2021.

References

1
A passage retrieval method based on probabilistic information retrieval model and UMLS concepts in biomedical question answering.
J Biomed Inform. 2017 Apr;68:96-103. doi: 10.1016/j.jbi.2017.03.001. Epub 2017 Mar 7.
2
Long short-term memory.
Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.