Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.
J Am Med Inform Assoc. 2018 Sep 1;25(9):1248-1258. doi: 10.1093/jamia/ocy072.
Our objective was to review the characteristics, current applications, and evaluation measures of conversational agents with unconstrained natural language input capabilities used for health-related purposes.
We searched PubMed, Embase, CINAHL, PsycInfo, and ACM Digital using a predefined search strategy. Studies were included if they focused on consumers or healthcare professionals; involved a conversational agent using any unconstrained natural language input; and reported evaluation measures resulting from user interaction with the system. Independent reviewers screened the studies, and inter-coder agreement was measured with Cohen's kappa.
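As an illustration of the agreement statistic named above, the following is a minimal sketch of Cohen's kappa for two reviewers' screening decisions. The function name, the labels, and the ten decisions are hypothetical and chosen only for the example; they are not data from this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for ten citations (illustrative only).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "exclude", "include", "exclude", "include"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the reviewers agree on 8 of 10 citations, but because both exclude most citations, agreement expected by chance is 0.58, so kappa is noticeably lower than the raw 0.80 agreement rate.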
The database search retrieved 1513 citations; 17 articles (14 different conversational agents) met the inclusion criteria. Dialogue management strategies were mostly finite-state and frame-based (6 and 7 conversational agents, respectively); agent-based strategies were present in one type of system. Two studies were randomized controlled trials (RCTs), 1 was cross-sectional, and the remainder were quasi-experimental. Half of the conversational agents supported consumers with health tasks such as self-care. The only RCT evaluating the efficacy of a conversational agent found a significant effect in reducing depression symptoms (effect size d = 0.44, p = .04). Patient safety was rarely evaluated in the included studies.
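The two most common dialogue management strategies can be contrasted with a small sketch. It assumes the usual textbook reading of the terms: a finite-state manager walks the user through a fixed sequence of questions, while a frame-based manager fills slots in whatever order the user supplies them and asks only for what is missing. The slot names and functions below are hypothetical and do not describe any system included in the review.

```python
SLOTS = ["symptom", "duration", "severity"]  # hypothetical slots for illustration


def finite_state_dialogue(answers):
    """Finite-state: questions are asked in a fixed order, one answer per step."""
    collected = {}
    for slot, answer in zip(SLOTS, answers):
        collected[slot] = answer
    return collected


def frame_update(frame, parsed_slots):
    """Frame-based: accept any known slots found in a single user utterance."""
    updated = dict(frame)
    for slot, value in parsed_slots.items():
        if slot in updated:
            updated[slot] = value
    return updated


def frame_next_slot(frame):
    """Return the first slot still missing, or None when the frame is complete."""
    for slot in SLOTS:
        if frame[slot] is None:
            return slot
    return None


# Finite-state: three turns, strictly in order.
fixed = finite_state_dialogue(["headache", "3 days", "moderate"])

# Frame-based: one utterance fills two slots, so only one question remains.
frame = {slot: None for slot in SLOTS}
frame = frame_update(frame, {"symptom": "headache", "duration": "3 days"})
next_slot = frame_next_slot(frame)
print(next_slot)
```

The parsing step that turns free text into `parsed_slots` is deliberately left out; in a real system it would be handled by a natural language understanding component.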
The use of conversational agents with unconstrained natural language input capabilities for health-related purposes is an emerging field of research in which the few published studies were mainly quasi-experimental and rarely evaluated efficacy or safety. Future studies would benefit from more robust experimental designs and standardized reporting.
The protocol for this systematic review is registered at PROSPERO with the number CRD42017065917.
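This study aimed to review the characteristics, current applications, and health-related evaluation measures of conversational agents with natural language input capabilities.
We searched the PubMed, Embase, CINAHL, PsycInfo, and ACM Digital databases using a predefined search strategy. Studies were included if they met the following criteria: they focused on consumers or healthcare professionals; they used a conversational agent with any natural language input; and they reported evaluation measures of user interaction with the system. Studies were screened by independent reviewers, and Cohen's kappa was used to assess coding agreement between the two reviewers.
The database search retrieved 1513 citations, of which 17 articles (covering 14 different conversational agents) met the inclusion criteria. Dialogue management strategies were mainly finite-state and frame-based (6 and 7 conversational agents, respectively); an agent-based strategy was present in only one type of system. Two studies were randomized controlled trials (RCTs), 1 was cross-sectional, and the rest were quasi-experimental. Half of the conversational agents supported consumers with health tasks such as self-care. The only RCT evaluating the efficacy of a conversational agent found a significant effect in reducing depression symptoms (effect size d = 0.44, p = 0.04). Patient safety was rarely evaluated in the included studies.
The use of conversational agents with natural language input capabilities in health-related areas is an emerging field of research; the studies published so far are mainly quasi-experimental and rarely evaluate efficacy or safety. Future research would benefit from more robust experimental designs and standardized reporting.
The protocol for this systematic review is registered at PROSPERO under registration number CRD42017065917.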