RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, Australia.
STARS Education and Research Alliance, Surgical Treatment and Rehabilitation Service (STARS), The University of Queensland and Metro North Health, Brisbane, QLD, Australia.
J Am Med Inform Assoc. 2024 Feb 16;31(3):746-761. doi: 10.1093/jamia/ocad222.
Conversational agents (CAs) with emerging artificial intelligence present new opportunities to assist in health interventions but are difficult to evaluate, deterring their real-world application. We aimed to synthesize existing evidence and knowledge and outline an evaluation framework for CA interventions.
We conducted a systematic scoping review to investigate designs and outcome measures used in the studies that evaluated CAs for health interventions. We then nested the results into an overarching digital health framework proposed by the World Health Organization (WHO).
The review included 81 studies evaluating CAs in experimental trials (n = 59), observational trials (n = 15), and other research designs (n = 7). Most studies (n = 72, 89%) were published in the past 5 years. The proposed CA-evaluation framework includes 4 evaluation stages: (1) feasibility/usability, (2) efficacy, (3) effectiveness, and (4) implementation, aligning with WHO's stepwise evaluation strategy. Across these stages, this article presents the essential evidence of different study designs (n = 8), sample sizes, and main evaluation categories (n = 7) with subcategories (n = 40). The main evaluation categories included (1) functionality, (2) safety and information quality, (3) user experience, (4) clinical and health outcomes, (5) costs and cost benefits, (6) usage, adherence, and uptake, and (7) user characteristics for implementation research. Furthermore, the framework highlighted the essential evaluation areas (potential primary outcomes) and gaps across the evaluation stages.
This review presents a new framework with practical design details to support the evaluation of CA interventions in healthcare research.
The review protocol was registered on the Open Science Framework (https://osf.io/9hq2v) on March 22, 2021.