BMJ Med. 2025 Aug 1;4(1):e001632. doi: 10.1136/bmjmed-2025-001632. eCollection 2025.
The Chatbot Assessment Reporting Tool (CHART) is a reporting guideline for studies that evaluate the performance of generative artificial intelligence (AI)-driven chatbots when summarising clinical evidence and providing health advice, referred to as chatbot health advice studies. CHART was developed in several phases, beginning with a comprehensive systematic review to identify variation in the conduct, reporting, and methods of chatbot health advice studies. Findings from the review were used to develop a draft checklist, which was revised through an international, multidisciplinary, modified, asynchronous Delphi consensus process involving 531 stakeholders, three synchronous panel consensus meetings involving 48 stakeholders, and subsequent pilot testing of the checklist. CHART includes 12 items and 39 subitems to promote transparent and comprehensive reporting of chatbot health advice studies. These include title (subitem 1a), abstract/summary (subitem 1b), background (subitems 2a,b), model identifiers (subitems 3a,b), model details (subitems 4a-c), prompt engineering (subitems 5a,b), query strategy (subitems 6a-d), performance evaluation (subitems 7a,b), sample size (subitem 8), data analysis (subitem 9a), results (subitems 10a-c), discussion (subitems 11a-c), disclosures (subitem 12a), funding (subitem 12b), ethics (subitem 12c), protocol (subitem 12d), and data availability (subitem 12e). The CHART checklist and corresponding diagram of the method were designed to support key stakeholders, including clinicians, researchers, editors, peer reviewers, and readers, in reporting, understanding, and interpreting the findings of chatbot health advice studies.