Accuracy of Information given by ChatGPT for Patients with Inflammatory Bowel Disease in Relation to ECCO Guidelines.

Affiliations

Department of Medicine, Division of Gastroenterology, Mater Dei Hospital, Msida, Malta.

Department of Gastroenterology, Barts Health NHS Trust, London, UK.

Publication Information

J Crohns Colitis. 2024 Aug 14;18(8):1215-1221. doi: 10.1093/ecco-jcc/jjae040.

Abstract

BACKGROUND

As acceptance of artificial intelligence [AI] platforms increases, more patients will consider these tools as sources of information. The ChatGPT architecture utilizes a neural network to process natural language, thus generating responses based on the context of input text. The accuracy and completeness of ChatGPT3.5 in the context of inflammatory bowel disease [IBD] remain unclear.

METHODS

In this prospective study, 38 questions worded by IBD patients were inputted into ChatGPT3.5. The following topics were covered: [1] Crohn's disease [CD], ulcerative colitis [UC], and malignancy; [2] maternal medicine; [3] infection and vaccination; and [4] complementary medicine. Responses given by ChatGPT were assessed for accuracy [1-completely incorrect to 5-completely correct] and completeness [3-point Likert scale; range 1-incomplete to 3-complete] by 14 expert gastroenterologists, in comparison with relevant ECCO guidelines.
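The scoring scheme described above [14 expert raters, accuracy on a 1-5 scale and completeness on a 3-point Likert scale, per question] lends itself to simple per-question aggregation into the medians and interquartile ranges reported in the Results. The sketch below is a minimal illustration of that kind of aggregation; the data values, column names, and use of pandas are assumptions for illustration only, not the authors' actual analysis pipeline.

```python
# Minimal sketch (assumed, not the authors' code): aggregating rater scores
# for each ChatGPT reply into median/IQR summaries per question.
import pandas as pd

# Hypothetical long-format ratings: one row per (question, rater) pair.
# Accuracy is scored 1 (completely incorrect) to 5 (completely correct);
# completeness uses a 3-point Likert scale (1 = incomplete, 3 = complete).
ratings = pd.DataFrame({
    "question_id":  [1, 1, 1, 2, 2, 2],   # placeholder IDs, not real study data
    "rater_id":     [1, 2, 3, 1, 2, 3],
    "accuracy":     [4, 5, 4, 3, 4, 3],
    "completeness": [3, 2, 3, 2, 2, 1],
})

def iqr(s: pd.Series) -> float:
    """Interquartile range: 75th minus 25th percentile."""
    return s.quantile(0.75) - s.quantile(0.25)

summary = ratings.groupby("question_id").agg(
    accuracy_median=("accuracy", "median"),
    accuracy_iqr=("accuracy", iqr),
    completeness_median=("completeness", "median"),
    completeness_iqr=("completeness", iqr),
)
print(summary)
```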

RESULTS

In terms of accuracy, most replies [84.2%] had a median score of ≥4 (interquartile range [IQR]: 2) and a mean score of 3.87 [SD: ±0.6]. For completeness, 34.2% of the replies had a median score of 3 and 55.3% had a median score of ≥2 but <3. Overall, the mean rating was 2.24 [SD: ±0.4, median: 2, IQR: 1]. Though groups 3 and 4 had a higher mean for both accuracy and completeness, there was no significant scoring variation between the four question groups [Kruskal-Wallis test p > 0.05]. However, statistical analysis for the different individual questions revealed a significant difference for both accuracy [p < 0.001] and completeness [p < 0.001]. The questions which rated the highest for both accuracy and completeness were related to smoking, while the lowest rating was related to screening for malignancy and vaccinations, especially in the context of immunosuppression and family planning.
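The group-level comparison above relies on the Kruskal-Wallis H-test, a non-parametric test for differences between independent samples. The sketch below shows how such a test could be run with SciPy; the score arrays are purely illustrative placeholders, not the study's data.

```python
# Minimal sketch (assumed): Kruskal-Wallis H-test across the four question
# groups, as used in the study to compare scores between topic areas.
from scipy.stats import kruskal

# Purely illustrative accuracy scores per question group (not the study data).
group_1 = [4, 4, 5, 3, 4]   # CD, UC, and malignancy
group_2 = [3, 4, 4, 4, 5]   # maternal medicine
group_3 = [4, 5, 4, 5, 4]   # infection and vaccination
group_4 = [4, 4, 5, 4, 5]   # complementary medicine

h_stat, p_value = kruskal(group_1, group_2, group_3, group_4)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")
# A p-value above 0.05 would indicate no significant difference between
# groups, matching the pattern reported in the abstract.
```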

CONCLUSION

This is the first study to demonstrate the capability of an AI-based system to provide accurate and comprehensive answers to real-world patient queries in IBD. AI systems may serve as a useful adjunct for patients, in addition to standard of care in clinics and validated patient information resources. However, responses in specialist areas may deviate from evidence-based guidance, and the replies need to give firmer advice.

