
Applicability of Online Chat-Based Artificial Intelligence Models to Colorectal Cancer Screening.

Affiliations

Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA.

Department of Biostatistics and Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Publication Information

Dig Dis Sci. 2024 Mar;69(3):791-797. doi: 10.1007/s10620-024-08274-3. Epub 2024 Jan 24.

Abstract

BACKGROUND

Over the past year, studies have shown potential for the applicability of ChatGPT in various medical specialties, including cardiology and oncology. However, the application of ChatGPT and other online chat-based AI models to patient education and patient-physician communication on colorectal cancer screening has not been critically evaluated, which is what we aimed to do in this study.

METHODS

We posed 15 questions on important colorectal cancer screening concepts and 5 common questions asked by patients to the 3 most commonly used freely available artificial intelligence (AI) models. The responses provided by the AI models were graded for appropriateness and reliability using American College of Gastroenterology guidelines. Each response provided by an AI model was graded as reliably appropriate (RA), reliably inappropriate (RI), or unreliable. Grader assessments were validated by the joint probability of agreement for two raters.
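The joint probability of agreement used to validate the grader assessments is simple percent agreement: the fraction of responses on which both raters assigned the same grade. A minimal sketch of that calculation is below; the grade labels RA/RI/U follow the scheme above, but the rating data are illustrative, not the study's actual grades.

```python
def joint_agreement(rater_a, rater_b):
    """Fraction of items on which two raters gave identical grades
    (joint probability of agreement / percent agreement)."""
    if len(rater_a) != len(rater_b):
        raise ValueError("rater lists must be the same length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Illustrative grades for 10 AI responses (not from the study):
rater_1 = ["RA", "RA", "RI", "RA", "U", "RA", "RA", "RI", "RA", "RA"]
rater_2 = ["RA", "RA", "RI", "RA", "RA", "RA", "RA", "RI", "RA", "U"]

print(joint_agreement(rater_1, rater_2))  # 8 of 10 grades match -> 0.8
```

Note that percent agreement does not correct for chance agreement, unlike statistics such as Cohen's kappa; it is simply the proportion of matching grades.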

RESULTS

ChatGPT and YouChat™ provided RA responses to the questions posed more often than BingChat. There were two questions to which more than one AI model provided unreliable responses. ChatGPT did not provide references. BingChat misinterpreted some of the information it referenced. The CRC screening age provided by YouChat™ was not consistently up-to-date. Inter-rater reliability for the 2 raters was 89.2%.

CONCLUSION

Most responses provided by AI models on CRC screening were appropriate. Some limitations exist in their ability to correctly interpret medical literature and provide updated information in answering queries. Patients should consult their physicians for context on the recommendations made by these AI models.

