
Applicability of Online Chat-Based Artificial Intelligence Models to Colorectal Cancer Screening.

Affiliations

Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA.

Department of Biostatistics and Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Publication Information

Dig Dis Sci. 2024 Mar;69(3):791-797. doi: 10.1007/s10620-024-08274-3. Epub 2024 Jan 24.

Abstract

BACKGROUND

Over the past year, studies have shown potential for the applicability of ChatGPT in various medical specialties, including cardiology and oncology. However, the application of ChatGPT and other online chat-based AI models to patient education and patient-physician communication on colorectal cancer screening has not been critically evaluated, which is what we aimed to do in this study.

METHODS

We posed 15 questions on important colorectal cancer screening concepts and 5 common questions asked by patients to the 3 most commonly used freely available artificial intelligence (AI) models. The responses provided by the AI models were graded for appropriateness and reliability using American College of Gastroenterology guidelines. Each response provided by an AI model was graded as reliably appropriate (RA), reliably inappropriate (RI), or unreliable. Grader assessments were validated by the joint probability of agreement for two raters.
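The joint probability of agreement used to validate the grader assessments is simple percent agreement: the fraction of responses on which both raters assigned the same grade. A minimal sketch of that calculation is below; the grade labels RA/RI/U follow the scheme above, but the rating data are illustrative, not the study's actual grades.

```python
def joint_agreement(rater_a, rater_b):
    """Fraction of items on which two raters gave identical grades
    (joint probability of agreement / percent agreement)."""
    if len(rater_a) != len(rater_b):
        raise ValueError("rater lists must be the same length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Illustrative grades for 10 AI responses (not from the study):
rater_1 = ["RA", "RA", "RI", "RA", "U", "RA", "RA", "RI", "RA", "RA"]
rater_2 = ["RA", "RA", "RI", "RA", "RA", "RA", "RA", "RI", "RA", "U"]

print(joint_agreement(rater_1, rater_2))  # 8 of 10 grades match -> 0.8
```

Note that percent agreement does not correct for chance agreement, unlike statistics such as Cohen's kappa; it is simply the proportion of matching grades.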

RESULTS

ChatGPT and YouChat™ provided RA responses to the questions posed more often than BingChat. There were two questions to which more than one AI model provided unreliable responses. ChatGPT did not provide references. BingChat misinterpreted some of the information it referenced. The CRC screening age provided by YouChat™ was not consistently up-to-date. Inter-rater reliability for the 2 raters was 89.2%.

CONCLUSION

Most responses provided by AI models on CRC screening were appropriate. Some limitations exist in their ability to correctly interpret medical literature and provide updated information in answering queries. Patients should consult their physicians for context on the recommendations made by these AI models.

