Evaluating the use of large language model in identifying top research questions in gastroenterology.

Affiliations

Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated to Tel Aviv University, Tel Aviv, Israel.

Hasso Plattner Institute for Digital Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Publication information

Sci Rep. 2023 Mar 13;13(1):4164. doi: 10.1038/s41598-023-31412-2.

DOI:10.1038/s41598-023-31412-2

PMID:36914821

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10011374/

Abstract

The field of gastroenterology (GI) is constantly evolving, and it is essential to pinpoint its most pressing and important research questions. This study aimed to evaluate the potential of ChatGPT for identifying research priorities in GI and to provide a starting point for further investigation. We queried ChatGPT on four key topics in GI: inflammatory bowel disease, the microbiome, artificial intelligence in GI, and advanced endoscopy in GI. A panel of experienced gastroenterologists separately reviewed and rated the generated research questions on a scale of 1-5, with 5 being the most important and relevant to current research in GI. ChatGPT generated relevant and clear research questions, yet the panel did not consider them original. On average, the questions were rated 3.6 ± 1.4, with inter-rater reliability ranging from 0.80 to 0.98 (p < 0.001). The mean grades for relevance, clarity, specificity, and originality were 4.9 ± 0.1, 4.6 ± 0.4, 3.1 ± 0.2, and 1.5 ± 0.4, respectively. Our study suggests that large language models (LLMs) may be a useful tool for identifying research priorities in GI, but more work is needed to improve the novelty of the generated research questions.
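The abstract reports panel scores as mean ± SD together with an inter-rater reliability coefficient, but it does not say which reliability statistic was used. As a minimal sketch, assuming an intraclass correlation coefficient of the ICC(2,1) form (two-way random effects, absolute agreement, single rater), the two summaries could be computed as below. The function names and the ratings matrix are hypothetical illustrations and are not data from the study.

```python
from statistics import mean, stdev


def mean_sd(ratings):
    """Return (mean, sample SD) over every score in the ratings matrix."""
    scores = [score for row in ratings for score in row]
    return mean(scores), stdev(scores)


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per rated question and one column per rater.
    This choice of ICC form is an assumption, not taken from the abstract.
    """
    n = len(ratings)      # number of rated questions
    k = len(ratings[0])   # number of raters
    scores = [score for row in ratings for score in row]
    grand = mean(scores)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    # Two-way ANOVA sums of squares: questions, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((s - grand) ** 2 for s in scores)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: 6 rated items (rows) by 4 raters (columns).
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

m, sd = mean_sd(demo)
print(f"{m:.1f} ± {sd:.1f}")    # 5.3 ± 2.7
print(round(icc2_1(demo), 2))   # 0.29
```

In the demo matrix the raters differ systematically in how high they score, which lowers an absolute-agreement coefficient. Values of 0.80 to 0.98, as reported in the abstract, would indicate that panel members scored the same questions very similarly.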

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0467/10011374/f273d3a33cd1/41598_2023_31412_Fig1_HTML.jpg

Similar articles

May ChatGPT be a tool producing medical information for common inflammatory bowel disease patients' questions? An evidence-controlled analysis.

World J Gastroenterol. 2024 Jan 7;30(1):17-33. doi: 10.3748/wjg.v30.i1.17.

Large language models: a primer and gastroenterology applications.

Therap Adv Gastroenterol. 2024 Feb 22;17:17562848241227031. doi: 10.1177/17562848241227031. eCollection 2024.

Evaluating the Utility of a Large Language Model in Answering Common Patients' Gastrointestinal Health-Related Questions: Are We There Yet?

Diagnostics (Basel). 2023 Jun 2;13(11):1950. doi: 10.3390/diagnostics13111950.

Comparative evaluation of a language model and human specialists in the application of European guidelines for the management of inflammatory bowel diseases and malignancies.

Endoscopy. 2024 Sep;56(9):706-709. doi: 10.1055/a-2289-5732. Epub 2024 Mar 18.

Musculoskeletal Injuries Are Commonly Reported Among Gastroenterology Trainees: Results of a National Survey.

Dig Dis Sci. 2019 Jun;64(6):1439-1447. doi: 10.1007/s10620-019-5463-7. Epub 2019 Jan 25.

A Novel Use of Artificial Intelligence to Examine Diversity and Hospital Performance.

J Surg Res. 2021 Apr;260:377-382. doi: 10.1016/j.jss.2020.07.081. Epub 2020 Oct 21.

ChatGPT: Chances and Challenges for Dentistry.

Compend Contin Educ Dent. 2023 Apr;44(4):220-224.

How is inflammatory bowel disease managed in Spanish gastroenterology departments? The results of the GESTIONA-EII survey.

Rev Esp Enferm Dig. 2016 Oct;108(10):618-626. doi: 10.17235/reed.2016.4410/2016.

Updates in artificial intelligence in gastroenterology endoscopy in 2020.

Curr Opin Gastroenterol. 2021 Sep 1;37(5):428-433. doi: 10.1097/MOG.0000000000000774.

Cited by

Medical Students' Perceptions of Large Language Models in Healthcare: A Multinational Cross-Sectional Study.

J Med Educ Curric Dev. 2025 May 21;12:23821205251331124. doi: 10.1177/23821205251331124. eCollection 2025 Jan-Dec.

Assessing ChatGPT-v4 for Guideline-Concordant Inflammatory Bowel Disease: Accuracy, Completeness, and Temporal Drift.

J Clin Med. 2025 Jun 29;14(13):4599. doi: 10.3390/jcm14134599.

Comparing ChatGPT3.5 and Bard recommendations for colonoscopy intervals: Bridging the gap in healthcare settings.

Endosc Int Open. 2025 Jun 17;13:a25865912. doi: 10.1055/a-2586-5912. eCollection 2025.

Public Versus Academic Discourse on ChatGPT in Health Care: Mixed Methods Study.

JMIR Infodemiology. 2025 Jun 23;5:e64509. doi: 10.2196/64509.

Assessing the feasibility of large language models to identify top research priorities in enhanced external counterpulsation.

PLoS One. 2025 Apr 15;20(4):e0305442. doi: 10.1371/journal.pone.0305442. eCollection 2025.

Chat GPT vs an experienced ophthalmologist: evaluating chatbot writing performance in ophthalmology.

Eye (Lond). 2025 Apr 1. doi: 10.1038/s41433-025-03779-1.

Emerging applications of NLP and large language models in gastroenterology and hepatology: a systematic review.

Front Med (Lausanne). 2025 Jan 22;11:1512824. doi: 10.3389/fmed.2024.1512824. eCollection 2024.

Evaluating ChatGPT-4 for the Interpretation of Images from Several Diagnostic Techniques in Gastroenterology.

J Clin Med. 2025 Jan 17;14(2):572. doi: 10.3390/jcm14020572.

Applications and Future Prospects of Medical LLMs: A Survey Based on the M-KAT Conceptual Framework.

J Med Syst. 2024 Dec 27;48(1):112. doi: 10.1007/s10916-024-02132-5.

Large Language Models in Gastroenterology: Systematic Review.

J Med Internet Res. 2024 Dec 20;26:e66648. doi: 10.2196/66648.
