Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology - a recent scoping review.

Affiliations

Anatomical Pathology, Department of Pathology and Laboratory Medicine, Te Toka Tumai Auckland, Te Whatu Ora (Health New Zealand), Auckland, New Zealand.

Department of Pathology, Wexner Medical Center, The Ohio State University, Columbus, OH, USA.

Publication information

Diagn Pathol. 2024 Feb 27;19(1):43. doi: 10.1186/s13000-024-01464-7.

DOI:10.1186/s13000-024-01464-7
PMID:38414074
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10898121/
Abstract

BACKGROUND

The integration of large language models (LLMs) like ChatGPT in diagnostic medicine, with a focus on digital pathology, has garnered significant attention. However, understanding the challenges and barriers associated with the use of LLMs in this context is crucial for their successful implementation.

METHODS

A scoping review was conducted to explore the challenges and barriers of using LLMs in diagnostic medicine, with a focus on digital pathology. Electronic databases, including PubMed and Google Scholar, were searched comprehensively for relevant articles published within the past four years. The selected articles were critically analyzed to identify and summarize the challenges and barriers reported in the literature.

RESULTS

The scoping review identified several challenges and barriers associated with the use of LLMs in diagnostic medicine. These included limitations in contextual understanding and interpretability, biases in training data, ethical considerations, impact on healthcare professionals, and regulatory concerns. Challenges in contextual understanding and interpretability arise because LLMs lack a true understanding of medical concepts, are not explicitly trained on medical records selected by trained professionals, and operate as black boxes. Biases in training data pose a risk of perpetuating disparities and inaccuracies in diagnoses. Ethical considerations include patient privacy, data security, and responsible AI use. The integration of LLMs may impact healthcare professionals' autonomy and decision-making abilities. Regulatory concerns center on the need for guidelines and frameworks to ensure safe and ethical implementation.

CONCLUSION

The scoping review highlights the challenges and barriers of using LLMs in diagnostic medicine, with a focus on digital pathology. Understanding these challenges is essential for addressing the limitations and developing strategies to overcome barriers. It is critical for health professionals to be involved in the selection of data and the fine-tuning of the models. Further research, validation, and collaboration between AI developers, healthcare professionals, and regulatory bodies are necessary to ensure the responsible and effective integration of LLMs in diagnostic medicine.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f18c/10898121/60fe4c0fe308/13000_2024_1464_Fig1_HTML.jpg
