
Comparison of ChatGPT and DeepSeek large language models in the diagnosis of pericarditis.

Authors

Goyal Aman, Sulaiman Samia Aziz, Alaarag Abdallah, Hoshan Waseem, Goyal Priya, Shah Viraj, Daoud Mohamed, Mahalwar Gauranga, Sheikh Abu Baker

Affiliations

Department of Internal Medicine, Cleveland Clinic Foundation, Cleveland, OH 44195, United States.

School of Medicine, The University of Jordan, Amman 11942, Jordan.

Publication details

World J Cardiol. 2025 Aug 26;17(8):110489. doi: 10.4330/wjc.v17.i8.110489.

DOI: 10.4330/wjc.v17.i8.110489
PMID: 40949931
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12426987/
Abstract

BACKGROUND

The integration of sophisticated large language models (LLMs) into healthcare has recently garnered significant attention because these models use deep learning techniques to process vast datasets and generate contextually accurate, human-like responses. They have previously been applied in medical diagnostics, for example in the evaluation of oral lesions. Given the high rate of missed diagnoses in pericarditis, LLMs may support clinicians in generating differential diagnoses, particularly in atypical cases where risk stratification and early identification are critical to preventing serious complications such as constrictive pericarditis and pericardial tamponade.

AIM

To compare the accuracy of two LLMs as risk stratification tools in assisting the diagnosis of pericarditis.

METHODS

A PubMed search was conducted using the keyword "pericarditis" with the "case reports" filter applied, and data from relevant cases were extracted. Inclusion criteria were English-language reports involving patients aged 18 years or older with a confirmed diagnosis of acute pericarditis. The diagnostic capabilities of ChatGPT o1 and DeepThink-R1 were assessed on two outcomes: whether pericarditis was included in the top three differential diagnoses, and whether it was given as the sole provisional diagnosis. For each outcome, each case was classified as "yes" or "no" according to whether pericarditis was included.
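The yes/no classification described in the methods can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how responses were judged, so the `CaseResult` structure, the `score_case` function, and the simple substring match on "pericarditis" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Hypothetical container for one model response to one case report."""
    differentials: list[str]  # ranked differential diagnoses, best first
    provisional: str          # the single provisional diagnosis


def mentions_pericarditis(text: str) -> bool:
    # Assumption: a case-insensitive substring match stands in for the
    # reviewers' judgement. It would also match terms such as
    # "myopericarditis", which a human reviewer might score differently.
    return "pericarditis" in text.lower()


def score_case(result: CaseResult, top_k: int = 3) -> dict[str, str]:
    """Classify one case as "yes" or "no" on each of the two outcomes."""
    in_top_k = any(mentions_pericarditis(d) for d in result.differentials[:top_k])
    is_provisional = mentions_pericarditis(result.provisional)
    return {
        "differential": "yes" if in_top_k else "no",
        "provisional": "yes" if is_provisional else "no",
    }


# Invented example response, for illustration only (not data from the study).
example = CaseResult(
    differentials=["Acute coronary syndrome", "Pulmonary embolism", "Acute pericarditis"],
    provisional="Acute coronary syndrome",
)
print(score_case(example))
```

In this invented example, pericarditis appears third in the differential list but is not the provisional diagnosis, so the case would count toward the first outcome only.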

RESULTS

From the initial search, 220 studies were identified, of which 16 case reports met the inclusion criteria. In assessing risk stratification for acute pericarditis, ChatGPT o1 correctly identified the condition in 10 of 16 cases (62.5%) in the differential diagnosis and in 8 of 16 cases (50.0%) as the provisional diagnosis. DeepThink-R1 identified it in 8 of 16 cases (50.0%) and 6 of 16 cases (37.5%), respectively. ChatGPT o1 demonstrated higher accuracy than DeepThink-R1 in identifying pericarditis.
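The reported percentages follow directly from the case counts. The short check below recomputes them from the figures stated in the results; the dictionary layout and the `percent` helper are illustrative names, not part of the study.

```python
# Counts taken from the results: cases (out of 16) in which each model
# included pericarditis in the differential or gave it as the provisional diagnosis.
TOTAL_CASES = 16
counts = {
    "ChatGPT o1": {"differential": 10, "provisional": 8},
    "DeepThink-R1": {"differential": 8, "provisional": 6},
}


def percent(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


for model, outcomes in counts.items():
    for outcome, hits in outcomes.items():
        print(f"{model}, {outcome}: {hits}/{TOTAL_CASES} = {percent(hits, TOTAL_CASES)}%")
```

On both outcomes the gap between the two models is 2 of 16 cases, or 12.5 percentage points.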

CONCLUSION

Further research with larger sample sizes and optimized prompt engineering is warranted to improve diagnostic accuracy, particularly in atypical presentations.


