Diagnostic Performance of a Large Language Model for Determining the Cause of Death: A Comparative Analysis of Clinical History, Postmortem Computed Tomography Findings, and Their Integration.

Authors

Ishida Masanori, Gonoi Wataru, Nyunoya Keisuke, Abe Hiroyuki, Shirota Go, Okimoto Naomasa, Fujimoto Kotaro, Kurokawa Mariko, Katayama Akira, Takahashi-Mizuki Masumi, Inui Shohei, Saito Kazuhiro, Ushiku Tetsuo, Abe Osamu

Affiliations

Radiology, Tokyo Medical University Hospital, Tokyo, JPN.

Radiology, The University of Tokyo Hospital, Tokyo, JPN.

Publication information

Cureus. 2025 May 8;17(5):e83721. doi: 10.7759/cureus.83721. eCollection 2025 May.

DOI: 10.7759/cureus.83721
PMID: 40486463
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12145502/
Abstract

INTRODUCTION

This study evaluates the diagnostic performance of a large language model (LLM) in determining causes of death by comparing three different information sources.

METHODS

A total of 150 consecutive adult in-hospital cadavers underwent postmortem CT and pathological autopsy (2009-2013). The diagnostic accuracy of Claude 3.5 Sonnet (Anthropic, San Francisco, California) was evaluated in determining both underlying and immediate causes of death using three different information sources (clinical history alone, postmortem CT findings alone as documented by radiologists in their reports, and their integration). For each case, the LLM provided a primary diagnosis and two differential diagnoses. The autopsy result was used as the reference standard to assess accuracy.

RESULTS

For underlying causes, the integration of both sources achieved significantly higher accuracy (78.0%) compared with the clinical history alone (69.3%) or the CT findings alone (42.0%) (p<0.001). When considering either primary or differential diagnoses, the accuracy reached 84.7% with integrated sources, 78.0% with clinical history alone, and 58.7% with CT findings alone. For immediate causes, the integrated approach showed higher accuracy in the primary diagnosis (61.3%) than the clinical history alone (52.0%) and CT findings alone (46.7%) (p<0.001). Disease-specific diagnostic accuracy analyses revealed marked variations, with hematologic malignancies showing the most significant differences among information sources (clinical history: 78.9%, CT findings alone: 36.8%, integrated analysis: 85.7%; p=0.003).
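
The percentages above can be read back as case counts. The following is a minimal sketch, assuming every accuracy figure for the full cohort uses all 150 cases as its denominator (the abstract does not state the counts, so the values computed here are implied, not reported). The helper names `implied_count` and `is_consistent` are hypothetical and introduced only for illustration.

```python
# Sketch: convert the reported full-cohort accuracies (%) into implied
# case counts, assuming a denominator of 150 cases for every figure.
N_CASES = 150  # cohort size stated in METHODS

# Percentages copied from RESULTS, keyed by (outcome, information source).
reported = {
    ("underlying, primary", "integrated"): 78.0,
    ("underlying, primary", "clinical history"): 69.3,
    ("underlying, primary", "CT findings"): 42.0,
    ("underlying, primary or differential", "integrated"): 84.7,
    ("underlying, primary or differential", "clinical history"): 78.0,
    ("underlying, primary or differential", "CT findings"): 58.7,
    ("immediate, primary", "integrated"): 61.3,
    ("immediate, primary", "clinical history"): 52.0,
    ("immediate, primary", "CT findings"): 46.7,
}


def implied_count(pct: float, n: int = N_CASES) -> int:
    """Nearest whole number of correct cases for a reported percentage."""
    return round(pct * n / 100)


def is_consistent(pct: float, n: int = N_CASES) -> bool:
    """True if some whole count out of n rounds back to the reported value."""
    return round(100 * implied_count(pct, n) / n, 1) == pct


if __name__ == "__main__":
    for (outcome, source), pct in reported.items():
        k = implied_count(pct)
        print(f"{outcome:38s} {source:17s} {pct:5.1f}%  ~{k}/{N_CASES}"
              f"  consistent={is_consistent(pct)}")
```

Under this assumption, the primary-diagnosis accuracies for underlying causes correspond to roughly 117, 104, and 63 of 150 cases for integrated, clinical history alone, and CT findings alone, respectively. The disease-specific figures (for example, hematologic malignancies) use smaller subgroup denominators that the abstract does not give, so they are left out of the sketch.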

CONCLUSION

Integrating postmortem CT findings with clinical history enhances LLM-based cause-of-death determination accuracy, demonstrating the value of multiple information sources while highlighting opportunities for disease-specific diagnostic optimization.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/021c/12145502/cec0ea625239/cureus-0017-00000083721-i02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/021c/12145502/3e16f188dd55/cureus-0017-00000083721-i01.jpg
