Using Natural Language Processing (GPT-4) for Computed Tomography Image Analysis of Cerebral Hemorrhages in Radiology: Retrospective Analysis.

Affiliations

Brain Injury Centre, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Shanghai Institute of Head Trauma, Shanghai, China.

Publication information

J Med Internet Res. 2024 Sep 26;26:e58741. doi: 10.2196/58741.

DOI:10.2196/58741

PMID:39326037

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11467597/

Abstract

BACKGROUND

Cerebral hemorrhage is a critical medical condition that necessitates a rapid and precise diagnosis for timely medical intervention, including emergency operation. Computed tomography (CT) is essential for identifying cerebral hemorrhage, but its effectiveness is limited by the availability of experienced radiologists, especially in resource-constrained regions or when shorthanded during holidays or at night. Despite advancements in artificial intelligence-driven diagnostic tools, most require technical expertise. This poses a challenge for widespread adoption in radiological imaging. The introduction of advanced natural language processing (NLP) models such as GPT-4, which can annotate and analyze images without extensive algorithmic training, offers a potential solution.

OBJECTIVE

This study investigates GPT-4's capability to identify and annotate cerebral hemorrhages in cranial CT scans. It represents a novel application of NLP models in radiological imaging.

METHODS

In this retrospective analysis, we collected 208 CT scans with 6 types of cerebral hemorrhages at Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, between January and September 2023. All CT images were mixed together and sequentially numbered, so each CT image had its own corresponding number. A random sequence from 1 to 208 was generated, and all CT images were inputted into GPT-4 for analysis in the order of the random sequence. The outputs were subsequently examined using Photoshop and evaluated by experienced radiologists on a 4-point scale to assess identification completeness, accuracy, and success.
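The randomized presentation order described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which tool produced the random sequence, so the function name `make_presentation_order` and the seed value are assumptions made here for reproducibility.

```python
import random


def make_presentation_order(n_images, seed=None):
    """Return a random permutation of the image numbers 1..n_images.

    Hypothetical helper: the abstract only states that a random
    sequence from 1 to 208 was generated, not how it was generated.
    """
    rng = random.Random(seed)              # private generator, so the seed is local
    order = list(range(1, n_images + 1))   # sequential numbering of the CT images
    rng.shuffle(order)                     # in-place shuffle
    return order


# 208 images, as in the study; the seed is an arbitrary assumption.
order = make_presentation_order(208, seed=2023)
print(order[:10])  # first ten image numbers in presentation order
```

Each image number appears exactly once in `order`, so every image is submitted once and the submission order carries no information about hemorrhage type.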

RESULTS

The overall identification completeness for the 6 types of cerebral hemorrhage was 72.6% (SD 18.6%). GPT-4 achieved higher identification completeness for epidural and intraparenchymal hemorrhages (89.0%, SD 19.1% and 86.9%, SD 17.7%, respectively), but very low completeness for chronic subdural hemorrhages (37.3%, SD 37.5%). Misidentification percentages were relatively high for complex hemorrhages (54.0%, SD 28.0%), epidural hemorrhages (50.2%, SD 22.7%), and subarachnoid hemorrhages (50.5%, SD 29.2%), and relatively low for acute subdural hemorrhages (32.6%, SD 26.3%), chronic subdural hemorrhages (40.3%, SD 27.2%), and intraparenchymal hemorrhages (26.2%, SD 23.8%). Identification completeness did not differ significantly between massive and minor bleeding (P=.06), but the misidentification percentage was significantly lower for massive bleeding than for minor bleeding (P=.04). Neither identification completeness nor misidentification percentages differed significantly across hemorrhage locations (all P>.05). Lastly, radiologists' ratings on the 4-point scale indicated relative acceptance of identification completeness (3.60, SD 0.54), accuracy (3.30, SD 0.65), and success (3.38, SD 0.64).
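The abstract does not define how the identification completeness and misidentification percentages were calculated. One plausible area-based reading, given that the outputs were examined in Photoshop, is sketched below. Both formulas are assumptions rather than the authors' stated definitions: completeness is taken as the share of the true hemorrhage area that the annotation covers, and misidentification as the share of the annotated area that lies outside the true hemorrhage. The function name `area_metrics` and the toy masks are hypothetical.

```python
def area_metrics(truth_mask, predicted_mask):
    """Return (completeness %, misidentification %) for two binary masks.

    Masks are equal-sized 2D lists of 0/1 values. Both formulas are
    ASSUMED definitions; the abstract does not spell them out.
    """
    truth = {(r, c) for r, row in enumerate(truth_mask)
             for c, v in enumerate(row) if v}
    pred = {(r, c) for r, row in enumerate(predicted_mask)
            for c, v in enumerate(row) if v}

    overlap = truth & pred        # hemorrhage pixels that were annotated
    false_area = pred - truth     # annotated pixels that are not hemorrhage

    completeness = 100 * len(overlap) / len(truth) if truth else None
    misidentification = 100 * len(false_area) / len(pred) if pred else None
    return completeness, misidentification


# Toy 2x3 masks, not study data.
truth = [[1, 1, 0],
         [1, 1, 0]]
pred = [[0, 1, 0],
        [1, 1, 1]]
completeness, misidentification = area_metrics(truth, pred)
print(completeness, misidentification)  # 75.0 25.0
```

Under these assumed definitions the two percentages are independent ratios, which is consistent with a hemorrhage type showing both high completeness and high misidentification, as reported for epidural hemorrhages.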

CONCLUSIONS

GPT-4, a standout among NLP models, exhibits both promising capabilities and certain limitations in the realm of radiological imaging, particularly when it comes to identifying cerebral hemorrhages in CT scans. This opens up new directions and insights for the future development of NLP models in radiology.

TRIAL REGISTRATION

ClinicalTrials.gov NCT06230419; https://clinicaltrials.gov/study/NCT06230419.




