Performance of ChatGPT on prehospital acute ischemic stroke and large vessel occlusion (LVO) stroke screening.

Authors

Wang Xinhao, Ye Shisheng, Feng Jinwen, Feng Kaiyan, Yang Heng, Li Hao

Affiliation

Department of Neurology, Maoming People's Hospital, Maoming, Guangdong, China.

Publication information

Digit Health. 2024 Nov 5;10:20552076241297127. doi: 10.1177/20552076241297127. eCollection 2024 Jan-Dec.

DOI:10.1177/20552076241297127
PMID:39507012
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11539183/
Abstract

BACKGROUND

The management of acute ischemic stroke (AIS) is time-sensitive, yet prehospital delays remain prevalent. Large language models (LLMs) applied to medical text analysis may play a role in clinical decision support. We assessed the performance of LLMs in prehospital screening for AIS and large vessel occlusion (LVO) stroke.

METHODS

This retrospective study sourced cases from the electronic medical record database of the emergency department (ED) at Maoming People's Hospital, covering patients who presented to the ED between June and November 2023. We evaluated the diagnostic accuracy of GPT-3.5 and GPT-4 for detecting AIS and LVO stroke by comparing the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, positive likelihood ratio, and area under the curve (AUC) of the two models. The neurological reasoning of the LLMs was rated on a five-point Likert scale for factual correctness and for the occurrence of errors.
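As a reading aid, the threshold-based metrics listed above can be sketched from a 2x2 confusion matrix. This is a minimal illustration, not the study's analysis code: the names `ConfusionCounts` and `screening_metrics` are hypothetical, and the counts are invented for the example rather than taken from the paper. AUC is omitted here because the abstract does not say how it was derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of model calls against the reference diagnosis."""
    tp: int  # model positive, reference positive
    fp: int  # model positive, reference negative
    fn: int  # model negative, reference positive
    tn: int  # model negative, reference negative


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return the standard screening metrics for one confusion matrix."""
    total = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    # LR+ is undefined when specificity is exactly 1; report infinity then.
    lr_plus = (
        sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "lr_plus": lr_plus,
    }


# Hypothetical counts for illustration only; NOT data from the study.
example = ConfusionCounts(tp=80, fp=10, fn=20, tn=90)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy depends on how common the condition is in the sample, whereas sensitivity, specificity, and the likelihood ratio do not, which is one reason such studies report several metrics side by side.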

RESULTS

Across 400 records from 400 patients (mean age, 70.0 years ± 12.5 [SD]; 273 male), GPT-4 outperformed GPT-3.5 in AIS screening (AUC 0.75 [0.65-0.84] vs 0.59 [0.50-0.69], P = 0.015) and LVO identification (AUC 0.71 [0.65-0.77] vs 0.60 [0.53-0.66], P < 0.001). GPT-4 also achieved higher accuracy than GPT-3.5 in AIS screening (89.3% [95% CI: 85.8%, 91.9%] vs 86.5% [95% CI: 82.8%, 89.5%]) and LVO stroke identification (67.0% [95% CI: 62.3%, 71.4%] vs 47.3% [95% CI: 42.4%, 52.2%]). In neurological reasoning, GPT-4 had higher Likert scale scores for factual correctness (4.24 vs 3.62) and a lower error rate (6.8% vs 24.8%) than GPT-3.5 (all P < 0.001).
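To show how a 95% confidence interval for an accuracy figure can be obtained, here is a minimal sketch of the Wilson score interval for a binomial proportion. Two things are assumptions rather than facts from the abstract: the interval method (the abstract does not name one) and the count of 357 correct calls out of 400, which is simply a whole number that rounds to the reported 89.3%. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 for 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed count: 357/400 = 89.25%, which rounds to the reported 89.3%.
low, high = wilson_interval(357, 400)
print(f"accuracy 95% CI: {100 * low:.1f}% to {100 * high:.1f}%")
```

With these assumed inputs the bounds come out at roughly 85.8% and 91.9%, in line with the interval reported for GPT-4 in AIS screening, although that agreement does not establish which method the authors actually used.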

CONCLUSIONS

The results demonstrate that LLMs possess diagnostic capability in the prehospital identification of ischemic stroke and can exhibit neurologically informed reasoning. Notably, GPT-4 outperforms GPT-3.5 in recognizing AIS and LVO stroke, achieving results comparable to prehospital scales. LLMs may become a promising decision-support tool for EMS practitioners in prehospital stroke screening.
