Precision Structuring of Free-Text Surgical Record for Enhanced Stroke Management: A Comparative Evaluation of Large Language Models.

Authors

Wang Mengfei, Wei Jianyong, Zeng Yao, Dai Lisong, Yan Bicong, Zhu Yueqi, Wei Xiaoer, Jin Yidong, Li Yuehua

Affiliations

School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, People's Republic of China.

Clinical Research Center, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China.

Publication information

J Multidiscip Healthc. 2024 Nov 14;17:5163-5175. doi: 10.2147/JMDH.S486449. eCollection 2024.

DOI:10.2147/JMDH.S486449
PMID:39558925
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11572044/
Abstract

INTRODUCTION

Mechanical thrombectomy (MTB) is a critical procedure for patients with acute ischemic stroke (AIS). However, the free-text format of MTB surgical records limits the formulation of effective postoperative patient management and rehabilitation plans. This study compares the efficacy of large language models (LLMs) in structuring data from these free-text MTB surgical records.

METHODS

This retrospective study collected 382 MTB surgical records from a tertiary hospital. An initial analysis of 30 of these records informed a guiding prompt for the LLMs, covering basic and advanced characteristics such as occlusion locations, thrombectomy maneuvers, reperfusion status, and intraoperative complications. Six LLMs (ChatGPT, GPT-4, GeminiPro, ChatGLM4, Spark3, and QwenMax) were compared with data extracted by neuroradiologists and a junior physician. All 382 surgical records were used to test the performance of the LLMs. Performance was quantified using accuracy, sensitivity, specificity, and area under the curve (AUC), with mean squared error (MSE) as an additional metric for advanced characteristics.
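As an illustration of the metrics named above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and MSE could be computed for one extracted item against a reference annotation. It is not the authors' code: the function names, the 0/1 encoding of a binary item, and the sample values are assumptions made for this example only.

```python
from typing import Dict, Sequence


def binary_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for one binary item.

    Both inputs are assumed to be 0/1 labels, one per surgical record
    (hypothetical encoding, e.g. 1 = complication reported).
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity is undefined when the reference has no positives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity is undefined when the reference has no negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def mean_squared_error(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """MSE for a numeric item (for example, a count of thrombectomy passes)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)


# Made-up labels for eight records; not data from the study.
reference_labels = [1, 1, 0, 0, 1, 0, 1, 0]
model_labels = [1, 0, 0, 0, 1, 1, 1, 0]
metrics = binary_metrics(reference_labels, model_labels)
mse = mean_squared_error([2, 3, 1], [2, 1, 1])
```

In a multi-item setting, the same functions would be applied per item and the results averaged, which is one plausible reading of a "mean ± SD across items" summary.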

RESULTS

All LLMs showed high performance in characteristic extraction, achieving an average accuracy of 95.09 ± 4.98% across 48 items, and 78.05 ± 4.2% overall. ChatGLM4 and GPT-4 were the most accurate in advanced characteristic extraction, with accuracies of 84.03% and 82.20%, respectively. Processing time averaged 73.10 ± 10.86 seconds across the six models, significantly faster than the 427.88 seconds required for manual extraction by physicians.

CONCLUSION

LLMs, particularly ChatGLM4 and GPT-4, efficiently and accurately structured both general and advanced characteristics from MTB surgical records, outperforming manual extraction and demonstrating potential for enhancing clinical data management in AIS treatment.
