Automating Ischemic Stroke Subtype Classification Using Machine Learning and Natural Language Processing.

Authors

Garg Ravi, Oh Elissa, Naidech Andrew, Kording Konrad, Prabhakaran Shyam

Affiliations

Department of Neurology, Northwestern University, Feinberg School of Medicine, Chicago, Illinois.

University of Pennsylvania, Philadelphia, Pennsylvania.

Publication information

J Stroke Cerebrovasc Dis. 2019 Jul;28(7):2045-2051. doi: 10.1016/j.jstrokecerebrovasdis.2019.02.004. Epub 2019 May 15.

DOI:10.1016/j.jstrokecerebrovasdis.2019.02.004
PMID:31103549
Abstract

OBJECTIVE

The manual adjudication of disease classification is time-consuming, error-prone, and limits scaling to large datasets. In ischemic stroke (IS), subtype classification is critical for management and outcome prediction. This study sought to use natural language processing of electronic health records (EHR) combined with machine learning methods to automate IS subtyping.

METHODS

Among IS patients from an observational registry with TOAST subtyping adjudicated by board-certified vascular neurologists, we analyzed unstructured text-based EHR data, including neurology progress notes and neuroradiology reports, using natural language processing. We applied several feature selection methods to reduce the high dimensionality of the features and used 5-fold cross-validation to test the generalizability of our methods and minimize overfitting. We used several machine learning methods and calculated kappa values for agreement between each machine learning approach and manual adjudication. We then performed blinded testing of the best algorithm against a held-out subset of 50 cases.
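The agreement statistic and the 5-fold split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes unweighted Cohen's kappa (the abstract does not say which kappa variant was used), and the function names `cohen_kappa` and `k_fold_indices` and the subtype labels in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa between two equal-length label sequences.

    Assumption: the study's kappa is the unweighted form; the abstract
    does not specify this.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

For example, with hypothetical labels `manual = ["CE", "CE", "LAA", "SVO", "CRYPTO", "CE"]` and `machine = ["CE", "LAA", "LAA", "SVO", "CRYPTO", "CE"]`, calling `cohen_kappa(manual, machine)` compares five matching cases out of six against the agreement expected by chance. In a real experiment the records would be shuffled before splitting, since contiguous folds can inherit any ordering in the registry.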

RESULTS

Compared to manual classification, the best machine-based classification achieved a kappa of .25 using radiology reports alone, .57 using progress notes alone, and .57 using combined data. Kappa values varied by subtype, being highest for cardioembolic (.64) and lowest for cryptogenic cases (.47). In the held-out test subset, machine-based classification agreed with rater classification in 40 of 50 cases (kappa .72).
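The held-out result can be read through the definition of kappa. The worked equation below is a sketch that assumes the reported value is unweighted Cohen's kappa, which the abstract does not state; under that assumption, 40 of 50 agreements and a kappa of .72 imply a chance-agreement rate of roughly .29. The symbols $p_o$ (observed agreement) and $p_e$ (chance agreement) are introduced here for illustration.

```latex
\[
p_o = \frac{40}{50} = 0.80, \qquad
\kappa = \frac{p_o - p_e}{1 - p_e}
\;\Longrightarrow\;
p_e = \frac{p_o - \kappa}{1 - \kappa}
    = \frac{0.80 - 0.72}{1 - 0.72}
    \approx 0.29
\]
```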

CONCLUSIONS

Automated machine learning approaches using textual data from the EHR show agreement with manual TOAST classification. The automated pipeline, if externally validated, could enable large-scale stroke epidemiology research.
