Stroke Outcome Measurements From Electronic Medical Records: Cross-sectional Study on the Effectiveness of Neural and Nonneural Classifiers.

Authors

Zanotto Bruna Stella, Beck da Silva Etges Ana Paula, Dal Bosco Avner, Cortes Eduardo Gabriel, Ruschel Renata, De Souza Ana Claudia, Andrade Claudio M V, Viegas Felipe, Canuto Sergio, Luiz Washington, Ouriques Martins Sheila, Vieira Renata, Polanczyk Carisi, André Gonçalves Marcos

Affiliations

National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil.

Graduate Program in Epidemiology, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil.

Publication information

JMIR Med Inform. 2021 Nov 1;9(11):e29120. doi: 10.2196/29120.

DOI:10.2196/29120
PMID:34723829
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8593798/
Abstract

BACKGROUND

With the rapid adoption of electronic medical records (EMRs), there is an ever-increasing opportunity to collect data and extract knowledge from EMRs to support patient-centered stroke management.

OBJECTIVE

This study aims to compare the effectiveness of state-of-the-art automatic text classification methods in classifying data to support the prediction of clinical patient outcomes and the extraction of patient characteristics from EMRs.

METHODS

Our study addressed the computational problems of information extraction and automatic text classification. We identified essential tasks to be considered in an ischemic stroke value-based program. The 30 selected tasks were classified (manually labeled by specialists) according to the following value agenda: tier 1 (achieved health care status), tier 2 (recovery process), care related (clinical management and risk scores), and baseline characteristics. The analyzed data set was retrospectively extracted from the EMRs of patients with stroke from a private Brazilian hospital between 2018 and 2019. A total of 44,206 sentences from free-text medical records in Portuguese were used to train and develop 10 supervised computational machine learning methods, including state-of-the-art neural and nonneural methods, along with ontological rules. As an experimental protocol, we used a 5-fold cross-validation procedure repeated 6 times, along with subject-wise sampling. A heatmap was used to display comparative result analyses according to the best algorithmic effectiveness (F1 score), supported by statistical significance tests. A feature importance analysis was conducted to provide insights into the results.
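The protocol described above (5-fold cross-validation repeated 6 times, with subject-wise sampling) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the splits were generated, and the function name `subject_wise_folds`, the seed, and the toy patient identifiers are all hypothetical. The point it shows is that every sentence from a given patient lands on the same side of each split, so a model is never tested on a patient it saw during training.

```python
import random
from collections import defaultdict


def subject_wise_folds(subject_ids, n_folds=5, n_repeats=6, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) tuples.

    subject_ids[i] is the (hypothetical) patient ID of sentence i.
    All sentences of one patient go to either train or test, never both.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    subjects = sorted(by_subject)
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        shuffled = subjects[:]
        rng.shuffle(shuffled)  # a fresh patient ordering for every repeat
        for fold in range(n_folds):
            # Every n_folds-th patient, starting at `fold`, forms the test set.
            test_subjects = set(shuffled[fold::n_folds])
            test_idx = [i for s in shuffled if s in test_subjects
                        for i in by_subject[s]]
            train_idx = [i for s in shuffled if s not in test_subjects
                         for i in by_subject[s]]
            yield repeat, fold, train_idx, test_idx


# Toy data (assumption): 20 sentences from 10 patients, 2 sentences each.
subject_ids = [f"p{i // 2}" for i in range(20)]
splits = list(subject_wise_folds(subject_ids))
print(len(splits), "train/test splits")
```

With 5 folds and 6 repeats, each model is evaluated on 30 train/test splits, which is what makes paired statistical significance tests between methods feasible.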

RESULTS

The top-performing models were support vector machines trained with lexical and semantic textual features, showing the importance of dealing with noise in EMR textual representations. The support vector machine models produced statistically superior results in 71% (17/24) of tasks, with an F1 score >80% regarding care-related tasks (patient treatment location, fall risk, thrombolytic therapy, and pressure ulcer risk), the process of recovery (ability to feed orally or ambulate and communicate), health care status achieved (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, and smoking status). Neural methods were largely outperformed by more traditional nonneural methods, given the characteristics of the data set. Ontological rules were also effective in tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementarity in effectiveness among models suggests that a combination of models could enhance the results and cover more tasks in the future.
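For readers unfamiliar with the F1 score used to rank the models above, here is a minimal sketch of how it is computed for one binary task. The abstract does not say whether the reported scores are micro- or macro-averaged, so this shows only the basic single-class definition; the function name `f1_binary` and the toy labels are hypothetical.

```python
def f1_binary(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumption): 1 = sentence indicates the condition, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = f1_binary(y_true, y_pred)
print(score)  # 3 TP, 1 FP, 1 FN -> precision = recall = 0.75, so F1 = 0.75
```

Because F1 ignores true negatives, it is a stricter measure than accuracy on tasks where the positive class is rare, which is common for sentence-level clinical labels.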

CONCLUSIONS

Advances in information technology capacity are essential for scalability and agility in measuring health status outcomes. This study allowed us to measure effectiveness and identify opportunities for automating the classification of outcomes of specific tasks related to clinical conditions of stroke victims, and thus ultimately assess the possibility of proactively using these machine learning techniques in real-world situations.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c578/8593798/bdd4b8e92555/medinform_v9i11e29120_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c578/8593798/1e390e5e6290/medinform_v9i11e29120_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c578/8593798/188ff25696c6/medinform_v9i11e29120_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c578/8593798/4b750c7d85d3/medinform_v9i11e29120_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c578/8593798/04991a5339ff/medinform_v9i11e29120_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c578/8593798/6169fcf9450e/medinform_v9i11e29120_fig6.jpg
