Development of a Portable Tool to Identify Patients With Atrial Fibrillation Using Clinical Notes From the Electronic Medical Record.

Author information

Division of Cardiovascular Medicine, Department of Internal Medicine (R.U.S., B.A.S., R.M.), University of Utah School of Medicine, Salt Lake City.

Division of Cardiology, Department of Medicine (R.K.M., F.S.A., H.C.G.), Northwestern University Feinberg School of Medicine, Chicago, IL.

Publication information

Circ Cardiovasc Qual Outcomes. 2020 Oct;13(10):e006516. doi: 10.1161/CIRCOUTCOMES.120.006516. Epub 2020 Oct 14.

DOI: 10.1161/CIRCOUTCOMES.120.006516
PMID: 33079591
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7646941/
Abstract

BACKGROUND

The electronic medical record contains a wealth of information buried in free text. We created a natural language processing algorithm to identify patients with atrial fibrillation (AF) using text alone.

METHODS AND RESULTS

We created 3 data sets from patients with at least one AF billing code from 2010 to 2017: a training set (n=886), an internal validation set from site no. 1 (n=285), and an external validation set from site no. 2 (n=276). A team of clinicians reviewed and adjudicated patients as AF present or absent, which served as the reference standard. We trained 54 algorithms to classify each patient, varying the model, number of features, number of stop words, and the method used to create the feature set. The algorithm with the highest F-score (the harmonic mean of sensitivity and positive predictive value) in the training set was applied to the validation sets. F-scores and areas under the receiver operating characteristic curve were compared between site no. 1 and site no. 2 using bootstrapping. Adjudicated AF prevalence was 75.1% at site no. 1 and 86.2% at site no. 2. Among 54 algorithms, the best performing model was logistic regression, using 1000 features, 100 stop words, and the term frequency-inverse document frequency method to create the feature set, with sensitivity 92.8%, specificity 93.9%, and an area under the receiver operating characteristic curve of 0.93 in the training set. The performance at site no. 1 was sensitivity 92.5%, specificity 88.7%, with an area under the receiver operating characteristic curve of 0.91. The performance at site no. 2 was sensitivity 89.5%, specificity 71.1%, with an area under the receiver operating characteristic curve of 0.80. The F-score was lower at site no. 2 compared with site no. 1 (92.5% [SD, 1.1%] versus 94.2% [SD, 1.1%]; P<0.001).
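The F-score definition and the bootstrap comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and the per-patient counts for site no. 1 (198 true positives, 16 false negatives, 8 false positives, 63 true negatives) are an assumption back-calculated from the rounded sensitivity, specificity, and prevalence in the abstract, not data taken from the study.

```python
import random
import statistics


def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def f_score(sensitivity, ppv):
    """Harmonic mean of sensitivity and positive predictive value."""
    return 2.0 * sensitivity * ppv / (sensitivity + ppv)


def f_score_from_labels(pairs):
    """F-score from (truth, prediction) pairs, written as 2TP / (2TP + FP + FN)."""
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f_scores(pairs, n_resamples=500, seed=0):
    """Resample patients with replacement and recompute the F-score each time."""
    rng = random.Random(seed)
    n = len(pairs)
    return [
        f_score_from_labels([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    ]


# Point estimates implied by the reported validation-set figures.
site1_f = f_score(0.925, ppv_from_rates(0.925, 0.887, 0.751))
site2_f = f_score(0.895, ppv_from_rates(0.895, 0.711, 0.862))
print(f"site 1 F-score ~ {site1_f:.3f}, site 2 F-score ~ {site2_f:.3f}")

# ASSUMPTION: illustrative counts for site no. 1 (n=285), reconstructed from
# the rounded rates in the abstract; these are not the study's patient data.
site1_pairs = (
    [(True, True)] * 198      # AF present, flagged
    + [(True, False)] * 16    # AF present, missed
    + [(False, True)] * 8     # AF absent, flagged
    + [(False, False)] * 63   # AF absent, not flagged
)
site1_boot = bootstrap_f_scores(site1_pairs)
print(f"bootstrap mean {statistics.mean(site1_boot):.3f}, "
      f"SD {statistics.stdev(site1_boot):.3f}")
```

The point estimates derived this way land close to the reported bootstrap means (94.2% at site no. 1 and 92.5% at site no. 2). Exact agreement is not expected, because the inputs are rounded and the published values are averages over resamples.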

CONCLUSIONS

We developed a natural language processing algorithm to identify patients with AF using text alone, with >90% F-score at 2 separate sites. This approach allows better use of the clinical narrative and creates an opportunity for precise, high-throughput cohort identification.
