Using Artificial Intelligence With Natural Language Processing to Combine Electronic Health Record's Structured and Free Text Data to Identify Nonvalvular Atrial Fibrillation to Decrease Strokes and Death: Evaluation and Case-Control Study.

Affiliations

Department of Biomedical Informatics, University at Buffalo, Buffalo, NY, United States.

Bioinformatics Laboratory, Department of Veterans Affairs, VA Western New York Healthcare System, Buffalo, NY, United States.

Publication information

J Med Internet Res. 2021 Nov 9;23(11):e28946. doi: 10.2196/28946.

Abstract

BACKGROUND

Nonvalvular atrial fibrillation (NVAF) affects almost 6 million Americans and is a major contributor to stroke, yet it remains substantially underdiagnosed and undertreated despite explicit guidelines for oral anticoagulation.

OBJECTIVE

This study investigates whether semisupervised natural language processing (NLP) of free-text information in the electronic health record (EHR), combined with structured EHR data, improves NVAF discovery and treatment, and whether it may offer a method to prevent thousands of deaths and save billions of dollars.

METHODS

We abstracted data on 96,681 participants from the EHR of the University at Buffalo faculty practice. NLP was used to index the clinical notes, and we compared the ability to identify NVAF and to calculate two risk scores using structured data alone (International Classification of Diseases codes) versus structured data plus unstructured data from clinical notes. The two scores were CHA2DS2-VASc (congestive heart failure, hypertension, age ≥75 years, diabetes mellitus, stroke or transient ischemic attack, vascular disease, age 65 to 74 years, sex category) and HAS-BLED (Hypertension, Abnormal liver/renal function, Stroke history, Bleeding history or predisposition, Labile INR, Elderly, Drug/alcohol usage). In addition, we analyzed data from 63,296,120 participants in the Optum and Truven databases to determine NVAF frequency, the rates of CHA2DS2-VASc ≥2 and of no contraindications to oral anticoagulants, the rates of stroke and death in the untreated population, and first-year costs after stroke.
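As an illustration of the comparison described above, the following minimal sketch computes the CHADS-VASc score (conventionally written CHA2DS2-VASc) for one patient twice: once from coded conditions alone, and once from the union of coded conditions and conditions found in notes. The point weights are the standard published ones for this score. The `Patient` class, the flag names, and the example patient are hypothetical and are not taken from the study, and the sketch does not represent the authors' NLP pipeline.

```python
from dataclasses import dataclass

# Standard CHA2DS2-VASc weights for condition flags.
# The flag names themselves are hypothetical labels chosen for this sketch.
CONDITION_WEIGHTS = {
    "chf": 1,               # congestive heart failure
    "hypertension": 1,
    "diabetes": 1,
    "stroke_tia": 2,        # prior stroke or transient ischemic attack
    "vascular_disease": 1,
}


@dataclass(frozen=True)
class Patient:
    """Hypothetical record: demographics plus two sources of condition flags."""
    age: int
    female: bool
    structured: frozenset   # flags derived from coded data (e.g. ICD codes)
    from_notes: frozenset   # flags extracted from free-text notes by NLP


def cha2ds2_vasc(patient: Patient, use_notes: bool) -> int:
    """Score a patient from structured flags alone, or from both sources."""
    if use_notes:
        flags = patient.structured | patient.from_notes
    else:
        flags = patient.structured
    score = sum(w for name, w in CONDITION_WEIGHTS.items() if name in flags)
    # Age contributes 2 points at 75 or older, 1 point at 65 to 74.
    if patient.age >= 75:
        score += 2
    elif patient.age >= 65:
        score += 1
    # Sex category: 1 point for female sex.
    if patient.female:
        score += 1
    return score


# Invented example patient, for illustration only.
example = Patient(
    age=70,
    female=True,
    structured=frozenset({"hypertension"}),
    from_notes=frozenset({"diabetes", "stroke_tia"}),
)

structured_only = cha2ds2_vasc(example, use_notes=False)
combined = cha2ds2_vasc(example, use_notes=True)
print(structured_only, combined)
```

In this invented case the structured-only score is 3 and the combined score is 6, which shows how conditions documented only in notes can change a patient's risk category.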

RESULTS

The structured-plus-unstructured method would have identified 3,976,056 additional true NVAF cases (P<.001) and improved sensitivity for the CHA2DS2-VASc and HAS-BLED scores compared with structured data alone (P=.002 and P<.001, respectively), a 32.1% improvement. Applied across the United States, this method would prevent an estimated 176,537 strokes, save 10,575 lives, and save more than US $13.5 billion.
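Sensitivity here means the fraction of true cases that a method finds, TP / (TP + FN). The sketch below shows how that quantity changes when cases found in notes are added to cases found in structured data. The patient IDs are invented toy values, not study data, and the function is a generic definition rather than the authors' statistical procedure.

```python
def sensitivity(true_cases: set, flagged: set) -> float:
    """Fraction of true cases that were flagged: TP / (TP + FN)."""
    if not true_cases:
        raise ValueError("sensitivity is undefined without any true cases")
    true_positives = len(true_cases & flagged)
    return true_positives / len(true_cases)


# Toy patient IDs, invented for illustration only.
true_nvaf = {1, 2, 3, 4, 5}      # patients who truly have NVAF
structured_hits = {1, 2, 9}      # flagged from coded data (9 is a false positive)
notes_hits = {3, 4}              # additionally flagged by NLP of notes

sens_structured = sensitivity(true_nvaf, structured_hits)
sens_combined = sensitivity(true_nvaf, structured_hits | notes_hits)
print(sens_structured, sens_combined)
```

Taking the union of the two sources can only keep or raise sensitivity, since no flagged case is ever dropped. Any effect on false positives would have to be assessed separately.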

CONCLUSIONS

Artificial intelligence-informed biosurveillance that combines NLP of free-text information with structured EHR data improves data completeness, prevents thousands of strokes, and saves lives and money. This method is applicable to many disorders with profound public health consequences.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd43/8663460/d380393a752c/jmir_v23i11e28946_fig1.jpg
