Retrospective comparison of traditional and artificial intelligence-based heart failure phenotyping in a US health system to enable real-world evidence.

Affiliations

Beth Israel Deaconess Medical Center, Department of Medicine, Division of Cardiology, Harvard Medical School, Boston, Massachusetts, USA.

Amgen Inc, Thousand Oaks, California, USA.

Publication information

BMJ Open. 2023 Aug 9;13(8):e073178. doi: 10.1136/bmjopen-2023-073178.

Abstract

OBJECTIVE

Quantitatively evaluate the quality of data underlying real-world evidence (RWE) in heart failure (HF).

DESIGN

Retrospective comparison of accuracy in identifying patients with HF and phenotypic information was made using traditional (ie, structured query language applied to structured electronic health record (EHR) data) and advanced (ie, artificial intelligence (AI) applied to unstructured EHR data) RWE approaches. The performance of each approach was measured by the harmonic mean of precision and recall (F score) using manual annotation of medical records as a reference standard.
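As an illustration of the metric described above, the following is a minimal sketch of how an F score can be computed for a single concept once extracted values have been compared with a manually annotated reference standard. The function name `f_score` and the counts in the usage example are hypothetical and are not taken from the study.

```python
def f_score(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F score) for one extracted concept.

    Counts are assumed to come from comparing an extraction approach
    against manual chart annotation (the reference standard).
    """
    predicted = true_positives + false_positives   # everything the approach flagged
    actual = true_positives + false_negatives      # everything the annotators flagged

    # Guard against empty denominators (no predictions or no reference cases).
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # Harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts for one concept, for illustration only.
p, r, f = f_score(true_positives=90, false_positives=10, false_negatives=30)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, an approach that misses many true cases (low recall) scores poorly even when nearly everything it does flag is correct.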

SETTING

EHR data from a large academic healthcare system in North America between 2015 and 2019, with an expected catchment of approximately 500 000 patients.

POPULATION

4288 encounters for 1155 patients aged 18-85 years, with 472 patients identified as having HF.

OUTCOME MEASURES

HF and associated concepts, such as comorbidities, left ventricular ejection fraction, and selected medications.

RESULTS

The average F scores across 19 HF-specific concepts were 49.0% and 94.1% for the traditional and advanced approaches, respectively (p<0.001 for all concepts with available data). The absolute difference in F score between approaches was 45.1% (98.1% relative increase in F score using the advanced approach). The advanced approach achieved superior F scores for HF presence, phenotype and associated comorbidities. Some phenotypes, such as HF with preserved ejection fraction, revealed dramatic differences in extraction accuracy based on technology applied, with a 4.9% F score when using natural language processing (NLP) alone and a 91.0% F score when using NLP plus AI-based inference.

CONCLUSIONS

A traditional RWE generation approach resulted in low data quality in patients with HF. While an advanced approach demonstrated high accuracy, the results varied dramatically based on extraction techniques. For future studies, advanced approaches and accuracy measurement may be required to ensure data are fit-for-purpose.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ed4/10414071/17d7c7329366/bmjopen-2023-073178f01.jpg
