Practical use case of natural language processing for observational clinical research data retrieval from electronic health records: AssistMED project.

Affiliations

First Department of Cardiology, Medical University of Warsaw, Warszawa, Poland

Doctoral School, Medical University of Warsaw, Warszawa, Poland

Publication information

Pol Arch Intern Med. 2024 May 28;134(5). doi: 10.20452/pamw.16704. Epub 2024 Mar 19.

Abstract

INTRODUCTION

Electronic health records (EHRs) contain data valuable for clinical research. However, the data are stored as text and must be manually encoded into databases, which is a lengthy and costly process. Natural language processing (NLP) is a computational technique for analyzing text.

OBJECTIVES

Our study aimed to demonstrate a practical use case of NLP for characterizing a large retrospective study cohort and to compare NLP-based retrieval with human retrieval.

PATIENTS AND METHODS

Anonymized discharge documentation of 10 314 patients from a cardiology tertiary care department was analyzed for inclusion in the CRAFT registry (Multicenter Experience in Atrial Fibrillation Patients Treated with Oral Anticoagulants; NCT02987062). Extensive clinical characteristics regarding concomitant diseases, medications, daily drug dosages, and echocardiography were collected manually and through NLP.

RESULTS

There were 3030 and 3029 patients identified by human and NLP‑based approaches, respectively, reflecting 99.93% accuracy of NLP in detecting atrial fibrillation (AF). Comprehensive baseline characterization of patients by NLP was faster than human analysis (3 h 15 min vs 71 h 12 min). The CHA2DS2‑VASc and HAS‑BLED scores calculated with both methods did not differ (human vs NLP; median [interquartile range], 3 [2-5] vs 3 [2-5]; P = 0.74 and 1 [1-2] vs 1 [1-2]; P = 0.63, respectively). For most data, almost perfect agreement between NLP- and human-retrieved characteristics was found; daily dosage identification was the least accurate NLP feature. Both methods would lead to similar conclusions on cohort characteristics; however, daily dosage detection for some drug groups would require additional human validation in the NLP‑based cohort.
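As an illustration of how retrieved characteristics feed into a risk score, the sketch below computes CHA2DS2‑VASc from a set of boolean flags using the standard published point weights. It is not the AssistMED implementation: the `PatientFlags` class and its field names are hypothetical, and the abstract does not describe how the study mapped extracted text to score components.

```python
from dataclasses import dataclass


@dataclass
class PatientFlags:
    """Hypothetical structured record for one patient (field names are illustrative)."""
    age: int
    female: bool
    heart_failure: bool
    hypertension: bool
    diabetes: bool
    stroke_or_tia: bool      # prior stroke, TIA, or thromboembolism
    vascular_disease: bool   # e.g. prior myocardial infarction or peripheral artery disease


def cha2ds2_vasc(p: PatientFlags) -> int:
    """Return the CHA2DS2-VASc score (0 to 9) using the standard point weights."""
    score = 0
    score += 1 if p.heart_failure else 0       # C: congestive heart failure
    score += 1 if p.hypertension else 0        # H: hypertension
    if p.age >= 75:                            # A2: age 75 or older
        score += 2
    elif p.age >= 65:                          # A: age 65 to 74
        score += 1
    score += 1 if p.diabetes else 0            # D: diabetes mellitus
    score += 2 if p.stroke_or_tia else 0       # S2: stroke/TIA/thromboembolism
    score += 1 if p.vascular_disease else 0    # V: vascular disease
    score += 1 if p.female else 0              # Sc: sex category (female)
    return score


# Example: a 70-year-old woman with hypertension and no other flagged conditions.
example = PatientFlags(
    age=70, female=True, heart_failure=False, hypertension=True,
    diabetes=False, stroke_or_tia=False, vascular_disease=False,
)
print(cha2ds2_vasc(example))
```

Because the score is a plain sum of flags, a single missed or spurious concomitant disease shifts it by 1 or 2 points, which is why agreement on the underlying characteristics matters for the score comparison reported above.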

CONCLUSIONS

Applying NLP to EHRs may accelerate data acquisition and provide accurate information for retrospective studies.
