A modular pipeline for natural language processing-screened human abstraction of a pragmatic trial outcome from electronic health records.

Author Information

Lee Robert Y, Li Kevin S, Sibley James, Cohen Trevor, Lober William B, O'Brien Janaki, LeDuc Nicole, Andrews Kasey Mallon, Ungar Anna, Walsh Jessica, Nielsen Elizabeth L, Dotolo Danae G, Kross Erin K

Affiliations

Division of Pulmonary, Critical Care, and Sleep Medicine, University of Washington, Seattle, USA.

Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle, USA.

Publication Information

medRxiv. 2025 Jun 24:2025.06.23.25330134. doi: 10.1101/2025.06.23.25330134.

Abstract

BACKGROUND

Natural language processing (NLP) allows efficient extraction of clinical variables and outcomes from electronic health records (EHR). However, measuring pragmatic clinical trial outcomes may demand accuracy that exceeds NLP performance. Combining NLP with human adjudication can address this gap, yet few software solutions support such workflows. We developed a modular, scalable system for NLP-screened human abstraction to measure the primary outcomes of two clinical trials.

METHODS

In two clinical trials of hospitalized patients with serious illness, a deep-learning NLP model screened EHR passages for documented goals-of-care discussions. Screen-positive passages were referred for human adjudication using a REDCap-based system to measure the trial outcomes. Dynamic pooling of passages using structured query language (SQL) within the REDCap database reduced unnecessary abstraction while ensuring data completeness.
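To make the dynamic pooling idea concrete, here is a minimal, hypothetical sketch of the kind of query logic involved. It is not the trial's actual REDCap schema or source code: the passages and adjudications tables, their columns, and the toy rows below are assumptions, used only to show how pooling can skip remaining passages for patients whose first documented goals-of-care discussion has already been confirmed, while still referring every unreviewed screen-positive passage for patients whose outcome is unknown.

# Hypothetical sketch of "dynamic pooling" of screen-positive passages.
# Table and column names are illustrative assumptions, not the trial's REDCap schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE passages (
    passage_id  INTEGER PRIMARY KEY,
    patient_id  INTEGER,
    note_date   TEXT,     -- ISO date of the source note
    nlp_score   REAL,     -- NLP model probability of goals-of-care content
    screen_pos  INTEGER   -- 1 if nlp_score exceeded the screening threshold
);
CREATE TABLE adjudications (
    passage_id  INTEGER,
    is_goc      INTEGER   -- human decision: 1 = confirmed goals-of-care discussion
);
""")

conn.executemany(
    "INSERT INTO passages VALUES (?, ?, ?, ?, ?)",
    [
        (1, 101, "2024-01-02", 0.97, 1),  # patient 101: two screen-positive passages
        (2, 101, "2024-01-05", 0.91, 1),
        (3, 202, "2024-01-03", 0.88, 1),  # patient 202: one screen-positive passage
        (4, 202, "2024-01-04", 0.10, 0),  # below threshold, never referred
    ],
)

# Patient 101's earliest passage has already been confirmed as a goals-of-care
# discussion, so no further passages from that patient require abstraction.
conn.execute("INSERT INTO adjudications VALUES (1, 1)")

# Pool the unreviewed screen-positive passages, but only for patients whose
# outcome (first documented discussion) is not yet established.
pool = conn.execute("""
    SELECT p.patient_id, p.passage_id, p.note_date
    FROM passages AS p
    WHERE p.screen_pos = 1
      AND p.passage_id NOT IN (SELECT passage_id FROM adjudications)
      AND p.patient_id NOT IN (
            SELECT p2.patient_id
            FROM adjudications AS a
            JOIN passages AS p2 ON a.passage_id = p2.passage_id
            WHERE a.is_goc = 1
          )
    ORDER BY p.patient_id, p.note_date
""").fetchall()

print(pool)  # [(202, 3, '2024-01-03')] -- only patient 202 still needs review

In this sketch, only patient 202's passage is pooled for review: patient 101 already has a confirmed discussion, so that patient's remaining screen-positive passages no longer need abstraction. This is how pooling can reduce unnecessary abstraction without sacrificing data completeness.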

RESULTS

In the first trial (N=2,512), NLP identified 22,187 screen-positive passages (0.8%) from 2.6 million EHR passages. Human reviewers adjudicated 7,494 passages over 34.3 abstractor-hours to measure the cumulative incidence and time to first documented goals-of-care discussion for all patients with 92.6% patient-level sensitivity. In the second trial (N=617), NLP identified 8,952 screen-positive passages (1.6%) from 559,596 passages at a threshold with near-100% sensitivity. Human reviewers adjudicated 3,509 passages over 27.9 abstractor-hours to measure the same outcome for all patients.
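As a side note on the screening threshold, the sketch below illustrates one simple way a near-100%-sensitivity operating point could be chosen from a labeled validation set. The function, scores, and labels are illustrative assumptions, not the trial's data or tuning procedure.

# Hypothetical sketch: pick the highest score cutoff that still captures the
# desired fraction of human-labeled positive passages in a validation set.
import math

def threshold_for_sensitivity(scores, labels, target=1.0):
    """Largest cutoff c such that calling score >= c screen-positive keeps
    sensitivity >= target on this labeled validation set."""
    pos_scores = sorted(s for s, y in zip(scores, labels) if y == 1)
    misses_allowed = math.floor(len(pos_scores) * (1.0 - target))  # positives we may miss
    return pos_scores[misses_allowed]  # score of the lowest-scoring positive kept

# Toy validation data: model scores with human labels (1 = true goals-of-care passage).
val_scores = [0.99, 0.95, 0.90, 0.40, 0.35, 0.20, 0.05]
val_labels = [1,    1,    1,    1,    0,    0,    0]

cutoff = threshold_for_sensitivity(val_scores, val_labels, target=1.0)
print(cutoff)  # 0.4 -- every labeled positive is screened in at this cutoff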

CONCLUSION

We present the design and source code for a scalable and efficient pipeline for measuring complex EHR-derived outcomes using NLP-screened human abstraction. This implementation is adaptable to diverse research needs, and its modular pipeline represents a practical middle ground between custom software and commercial platforms.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7ea/12262768/fa99ce7e5e19/nihpp-2025.06.23.25330134v1-f0001.jpg
