Suppr超能文献

通过电子健康记录加速检查点抑制剂诱导的结肠炎病例的管理。

Accelerated curation of checkpoint inhibitor-induced colitis cases from electronic health records.

作者信息

Rahman Protiva, Ye Cheng, Mittendorf Kathleen F, Lenoue-Newton Michele, Micheel Christine, Wolber Jan, Osterman Travis, Fabbri Daniel

机构信息

Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Vanderbilt Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

出版信息

JAMIA Open. 2023 Apr 1;6(1):ooad017. doi: 10.1093/jamiaopen/ooad017. eCollection 2023 Apr.

Abstract

OBJECTIVE

Automatically identifying patients at risk of immune checkpoint inhibitor (ICI)-induced colitis allows physicians to improve patientcare. However, predictive models require training data curated from electronic health records (EHR). Our objective is to automatically identify notes documenting ICI-colitis cases to accelerate data curation.

MATERIALS AND METHODS

We present a data pipeline to automatically identify ICI-colitis from EHR notes, accelerating chart review. The pipeline relies on BERT, a state-of-the-art natural language processing (NLP) model. The first stage of the pipeline segments long notes using keywords identified through a logistic classifier and applies BERT to identify ICI-colitis notes. The next stage uses a second BERT model tuned to identify false positive notes and remove notes that were likely positive for mentioning colitis as a side-effect. The final stage further accelerates curation by highlighting the colitis-relevant portions of notes. Specifically, we use BERT's attention scores to find high-density regions describing colitis.

RESULTS

The overall pipeline identified colitis notes with 84% precision and reduced the curator note review load by 75%. The segment BERT classifier had a high recall of 0.98, which is crucial to identify the low incidence (<10%) of colitis.

DISCUSSION

Curation from EHR notes is a burdensome task, especially when the curation topic is complicated. Methods described in this work are not only useful for ICI colitis but can also be adapted for other domains.

CONCLUSION

Our extraction pipeline reduces manual note review load and makes EHR data more accessible for research.

摘要

目的

自动识别有免疫检查点抑制剂(ICI)诱导性结肠炎风险的患者,有助于医生改善患者护理。然而,预测模型需要从电子健康记录(EHR)中整理出的训练数据。我们的目标是自动识别记录ICI结肠炎病例的笔记,以加速数据整理。

材料与方法

我们提出了一种数据管道,用于从EHR笔记中自动识别ICI结肠炎,从而加速病历审查。该管道依赖于BERT,这是一种先进的自然语言处理(NLP)模型。管道的第一阶段使用通过逻辑分类器识别的关键词对长笔记进行分段,并应用BERT来识别ICI结肠炎笔记。下一阶段使用第二个经过调整的BERT模型来识别假阳性笔记,并删除那些可能因提及结肠炎作为副作用而呈阳性的笔记。最后阶段通过突出显示笔记中与结肠炎相关的部分,进一步加速整理过程。具体而言,我们使用BERT的注意力分数来找到描述结肠炎的高密度区域。

结果

整个管道识别结肠炎笔记的精度为84%,并将整理人员的笔记审查工作量减少了75%。分段BERT分类器的召回率高达0.98,这对于识别低发病率(<10%)的结肠炎至关重要。

讨论

从EHR笔记中进行整理是一项繁重的任务,尤其是当整理主题复杂时。这项工作中描述的方法不仅对ICI结肠炎有用,也可适用于其他领域。

结论

我们的提取管道减少了人工笔记审查工作量,并使EHR数据更便于用于研究。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/699f/10066800/f0ae35ad5d26/ooad017f1.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验