Finding Potential Adverse Events in the Unstructured Text of Electronic Health Care Records: Development of the Shakespeare Method.

Author information

Bright Roselie A, Rankin Summer K, Dowdy Katherine, Blok Sergey V, Bright Susan J, Palmer Lee Anne M

Affiliations

US Food and Drug Administration, Silver Spring, MD, United States.

Booz Allen Hamilton, McLean, VA, United States.

Publication information

JMIRx Med. 2021 Aug 11;2(3):e27017. doi: 10.2196/27017.

Abstract

BACKGROUND

Big data tools provide opportunities to monitor adverse events (AEs; patient harm associated with medical care) in the unstructured text of electronic health care records (EHRs). Writers may explicitly state an apparent association between treatment and adverse outcome ("attributed") or simply state the treatment and outcome without an association ("unattributed"). Many methods for finding AEs in text rely on predefining possible AEs before searching for prespecified words and phrases, or on manual labeling (standardization) by investigators. We developed a method to identify possible AEs, even if unknown or unattributed, without any prespecification or standardization of notes. Our method was inspired by word-frequency analysis methods used to uncover the true authorship of disputed works credited to William Shakespeare. We chose two use cases, "transfusion" and "time-based." Transfusion was chosen because new transfusion AE types were becoming recognized during the study data period; therefore, we anticipated an opportunity to find unattributed potential AEs (PAEs) in the notes. With the time-based case, we wanted to simulate near real-time surveillance. We chose time periods in the hope of detecting PAEs from mid-2007 to mid-2008 due to contaminated heparin, a problem that was announced in early 2008. We hypothesized that contaminated heparin may have been widespread enough to manifest in EHRs through symptoms related to heparin AEs, independent of clinicians' documentation of attributed AEs.

OBJECTIVE

We aimed to develop a new method to identify attributed and unattributed PAEs using the unstructured text of EHRs.

METHODS

We used EHRs for adult critical care admissions at a major teaching hospital (2001-2012). For each use case, we formed a group of interest and a comparison group. We concatenated the text notes for each admission into one document sorted by date, and deleted duplicate sentences and lists. We identified words that were statistically significant in the group of interest versus the comparison group. Documents in the group of interest were filtered to those words, and topic modeling was then run on the filtered documents to produce topics. For each topic, the three documents with the maximum topic scores were manually reviewed to identify PAEs.
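
The first steps of this pipeline can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which statistical test, tokenizer, or topic model was used, so the chi-square test on document frequencies, the 3.84 cutoff (p < 0.05 at 1 degree of freedom), and every function name below are hypothetical. Topic modeling, list handling, and manual review are left out.

```python
import re
from collections import Counter


def build_document(notes):
    """Join one admission's notes in date order, keeping each sentence once.

    `notes` is a list of (iso_date, text) pairs (an assumed input shape).
    """
    seen, kept = set(), []
    for _, text in sorted(notes, key=lambda n: n[0]):
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            key = sentence.strip().lower()
            if key and key not in seen:
                seen.add(key)
                kept.append(sentence.strip())
    return " ".join(kept)


def tokenize(doc):
    """Lowercase alphabetic tokens only (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", doc.lower())


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def significant_words(interest_docs, comparison_docs, threshold=3.84):
    """Words found in a larger share of interest documents, above the cutoff."""
    if not interest_docs or not comparison_docs:
        raise ValueError("both groups must contain at least one document")
    df_i = Counter(w for d in interest_docs for w in set(tokenize(d)))
    df_c = Counter(w for d in comparison_docs for w in set(tokenize(d)))
    n_i, n_c = len(interest_docs), len(comparison_docs)
    words = set()
    for w, a in df_i.items():
        c = df_c.get(w, 0)
        # Table rows: interest vs comparison; columns: has word vs lacks word.
        if a / n_i > c / n_c and chi_square_2x2(a, n_i - a, c, n_c - c) > threshold:
            words.add(w)
    return words


def filter_document(doc, vocab):
    """Reduce a document to the tokens that are in the significant vocabulary."""
    return [w for w in tokenize(doc) if w in vocab]
```

The filtered token lists would then be the input to a topic model, and the highest-scoring documents per topic would go to manual review as described above. No multiple-comparison correction is applied here, which a real analysis over a large vocabulary would likely need.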

RESULTS

Topics centered around medical conditions that were unique to or more common in the group of interest, including PAEs. In each use case, most PAEs were unattributed in the notes. Among the transfusion PAEs was unattributed evidence of transfusion-associated cardiac overload and transfusion-related acute lung injury. Some of the PAEs from mid-2007 to mid-2008 were increased unattributed events consistent with AEs related to heparin contamination.

CONCLUSIONS

The Shakespeare method could be a useful supplement to AE reporting and surveillance of structured EHR data. Future improvements should include automation of the manual review process.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a82/10414364/04d281b9bbe6/xmed_v2i3e27017_fig1.jpg
