Columbia University School of Nursing, New York, New York, United States of America.
Data Science Institute, Columbia University, New York, New York, United States of America.
PLoS One. 2022 Jul 11;17(7):e0270220. doi: 10.1371/journal.pone.0270220. eCollection 2022.
The prevalence of patients who are Incapacitated with No Evident Advance Directives or Surrogates (INEADS) remains unknown because such data are not routinely captured in structured electronic health records. This study sought to develop and validate a natural language processing (NLP) algorithm to identify information related to being INEADS from clinical notes. We used a publicly available dataset of critical care patients from 2001 through 2012 at a United States academic medical center, which contained 418,393 relevant clinical notes for 23,904 adult admissions. We developed 17 subcategories indicating reduced or elevated potential for being INEADS, and created a vocabulary of terms and expressions within each. We used an NLP application to create a language model and expand these vocabularies. The NLP algorithm was validated against gold standard manual review of 300 notes and showed good performance overall (F-score = 0.83). More than 80% of admissions had notes containing information in at least one subcategory. Thirty percent of admissions (n = 7,134) contained at least one of five social subcategories indicating elevated potential for being INEADS, and <1% (n = 81) contained at least four, which we classified as high likelihood of being INEADS. Among these 81 admissions, 8 had no subcategory indicating reduced likelihood of being INEADS and appeared to meet the definition of INEADS following manual review. Among the remaining 73 admissions, which had at least one subcategory indicating reduced likelihood of being INEADS, manual review of a 10% sample showed that most did not appear to be INEADS. Compared with the full cohort, the high likelihood group was significantly more likely to die during hospitalization and within four years, to have Medicaid, to have an emergency admission, and to be male. This investigation demonstrates potential for NLP to identify INEADS patients, and may inform interventions to enhance advance care planning for patients who lack social support.
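The triage rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name its subcategories, so the labels below (`social_1` ... `social_5`, `reduced_1`, `reduced_2`), the function `triage`, and its return strings are all hypothetical placeholders. The sketch only assumes what the abstract states: five social subcategories signal elevated potential, at least four of them mark an admission as high likelihood, and the presence of any subcategory indicating reduced likelihood separates the 73 admissions from the 8.

```python
from typing import Iterable, Set

# Hypothetical labels: the abstract does not name the subcategories.
SOCIAL_ELEVATED: Set[str] = {
    "social_1", "social_2", "social_3", "social_4", "social_5",
}
# Hypothetical stand-ins for subcategories indicating reduced likelihood.
REDUCED: Set[str] = {"reduced_1", "reduced_2"}


def triage(found: Iterable[str]) -> str:
    """Classify one admission from the subcategories found in its notes."""
    found_set = set(found)
    n_social = len(found_set & SOCIAL_ELEVATED)
    has_reduced = bool(found_set & REDUCED)

    if n_social >= 4:
        # High likelihood group (n = 81 in the abstract), split by whether
        # any reducing subcategory is also present (n = 73 vs. n = 8).
        return "high_with_reducing" if has_reduced else "high_no_reducing"
    if n_social >= 1:
        # At least one social subcategory (n = 7,134 in the abstract).
        return "elevated"
    return "no_social_flag"


# Arithmetic check of the reported proportions, using the abstract's counts.
TOTAL_ADMISSIONS = 23_904
share_elevated = 7_134 / TOTAL_ADMISSIONS   # about 0.298, reported as 30%
share_high = 81 / TOTAL_ADMISSIONS          # about 0.0034, reported as <1%
```

For example, `triage(["social_1", "social_2", "social_3", "social_5"])` falls in the high likelihood group with no reducing subcategory, the group the authors sent to manual review and found to be consistent with the INEADS definition.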