Sato Taisuke, Grussing Emily D, Patel Ruchi, Ridgway Jessica, Suzuki Joji, Sweigart Benjamin, Miller Robert, Wurcel Alysse G
Tufts Medical Center, Tupper Building 4F, 800 Washington St, Boston, MA, United States, 1 617 636 4605.
University of Chicago School of Medicine, Chicago, IL, United States.
JMIR AI. 2025 Jul 18;4:e63147. doi: 10.2196/63147.
People who use drugs (PWUD) are at heightened risk of severe injection-related infections. Current research relies on billing codes to identify PWUD, a methodology with suboptimal accuracy that may underestimate the economic, racial, and ethnic diversity of hospitalized PWUD.
The goal of this study is to examine whether natural language processing (NLP) improves identification of PWUD in electronic medical records, with a specific focus on identifying populations who may previously have been missed, including people who have low income or those from racially and ethnically minoritized populations.
Health informatics specialists assisted in querying a cohort of likely PWUD hospital admissions at Tufts Medical Center between 2020 and 2022 using the following criteria: (1) ICD-10 codes indicative of drug use, (2) positive drug toxicology results, (3) prescriptions for medications for opioid use disorder, and (4) NLP-detected presence of "token" keywords in the electronic medical records likely indicative of the patient being a PWUD. Hospital admissions were split into two groups: highly documented (all four criteria present) and minimally documented (NLP-only). These groups were examined to assess the impact of race, ethnicity, and social vulnerability index. With chart review as the "gold standard," the positive predictive value was calculated.
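The grouping and the positive predictive value described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the `Admission` fields, the function names, and the group labels are hypothetical, and the counts in the usage note are made up for illustration rather than taken from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Admission:
    """One hospital admission with the four cohort-entry criteria (hypothetical fields)."""
    icd10_drug_use: bool        # (1) ICD-10 code indicative of drug use
    positive_toxicology: bool   # (2) positive drug toxicology result
    moud_prescription: bool     # (3) medication for opioid use disorder prescribed
    nlp_token: bool             # (4) NLP-detected "token" keyword in the record


def entered_cohort(a: Admission) -> bool:
    """An admission enters the cohort if at least one criterion is met."""
    return any((a.icd10_drug_use, a.positive_toxicology,
                a.moud_prescription, a.nlp_token))


def documentation_group(a: Admission) -> Optional[str]:
    """Return the documentation group, or None if the admission is in neither."""
    structured = (a.icd10_drug_use, a.positive_toxicology, a.moud_prescription)
    if all(structured) and a.nlp_token:
        return "highly_documented"      # all four criteria present
    if a.nlp_token and not any(structured):
        return "minimally_documented"   # NLP token only
    return None


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP), with chart review deciding which flags are true."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when nothing was flagged")
    return true_positives / flagged
```

As a usage example with made-up counts: if chart review confirmed 27 of 50 flagged admissions, `positive_predictive_value(27, 23)` would give 0.54.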
The cohort included 4548 hospital admissions, with broad heterogeneity in how people entered the cohort and subcohorts; a total of 288 hospital admissions entered the cohort through NLP token presence alone. NLP demonstrated a 54% positive predictive value, outperforming biomarkers, prescriptions for medications for opioid use disorder, and ICD codes in identifying hospitalizations of PWUD. Additionally, NLP significantly enhanced these methods when integrated into the identification algorithm. The study also found that people from racially and ethnically minoritized communities and those with lower social vulnerability index were significantly more likely to have lower rates of PWUD-related documentation.
NLP proved effective in identifying hospitalizations of PWUD, surpassing traditional methods. While further refinement is needed, NLP shows promising potential in minimizing health care disparities.