Stritch School of Medicine, Loyola University Chicago, Maywood, IL, USA.
Department of Computer Science, Loyola University Chicago, Chicago, IL, USA.
Alcohol. 2020 May;84:49-55. doi: 10.1016/j.alcohol.2019.09.008. Epub 2019 Sep 28.
Current modes of identifying alcohol misuse in hospitalized patients rely on self-report questionnaires and diagnostic codes that have limitations, including low sensitivity. Information in the clinical notes of the electronic health record (EHR) may further augment the identification of alcohol misuse. Natural language processing (NLP) with supervised machine learning has been successful at analyzing clinical notes and identifying cases of alcohol misuse in trauma patients.
An alcohol misuse NLP classifier, previously developed on trauma patients who completed the Alcohol Use Disorders Identification Test, was validated in a cohort of 1000 hospitalized patients at a large, tertiary health system between January 1, 2007 and September 1, 2017. The clinical notes were processed using the clinical Text Analysis and Knowledge Extraction System. The National Institute on Alcohol Abuse and Alcoholism (NIAAA) guidelines for alcohol misuse were used during annotation of the medical records in our validation dataset.
The alcohol misuse classifier had an area under the receiver operating characteristic curve of 0.91 (95% CI 0.90-0.93) in the cohort of hospitalized patients. The sensitivity, specificity, positive predictive value, and negative predictive value were 0.88 (95% CI 0.85-0.90), 0.78 (95% CI 0.74-0.82), 0.85 (95% CI 0.82-0.87), and 0.82 (95% CI 0.78-0.86), respectively. The Hosmer-Lemeshow test (p = 0.13) indicated good model fit. Additionally, there was a dose-dependent response in alcohol consumption behaviors across increasing strata of predicted probabilities for alcohol misuse.
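As a rough illustration of how the test characteristics reported above are derived, the following minimal sketch computes sensitivity, specificity, positive predictive value, and negative predictive value from a 2x2 confusion matrix. The counts are hypothetical and are not the study's data. The abstract does not state how its confidence intervals were calculated, so the Wilson score interval used here is an assumption, and the function names `wilson_ci` and `classifier_metrics` are illustrative only.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 gives an approximate 95% interval. The choice of the Wilson
    method is an assumption; the abstract does not name its CI method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def classifier_metrics(tp, fp, fn, tn):
    """Return each test characteristic as (estimate, (ci_low, ci_high))."""
    pairs = {
        "sensitivity": (tp, tp + fn),  # true positives among all true cases
        "specificity": (tn, tn + fp),  # true negatives among all non-cases
        "ppv": (tp, tp + fp),          # true positives among predicted positives
        "npv": (tn, tn + fn),          # true negatives among predicted negatives
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in pairs.items()}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the study).
    results = classifier_metrics(tp=90, fp=20, fn=10, tn=80)
    for name, (estimate, (low, high)) in results.items():
        print(f"{name}: {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Positive and negative predictive values depend on the prevalence of alcohol misuse in the cohort, so values computed this way may not carry over to populations with a different prevalence.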
The alcohol misuse NLP classifier had good discrimination and test characteristics in hospitalized patients. An approach using the clinical notes with NLP and supervised machine learning may better identify alcohol misuse cases than conventional methods solely relying on billing diagnostic codes.