Division of Cardiology, Department of Medicine, New York University Grossman School of Medicine, New York, New York, USA.
Palliative and Advanced Illness Research (PAIR) Center, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA.
J Am Med Inform Assoc. 2021 Dec 28;29(1):109-119. doi: 10.1093/jamia/ocab248.
Frailty is a prevalent risk factor for adverse outcomes among patients with chronic lung disease. However, identifying frail patients who may benefit from interventions is challenging using standard data sources. We therefore sought to identify phrases in clinical notes in the electronic health record (EHR) that describe actionable frailty syndromes.
We used an active learning strategy to select notes from the EHR and annotated each sentence for 4 actionable aspects of frailty: respiratory impairment, musculoskeletal problems, fall risk, and nutritional deficiencies. We compared the performance of regression, tree-based, and neural network models to predict the labels for each sentence. We evaluated performance with the positive predictive value (PPV) and the scaled Brier score (SBS), where 1 is perfect and 0 is uninformative.
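As an illustration of the SBS scale described above, the following is a minimal sketch that assumes the common definition of the scaled Brier score: 1 minus the ratio of the model's Brier score to that of a reference model that always predicts the observed prevalence. The function names are hypothetical, and the authors' exact computation may differ.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """Scaled Brier score under the assumed definition 1 - BS / BS_ref.

    BS_ref is the Brier score of a reference model that predicts the
    observed prevalence for every sentence, so 1 means perfect
    predictions and 0 means no better than the prevalence alone.
    """
    prevalence = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [prevalence] * len(y_true))
    if reference == 0:
        raise ValueError("labels are all one class; the reference score is zero")
    return 1.0 - brier_score(y_true, y_prob) / reference


# Toy labels (1 = sentence mentions the frailty aspect); not data from the study.
labels = [1, 0, 1, 0]
perfect = scaled_brier_score(labels, [1.0, 0.0, 1.0, 0.0])
uninformative = scaled_brier_score(labels, [0.5, 0.5, 0.5, 0.5])
```

With these toy labels, perfect predictions give an SBS of 1 and constant predictions at the prevalence give an SBS of 0, matching the two anchor points named in the text.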
We manually annotated 155 952 sentences from 326 patients. Elastic net regression had the best performance across all 4 frailty aspects (SBS 0.52, 95% confidence interval [CI] 0.49-0.54), followed by random forests (SBS 0.49, 95% CI 0.47-0.51) and multi-task neural networks (SBS 0.39, 95% CI 0.37-0.42). For the elastic net model, the PPV for identifying the presence of respiratory impairment was 54.8% (95% CI 53.3%-56.6%) at a sensitivity of 80%.
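Reporting PPV at a fixed sensitivity means choosing the decision threshold first and reading off the PPV at that threshold. The sketch below shows one way this could be done, assuming the threshold is the highest score cutoff that reaches the target sensitivity. The function name and selection rule are illustrative assumptions, not the authors' documented procedure.

```python
def ppv_at_sensitivity(y_true, y_score, target_sensitivity=0.80):
    """Return (ppv, threshold) at the highest threshold reaching the target sensitivity.

    A sentence is predicted positive when its score is >= threshold.
    """
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("no positive labels; sensitivity is undefined")
    # Walk candidate thresholds from strictest to most lenient.
    for threshold in sorted(set(y_score), reverse=True):
        flagged = [y for y, s in zip(y_true, y_score) if s >= threshold]
        true_positives = sum(flagged)
        if true_positives / total_positives >= target_sensitivity:
            return true_positives / len(flagged), threshold
    raise ValueError("target sensitivity must be at most 1")


# Toy scores and labels; not data from the study.
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
ppv, threshold = ppv_at_sensitivity(labels, scores, target_sensitivity=0.80)
```

In this toy example the cutoff of 0.4 captures 4 of the 5 positives (80% sensitivity) while flagging 6 sentences, for a PPV of 4/6.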
Classification models using EHR notes can effectively identify actionable aspects of frailty among patients living with chronic lung disease. Regression performed better than random forest and neural network models.
NLP-based models offer promising support to population health management programs that seek to identify and refer community-dwelling patients with frailty for evidence-based interventions.