Department of Computer Science, Stanford University, Stanford, California, United States of America.
Department of Bioengineering, Stanford University, Stanford, California, United States of America.
PLoS One. 2022 Jul 28;17(7):e0271487. doi: 10.1371/journal.pone.0271487. eCollection 2022.
Malnutrition is common, morbid, and often correctable, but subject to missed and delayed diagnosis. Better screening and prediction could improve clinical, functional, and economic outcomes. This study aimed to assess the predictability of malnutrition from longitudinal patient records, and the external generalizability of a predictive model. Predictive models were developed and validated on statewide emergency department (ED) and hospital admission databases for California, Florida, and New York, including visits from October 1, 2015 to December 31, 2018. Visit features included patient demographics, diagnosis codes, and procedure categories. Models included long short-term memory (LSTM) recurrent neural networks trained on longitudinal trajectories, and gradient-boosted tree and logistic regression models trained on cross-sectional patient data. The dataset used for model training and internal validation (California and Florida) included 62,811 patient trajectories (266,951 visits). Test sets included 63,997 (California), 63,112 (Florida), and 62,472 (New York) trajectories, such that each cohort's composition was proportional to the prevalence of malnutrition in that state. Trajectories contained seven patient characteristics and up to 2,008 diagnosis categories. The areas under the receiver operating characteristic curve (AUROC) and the precision-recall curve (AUPRC) were used to characterize prediction of first malnutrition diagnoses in the test sets. Data analysis was performed from September 2020 to May 2021. Between 4.0% (New York) and 6.2% (California) of patients received malnutrition diagnoses. The longitudinal LSTM model produced the most accurate predictions of malnutrition, with comparable predictive performance in California (AUROC 0.854, AUPRC 0.258), Florida (AUROC 0.869, AUPRC 0.234), and New York (AUROC 0.869, AUPRC 0.190).
Deep learning models can reliably predict malnutrition from existing longitudinal patient records, with better predictive performance and lower data-collection requirements than existing instruments. This approach may facilitate early nutritional intervention via automated screening at the point of care.
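The two reported metrics can be illustrated with a minimal sketch. The code below is not the study's evaluation pipeline; the function names `auroc` and `average_precision` and the toy labels and scores are assumptions made for illustration. AUROC is computed through the rank-sum (Mann-Whitney) identity, and AUPRC is approximated by average precision, one common step-wise estimator. Because a non-informative classifier has an expected AUPRC close to the outcome prevalence, the reported AUPRC values are probably best read against the 4.0% to 6.2% diagnosis rates rather than against 0.5.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank-sum identity; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j to cover the whole group of scores tied with position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive case.

    Assumes scores are distinct; tied scores would need threshold grouping.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / hits


# Toy data (hypothetical): 1 = first malnutrition diagnosis, 0 = none.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print(round(auroc(toy_labels, toy_scores), 4))              # 0.75
print(round(average_precision(toy_labels, toy_scores), 4))  # 0.8333
print(sum(toy_labels) / len(toy_labels))                    # prevalence: 0.5
```

In this toy set the prevalence is 0.5, so an average precision of 0.83 is a modest lift over chance. At a prevalence near 0.04 to 0.06, as reported for the three states, an AUPRC of 0.19 to 0.26 would correspond to a several-fold lift over a non-informative baseline.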