Sun Yao, Kaur Ravneet, Gupta Shubham, Paul Rahul, Das Ritu, Cho Su Jin, Anand Saket, Boutilier Justin J, Saria Suchi, Palma Jonathan, Saluja Satish, McAdams Ryan M, Kaur Avneet, Yadav Gautam, Singh Harpreet
Division of Neonatology, Department of Pediatrics, University of California San Francisco, San Francisco, California, USA.
Research and Development, Child Health Imprints (CHIL) Pte. Ltd., Singapore.
JAMIA Open. 2021 Mar 25;4(1):ooab004. doi: 10.1093/jamiaopen/ooab004. eCollection 2021 Jan.
The objectives of this study are to construct the high definition phenotype (HDP), a novel time-series data structure composed of both primary and derived parameters, using heterogeneous clinical sources and to determine whether different predictive models can utilize the HDP in the neonatal intensive care unit (NICU) to improve neonatal mortality prediction in clinical settings.
A total of 49 primary data parameters were collected from July 2018 to May 2020 from eight level-III NICUs. Of 1546 patients, 757 had sufficient fixed, intermittent, and continuous data to create HDPs. Two predictive models utilizing the HDP, a logistic regression model (LRM) and a deep learning long short-term memory (LSTM) model, were constructed to predict neonatal mortality at multiple time points during the patient hospitalization. The results were compared with previous illness severity scores, including SNAPPE, SNAPPE-II, CRIB, and CRIB-II.
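To make the idea of a per-minute structure combining fixed, intermittent, and continuous data more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function `build_hdp`, the parameter names (`gestational_age_weeks`, `lactate`, `heart_rate`), the forward-fill rule for intermittent values, and the 3-minute rolling mean used as a derived parameter are all hypothetical choices made for this example.

```python
from typing import Dict, List, Optional


def build_hdp(
    n_minutes: int,
    fixed: Dict[str, float],
    intermittent: Dict[str, Dict[int, float]],
    continuous: Dict[str, List[float]],
) -> List[Dict[str, Optional[float]]]:
    """Build one row per minute of stay (hypothetical layout, for illustration).

    fixed:        values that do not change during the stay (repeated per row)
    intermittent: sparse observations keyed by minute, carried forward
    continuous:   one value per minute, e.g. from a bedside monitor
    """
    rows: List[Dict[str, Optional[float]]] = []
    # Last observed value of each intermittent parameter (None until first seen).
    last_seen: Dict[str, Optional[float]] = {name: None for name in intermittent}

    for minute in range(n_minutes):
        row: Dict[str, Optional[float]] = dict(fixed)

        # Carry the most recent intermittent observation forward (an assumption).
        for name, observations in intermittent.items():
            if minute in observations:
                last_seen[name] = observations[minute]
            row[name] = last_seen[name]

        for name, series in continuous.items():
            row[name] = series[minute]
            # Derived parameter (assumed): mean over the last up-to-3 minutes.
            window = series[max(0, minute - 2): minute + 1]
            row[name + "_mean_3"] = sum(window) / len(window)

        rows.append(row)
    return rows


# Toy data only; none of these values come from the study.
hdp = build_hdp(
    n_minutes=4,
    fixed={"gestational_age_weeks": 30},
    intermittent={"lactate": {1: 2.5}},
    continuous={"heart_rate": [150, 160, 170, 140]},
)
```

Each row of `hdp` then holds every parameter for one minute, which is the shape both a per-time-point classifier and a sequence model can consume.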
An HDP matrix covering 12 221 536 minutes of patient stay in the NICU was constructed. Both the LRM and the LSTM model outperformed existing neonatal illness severity scores in predicting mortality, as measured by the area under the receiver operating characteristic curve (AUC). An ablation study showed that using continuous parameters alone yields an AUC above 80% for both LRM and LSTM, whereas combining fixed, intermittent, and continuous parameters in the HDP yields an AUC above 85%. The probability-of-mortality predictive score achieved recall and precision of 0.88 and 0.77 for the LRM and 0.97 and 0.85 for the LSTM, respectively.
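For readers less familiar with the metrics reported above, the following sketch shows how AUC, precision, and recall can be computed from predicted probabilities and true outcomes. The labels, scores, and the 0.5 decision threshold are made-up toy values chosen for this example; they are not study data, and the abstract does not state which threshold the authors used.

```python
from typing import List, Tuple


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall(
    labels: List[int], scores: List[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall when scores >= threshold are predicted positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example: 1 = died, 0 = survived (illustrative values only).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1]

toy_auc = auc(labels, scores)
toy_precision, toy_recall = precision_recall(labels, scores, threshold=0.5)
```

AUC is threshold-free, while precision and recall depend on the chosen cutoff, which is why the two kinds of figures are reported separately.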
The HDP data structure supports multiple analytic techniques, including the statistical LRM approach and the machine learning LSTM approach used in this study. LRM and LSTM predictive models of neonatal mortality utilizing the HDP performed better than existing neonatal illness severity scores. Further research is necessary to create HDP-based clinical decision tools to detect the early onset of neonatal morbidities.