Ming Damien K, Tuan Nguyen M, Hernandez Bernard, Sangkaew Sorawat, Vuong Nguyen L, Chanh Ho Q, Chau Nguyen V V, Simmons Cameron P, Wills Bridget, Georgiou Pantelis, Holmes Alison H, Yacoub Sophie
Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom.
Children's Hospital 1, Ho Chi Minh City, Vietnam.
Front Digit Health. 2022 Mar 14;4:849641. doi: 10.3389/fdgth.2022.849641. eCollection 2022.
Symptomatic dengue infection can result in a life-threatening shock syndrome, and timely diagnosis is essential. Point-of-care tests for non-structural protein 1 and IgM are widely used, but their performance can be limited. We developed a supervised machine learning model to predict whether patients with acute febrile illnesses had a diagnosis of dengue or other febrile illnesses (OFI). We also examined the impact of seasonality on model performance over time.
We analysed data from a prospective observational clinical study in Vietnam. Enrolled patients presented with an acute febrile illness of <72 h duration. A gradient boosting model (XGBoost) was used to predict final diagnosis using age, sex, haematocrit, and platelet, white cell, and lymphocyte counts collected on enrolment. Data were randomly split 80/20 into a training set and a hold-out set, with the latter not used in model development. Cross-validation and hold-out set testing were used, and performance over time was evaluated through a rolling window approach.
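The split and rolling-window steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library and synthetic records; it is not the study's code. The function names, the 30-day window, the 10-day step, and the seed are all assumptions, since the abstract does not state the window length or step used.

```python
import random
from datetime import date, timedelta


def split_train_holdout(records, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of `records` and return (training, hold-out) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def rolling_windows(records, start, end, window_days, step_days):
    """Yield (window_start, records enrolled in that window) for each full window."""
    window = timedelta(days=window_days)
    current = start
    while current + window <= end + timedelta(days=1):
        stop = current + window
        yield current, [r for r in records if current <= r["enrolled"] < stop]
        current += timedelta(days=step_days)


# Synthetic stand-in data: one patient per day for 100 days (not study data).
first_day = date(2010, 10, 16)
records = [
    {"id": i, "enrolled": first_day + timedelta(days=i), "dengue": i % 4 == 0}
    for i in range(100)
]

train, holdout = split_train_holdout(records)
last_day = first_day + timedelta(days=99)
windows = list(
    rolling_windows(records, first_day, last_day, window_days=30, step_days=10)
)

# Per-window summary; a real evaluation would score the model in each window.
for window_start, cases in windows:
    prevalence = sum(c["dengue"] for c in cases) / len(cases)
    print(window_start, len(cases), round(prevalence, 2))
```

In a real analysis, each window would be scored with the trained model (for example AUROC or NPV per window) rather than summarised by prevalence alone.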
We included 8,100 patients recruited between 16th October 2010 and 10th December 2014. In total, 2,240 (27.7%) patients were diagnosed with dengue infection. The optimised model from training data had an overall median area under the receiver operating characteristic curve (AUROC) of 0.86 (interquartile range 0.84-0.86), specificity of 0.92, sensitivity of 0.56, positive predictive value (PPV) of 0.73, negative predictive value (NPV) of 0.84, and Brier score of 0.13 in predicting the final diagnosis, with similar performance in hold-out set testing (AUROC of 0.86). Model performance varied significantly over time as a function of seasonality and other factors. Incorporating a dynamic threshold that continuously learns from recent cases resulted in more consistent performance throughout the year (NPV >90%).
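One way a threshold could "learn from recent cases" is sketched below. The abstract does not describe the mechanism, so this is an assumed design rather than the authors' method: it keeps a buffer of recent (predicted probability, confirmed diagnosis) pairs and picks the largest decision threshold whose NPV on that buffer meets a target. The names `npv_at` and `pick_threshold`, the buffer size, the candidate grid, and the example cases are all hypothetical.

```python
from collections import deque


def npv_at(threshold, cases):
    """NPV among cases predicted negative (probability < threshold).

    Returns None when no case falls below the threshold.
    """
    predicted_negative = [is_dengue for prob, is_dengue in cases if prob < threshold]
    if not predicted_negative:
        return None
    return predicted_negative.count(False) / len(predicted_negative)


def pick_threshold(cases, target_npv=0.90):
    """Largest threshold on a 0.01 grid whose NPV on `cases` meets the target."""
    best = None
    for i in range(1, 100):
        threshold = i / 100
        npv = npv_at(threshold, cases)
        if npv is not None and npv >= target_npv:
            best = threshold  # grid is ascending, so the last hit is the largest
    return best


# Hypothetical recent cases: (model probability of dengue, confirmed dengue?).
recent = deque(maxlen=200)  # the oldest cases drop out as new ones arrive
recent.extend([
    (0.05, False), (0.10, False), (0.15, False), (0.20, False), (0.25, True),
    (0.30, False), (0.40, True), (0.60, True), (0.80, True), (0.90, True),
])
low_season_threshold = pick_threshold(recent)

# More dengue cases with low predicted probability arrive, as might happen
# when prevalence rises; the threshold is re-estimated from the buffer.
recent.extend([(0.12, True), (0.18, True)])
high_season_threshold = pick_threshold(recent)

print(low_season_threshold, high_season_threshold)
```

In this toy example the threshold drops once low-probability dengue cases enter the buffer, so fewer patients are ruled out while the target NPV is maintained on recent data.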
Supervised machine learning models are able to discriminate between dengue and OFI diagnoses in patients presenting with an early undifferentiated febrile illness. These models could be of clinical utility in supporting healthcare decision-making and could provide passive surveillance across dengue-endemic regions. The effects of seasonality and changing disease prevalence must, however, be taken into account; this is particularly important given the unpredictable effects of human-induced climate change and their impact on health.
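The dependence of predictive values on prevalence follows directly from Bayes' rule, and a short calculation makes it concrete. The sketch below assumes sensitivity and specificity stay fixed as prevalence changes, which is a simplification (the abstract reports that model performance itself varied over time). The function name and the 10% and 50% prevalence values are illustrative assumptions; only 0.56, 0.92, and 27.7% come from the abstract.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported sensitivity and specificity, evaluated at the study prevalence
# and at two hypothetical prevalences for comparison.
for prevalence in (0.10, 0.277, 0.50):
    ppv, npv = predictive_values(0.56, 0.92, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

At the study prevalence of 27.7% this gives a PPV of about 0.73 and an NPV of about 0.85, close to the reported 0.73 and 0.84. At a hypothetical 50% prevalence the NPV falls to roughly 0.68, which illustrates why a fixed threshold may not hold a target NPV across seasons.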