Vu David M, Krystosik Amy R, Ndenga Bryson A, Mutuku Francis M, Ripp Kelsey, Liu Elizabeth, Bosire Carren M, Heath Claire, Chebii Philip, Maina Priscilla Watiri, Jembe Zainab, Malumbo Said Lipi, Amugongo Jael Sagina, Ronga Charles, Okuta Victoria, Mutai Noah, Makenzi Nzaro G, Litunda Kennedy A, Mukoko Dunstan, King Charles H, LaBeaud A Desiree
Department of Pediatrics, Division of Infectious Diseases, Stanford University School of Medicine, Stanford, California, United States of America.
Centre for Global Health Research, Kenya Medical Research Institute, Kisumu, Kenya.
PLOS Glob Public Health. 2023 Jul 26;3(7):e0001950. doi: 10.1371/journal.pgph.0001950. eCollection 2023.
Poor access to diagnostic testing in resource-limited settings restricts surveillance for emerging infections, such as dengue virus (DENV), to clinician suspicion based on history and exam observations alone. We investigated the ability of machine learning to detect DENV based solely on data available at the clinic visit. We extracted symptom and physical exam data from 6,208 pediatric febrile illness visits to Kenyan public health clinics from 2014 to 2019 and created a dataset with 113 clinical features. Malaria testing was available at the clinic site. DENV testing was performed afterwards. We randomly sampled 70% of the dataset to develop DENV and malaria prediction models using boosted logistic regression, decision trees and random forests, support vector machines, naïve Bayes, and neural networks with 10-fold cross-validation, tuned to maximize accuracy. The remaining 30% of the dataset was reserved to validate the models. Overall, 485 subjects (7.8%) had DENV, and 3,145 subjects (50.7%) had malaria; 220 subjects (3.5%) were co-infected with both DENV and malaria. In the validation dataset, clinician accuracy for diagnosis of malaria was high (82% accuracy, 85% sensitivity, 80% specificity). Accuracy of the models for predicting malaria diagnosis ranged from 53% to 69% (35-94% sensitivity, 11-80% specificity). In contrast, clinicians detected only 21 of 145 cases of DENV (80% accuracy, 14% sensitivity, 85% specificity). Of the six models, only logistic regression identified any DENV case (8 cases, 91% accuracy, 5.5% sensitivity, 98% specificity). Without diagnostic testing, interpretation of clinical findings by humans or machines cannot detect DENV at 8% prevalence. Access to point-of-care diagnostic tests must be prioritized to address global inequities in emerging infections surveillance.
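The arithmetic behind the reported DENV figures can be checked with a short sketch. It is an illustration only, not the authors' analysis code: the function names are hypothetical, and the only inputs are counts stated in the abstract (21 and 8 detected cases out of 145; 485 DENV cases among 6,208 visits). It also shows why models tuned to maximize accuracy can score near 92% while finding almost no cases: at roughly 8% prevalence, labelling every visit as DENV-negative already achieves that accuracy.

```python
def sensitivity(detected_cases: int, total_cases: int) -> float:
    """Fraction of true cases that were detected (true positive rate)."""
    return detected_cases / total_cases


def all_negative_accuracy(prevalence: float) -> float:
    """Accuracy of a trivial rule that labels every visit as negative."""
    return 1.0 - prevalence


# Counts taken from the abstract (validation dataset, 145 DENV cases).
clinician_sens = sensitivity(21, 145)
logreg_sens = sensitivity(8, 145)

# Overall prevalence from the abstract: 485 DENV cases in 6,208 visits.
# Assumption: the validation split has a similar prevalence.
denv_prevalence = 485 / 6208
baseline_acc = all_negative_accuracy(denv_prevalence)

print(f"Clinician sensitivity:           {clinician_sens:.1%}")
print(f"Logistic regression sensitivity: {logreg_sens:.1%}")
print(f"DENV prevalence:                 {denv_prevalence:.1%}")
print(f"All-negative baseline accuracy:  {baseline_acc:.1%}")
```

Under these assumptions, the 91% accuracy reported for logistic regression sits close to the all-negative baseline, which is consistent with its 5.5% sensitivity.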