Kimaina Allan, Dick Jonathan, DeLong Allison, Chrysanthopoulou Stavroula A, Kantor Rami, Hogan Joseph W
Moi University, Eldoret, Kenya.
Brown University, Providence, RI, USA.
Stat Commun Infect Dis. 2020 Nov 12;12(Suppl1):20190017. doi: 10.1515/scid-2019-0017. eCollection 2020 Sep 1.
Human immunodeficiency virus (HIV) viral failure occurs when antiretroviral therapy fails to suppress a person's viral load below 1,000 copies of viral ribonucleic acid per milliliter and keep it there. For people newly diagnosed with HIV and living in a setting where healthcare resources are limited, such as a low- or middle-income country, the World Health Organization recommends viral load monitoring six months after initiation of antiretroviral treatment and yearly thereafter. Deviations from this schedule are made when viral failure occurs or at the discretion of the clinician. Failure to detect viral failure in a timely fashion can delay the administration of essential interventions. Clinical prediction models based on information available in the patient medical record are increasingly being developed and deployed for decision support in clinical medicine and public health. This raises the possibility that prediction models can be used to detect the potential for viral failure in advance of viral load measurements, particularly when those measurements occur infrequently.
Our goal is to use electronic health record data from a large HIV care program in Kenya to characterize and compare the predictive accuracy of several statistical machine learning methods for predicting viral failure at the first and second measurements following initiation of antiretroviral therapy. Predictive accuracy is measured in terms of sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve.
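The three accuracy measures named above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's code: the function names, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for the example.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = viral failure."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen failure case receives a
    higher score than a randomly chosen non-failure case (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = viral failure, scores = predicted risk.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarizes discrimination across all thresholds.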
We trained and cross-validated 10 statistical machine learning models and algorithms on data from over 10,000 patients in the Academic Model Providing Access to Healthcare (AMPATH) care program in western Kenya. These included parametric, non-parametric, ensemble, and Bayesian methods. The input variables included 50 items from the clinical record, hand-picked in consultation with expert clinicians. Predictive accuracy measures were calculated using 10-fold cross-validation.
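As a rough illustration of 10-fold cross-validation, the sketch below splits the data into 10 folds and scores each patient with a model fitted on the other nine. It is not the study's pipeline: `k_fold_indices`, `out_of_fold_scores`, the baseline "model" (which simply predicts the training-fold failure rate), and the toy data are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def out_of_fold_scores(X, y, fit, predict_score, k=10, seed=0):
    """Score every sample with a model that never saw it during fitting."""
    scores = [None] * len(y)
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            scores[i] = predict_score(model, X[i])
    return scores


# Toy data (hypothetical): 50 patients, 20% with viral failure.
X = [[i] for i in range(50)]
y = [1 if i % 5 == 0 else 0 for i in range(50)]


def fit_baseline(X_train, y_train):
    # Placeholder "model": the failure rate observed in the training folds.
    return sum(y_train) / len(y_train)


def predict_baseline(model, x):
    # Ignores the features and returns the stored failure rate.
    return model


scores = out_of_fold_scores(X, y, fit_baseline, predict_baseline)
```

Any real learner can be substituted for the baseline by supplying its own `fit` and `predict_score` callables; the resulting out-of-fold scores are what accuracy measures would then be computed on.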
The viral failure rate in this patient cohort is about 20% at both the first and second measurements. Ensemble techniques generally outperformed other methods. For predicting viral failure at the first follow-up measurement, specificity was over 90% for these methods, but sensitivity was typically in the 50-60% range. Predictive accuracy was greater for the second follow-up measurement, with sensitivities over 80%. Super Learner, gradient boosting, and Bayesian additive regression trees consistently outperformed other methods. For a viral failure rate of 20%, the positive predictive value for the top-performing methods is between 75% and 85%, while the negative predictive value is over 95%.
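Positive and negative predictive values follow from sensitivity, specificity, and the failure rate through Bayes' rule. The sketch below shows the arithmetic; the function name is hypothetical, and the inputs (sensitivity 0.80, specificity 0.95) are illustrative values chosen to fall within the ranges described above, not figures reported by the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    PPV = P(failure | predicted failure)
    NPV = P(no failure | predicted no failure)
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed illustrative inputs with a 20% viral failure rate.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.95, prevalence=0.20)
# ppv is about 0.80 and npv is about 0.95
```

Because both values depend on the failure rate, the same model applied in a population with a lower failure rate would have a lower positive predictive value.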
Evidence from this study suggests that machine learning techniques have the potential to identify patients at risk for viral failure before their scheduled measurements. Ultimately, prognostic virologic assessment can help guide earlier targeted interventions, such as enhanced drug resistance monitoring, rigorous adherence counseling, or appropriate switching to next-line therapy. External validation studies should be used to confirm the results found here.