Nethi Arun Kumar, Karam Albert George, Alvarez Kristin S, Luque Amneris Esther, Nijhawan Ank E, Adhikari Emily, King Helen Lynne
PCCI, Dallas, TX.
Center of Innovation and Value at Parkland Health, Dallas, TX.
J Acquir Immune Defic Syndr. 2024 Sep 1;97(1):40-47. doi: 10.1097/QAI.0000000000003464.
BACKGROUND: Effective measures exist to prevent the spread of HIV. However, identifying patients who are candidates for these measures can be a challenge. A machine learning model that predicts risk for HIV may improve patient selection for proactive outreach.
SETTING: Using data from the electronic health record at Parkland Health, one of the largest public healthcare systems in the country, a machine learning model was created to predict incident HIV cases. The study cohort included all patients aged 16 years or older from 2015 to 2019 (n = 458,893).
METHODS: The data, with an incidence rate <0.08%, were randomly split 70:30 into training and validation sets, stratified by incident HIV. The model was built with the Light Gradient Boosting Machine algorithm, an ensemble classifier, and evaluated using the k-fold cross-validated (k = 5) area under the receiver operating characteristic curve (AUC).
RESULTS: The light gradient boosting machine showed the strongest predictive power for identifying good candidates for HIV PrEP. It achieved an AUC of 0.88 (95% confidence interval: 0.86 to 0.89) on the training set and 0.85 (95% confidence interval: 0.81 to 0.89) on the validation set, with a sensitivity of 77.8% and a specificity of 75.1%.
CONCLUSIONS: A gradient boosting model using electronic health record data can identify patients at risk of acquiring HIV and can be implemented in the clinical setting to build outreach for preventive interventions.