Cui Yifan, Moyo Sikhulile, Pretorius Holme Molly, Hurwitz Kathleen E, Choga Wonderful, Bennett Kara, Chakalisa Unoda, San James Emmanuel, Manyake Kutlo, Kgathi Coulson, Diphoko Ame, Gaseitsiwe Simani, Gaolathe Tendani, Essex M, Tchetgen Tchetgen Eric, Makhema Joseph M, Lockman Shahin
Center for Data Science, Zhejiang University, Hangzhou, Zhejiang, China.
Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
AIDS. 2025 Mar 1;39(3):290-297. doi: 10.1097/QAD.0000000000004055. Epub 2024 Nov 4.
To identify predictors of HIV acquisition in Botswana.
We applied machine learning approaches to identify HIV risk predictors using existing data from a large, well-characterized HIV incidence cohort.
We applied machine learning (randomForestSRC) to analyze data from a large population-based HIV incidence cohort enrolled in a cluster-randomized HIV prevention trial in 30 communities across Botswana. We sought to identify the most important risk factors for HIV acquisition, starting with 110 potential predictors.
During a median 29-month follow-up of 8551 HIV-negative adults, 147 (1.7%) acquired HIV. Our machine learning analysis found that, for females, the most important variables for predicting HIV acquisition were use of injectable hormonal contraception, frequency of sex in the prior 3 months with the most recent partner, and residing in a community with HIV prevalence of 29% or higher. The small proportion (0.3%) of females who had all three risk factors had an estimated 34% probability of acquiring HIV during 29 months of follow-up (approximate annual incidence of 14%). For males, a non-long-term relationship with the most recent partner and community HIV prevalence of 34% or higher were the most important HIV risk predictors. The 6% of males who had both risk factors had a 5.1% probability of acquiring HIV during the follow-up period (approximate annual incidence of 2.1%).
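The abstract does not say how the cumulative probabilities over follow-up were converted to approximate annual incidence. The minimal sketch below assumes simple linear scaling (cumulative risk multiplied by 12/29), which reproduces the reported figures of about 14% and 2.1%. A constant-hazard conversion is shown alongside for comparison only. The function names are hypothetical and are not taken from the study.

```python
# Assumption: the reported annual incidence is cumulative risk scaled
# linearly from the 29-month median follow-up to 12 months.
# Function names are illustrative, not from the paper.

FOLLOW_UP_MONTHS = 29


def linear_annualized(cumulative_risk: float, months: float) -> float:
    """Scale a cumulative risk linearly to a 12-month period."""
    return cumulative_risk * 12.0 / months


def constant_hazard_annualized(cumulative_risk: float, months: float) -> float:
    """Annual risk implied by a constant hazard over the follow-up period."""
    return 1.0 - (1.0 - cumulative_risk) ** (12.0 / months)


# Overall proportion acquiring HIV: 147 of 8551 participants.
overall_proportion = 147 / 8551

# Females with all three risk factors: 34% cumulative probability.
female_linear = linear_annualized(0.34, FOLLOW_UP_MONTHS)
female_hazard = constant_hazard_annualized(0.34, FOLLOW_UP_MONTHS)

# Males with both risk factors: 5.1% cumulative probability.
male_linear = linear_annualized(0.051, FOLLOW_UP_MONTHS)
male_hazard = constant_hazard_annualized(0.051, FOLLOW_UP_MONTHS)

if __name__ == "__main__":
    print(f"overall proportion: {overall_proportion:.1%}")
    print(f"females, linear: {female_linear:.1%}, constant hazard: {female_hazard:.1%}")
    print(f"males, linear: {male_linear:.1%}, constant hazard: {male_hazard:.1%}")
```

For the low-risk male subgroup the two conversions nearly coincide. For the high-risk female subgroup the constant-hazard figure is somewhat higher than the linear one, so the choice of conversion matters most when cumulative risk is large.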
Machine learning approaches allowed us to analyze a large number of variables to efficiently identify key factors strongly predictive of HIV risk. These factors could help target HIV prevention interventions in Botswana.
NCT01965470.