Department of Public Health, School of Public Health, College of Medicine and Health Science, Mizan-Tepi University, Mizan-Aman, Ethiopia.
Department of Public Health, College of Medicine and Health Science, Debre-Markos University, Gojjam, Ethiopia.
Front Public Health. 2024 Mar 15;12:1341279. doi: 10.3389/fpubh.2024.1341279. eCollection 2024.
Despite efforts to achieve the Joint United Nations Programme on HIV/AIDS (UNAIDS) 95-95-95 fast-track targets, established in 2014 for HIV prevention, progress has fallen short. It is therefore important to identify factors that can serve as predictors of an adolescent's HIV status. Identifying them would enable targeted screening interventions and better healthcare services. Our primary objective was to identify these predictors in order to improve HIV testing services for adolescents in Ethiopia.
Eight machine learning techniques were used to develop models from demographic and health data on 4,502 adolescent respondents. The dataset comprised 31 variables, and variables were selected using several selection methods. The data were randomly split, with 80% used for training and validation and 20% for testing. The algorithms were evaluated, and the one with the highest accuracy and mean F1 score was selected for further training using the most predictive variables.
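The split and the two selection metrics described above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the function names, the fixed seed, and the reading of "mean F1 score" as the unweighted (macro) average of per-class F1 are all assumptions, since the abstract does not specify them.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them for testing.

    The seed is an arbitrary choice made here for reproducibility.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed meaning of 'mean F1')."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder record ids standing in for the 4,502 respondents.
    train, test = train_test_split(range(4502))
    print(len(train), len(test))
```

Under this reading, each candidate algorithm would be scored with `accuracy` and `mean_f1` on held-out data, and the best-scoring one would be retained for further training.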
The J48 decision tree algorithm outperformed the seven other algorithms in detecting HIV positivity, with an accuracy of 81.29% and an area under the Receiver Operating Characteristic (ROC) curve of 86.3%. The top five predictor features it identified were age, knowledge of HIV testing locations, age at first sexual encounter, recent sexual activity, and exposure to family planning. The model's performance improved significantly when only twenty variables were used rather than all variables.
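J48 is Weka's implementation of the C4.5 decision tree, which chooses splits by gain ratio. The sketch below shows how categorical variables could be ranked by that criterion. It is only an illustration under stated assumptions: the abstract does not say which selection methods were used, the helper names are hypothetical, and C4.5's handling of continuous attributes, missing values, and pruning is omitted.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return (name, gain ratio) pairs sorted from most to least informative.

    columns maps a hypothetical variable name to its list of values.
    """
    scored = [(name, gain_ratio(vals, labels)) for name, vals in columns.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Keeping only the top-ranked variables from such a list is one way a reduced set, such as the twenty variables mentioned above, might be chosen.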
Our findings indicate that the J48 decision tree algorithm, combined with demographic and health-related data, is an effective tool for identifying potential predictors of HIV testing. This approach can predict which adolescents are at high risk of infection, enabling targeted screening strategies for early detection and intervention. To improve the testing status of adolescents in the country, we recommend considering factors such as age, age at first sexual encounter, exposure to family planning, recent sexual activity, and the other identified predictors.