Department of Engineering, Universidad del Pacífico, Lima 15072, Peru.
Faculty of Health Sciences, School of Medicine, Universidad Continental, Lima 15046, Peru.
Int J Environ Res Public Health. 2023 Mar 30;20(7):5318. doi: 10.3390/ijerph20075318.
Inadequate knowledge is one of the principal obstacles to preventing the spread of HIV/AIDS. Worldwide, adolescents and young people are reported to be more vulnerable to infection, so understanding young people's knowledge of HIV/AIDS is crucial. This study aimed to identify the determinants of HIV/AIDS knowledge among this population in Peru and to develop a predictive model to estimate it. Data from the 2019 Demographic and Health Survey (DHS) were used. RStudio was used for quasi-binomial logistic regression and RapidMiner for computational model building. Five classification algorithms were considered for model development, and their performance was assessed using accuracy, sensitivity, specificity, false positive rate (FPR), false negative rate (FNR), Cohen's kappa, F1 score and area under the ROC curve (AUC). The results revealed an association between 14 socio-demographic, economic and health factors and HIV/AIDS knowledge. Accuracy ranged from 59.47% to 64.30%, with the random forest model performing best (64.30%). In that model, the variables with the greatest influence on predicting HIV/AIDS knowledge were the respondent's gender, area of residence, wealth index, region of residence, age, highest educational level, ethnic self-perception, having previously heard of HIV/AIDS, having taken an HIV/AIDS screening test and mass media access. These results suggest that the associations found, and the random forest model as a predictor of HIV/AIDS knowledge, may help policy makers guide and reinforce the planning and implementation of healthcare strategies.
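As an illustration of how most of the evaluation metrics named in the abstract relate to one another, the following minimal Python sketch derives them from the four cells of a binary confusion matrix. It is not the authors' RapidMiner workflow: the function name `binary_metrics`, the `BinaryMetrics` container and the counts in the usage note are hypothetical, and the sketch assumes both classes are present so that no denominator is zero. AUC is left out because it requires predicted scores rather than a single confusion matrix.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Hypothetical container for metrics derived from one confusion matrix."""
    accuracy: float
    sensitivity: float
    specificity: float
    fpr: float
    fnr: float
    kappa: float
    f1: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Compute standard binary-classification metrics from cell counts.

    Assumes at least one actual positive, one actual negative and one
    predicted positive, so every denominator below is non-zero.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the row and column marginals.
    p_chance_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_chance_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_chance_pos + p_chance_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return BinaryMetrics(
        accuracy=accuracy,
        sensitivity=sensitivity,
        specificity=specificity,
        fpr=1 - specificity,       # false positive rate
        fnr=1 - sensitivity,       # false negative rate
        kappa=kappa,
        f1=f1,
    )
```

For example, with made-up counts `binary_metrics(tp=40, fp=10, fn=20, tn=30)`, accuracy is 0.70 while Cohen's kappa is 0.40, which shows why a chance-corrected measure is reported alongside raw accuracy.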