The Application of Machine Learning Algorithms to Predict HIV Testing in Repeated Adult Population-Based Surveys in South Africa: Protocol for a Multiwave Cross-Sectional Analysis.

Authors

Jaiteh Musa, Phalane Edith, Shiferaw Yegnanew A, Phaswana-Mafuya Refilwe Nancy

Affiliations

South African Medical Research Council/University of Johannesburg Pan African Centre for Epidemics Research Extramural Unit, Faculty of Health Sciences, University of Johannesburg, Johannesburg, South Africa.

Department of Statistics, Faculty of Science, University of Johannesburg, Johannesburg, South Africa.

Publication information

JMIR Res Protoc. 2025 Jan 27;14:e59916. doi: 10.2196/59916.

DOI:10.2196/59916
PMID:39870368
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11811654/
Abstract

BACKGROUND

HIV testing is the cornerstone of HIV prevention and a pivotal step in realizing the Joint United Nations Program on HIV/AIDS (UNAIDS) goal of ending AIDS by 2030. Despite the availability of relevant survey data, there exists a research gap in using machine learning (ML) to analyze and predict HIV testing among adults in South Africa. Further investigation is needed to bridge this knowledge gap and inform evidence-based interventions to improve HIV testing.

OBJECTIVE

This study aims to determine consistent predictors of HIV testing by applying supervised ML algorithms in repeated adult population-based surveys in South Africa.

METHODS

A retrospective analysis of multiwave cross-sectional survey data will be conducted to determine the predictors of HIV testing among South African adults aged 18 years and older. Supervised ML techniques will be applied across the five cycles of the South African National HIV Prevalence, Incidence, Behavior, and Communication Survey (SABSSM). The Human Sciences Research Council (HSRC) conducted the SABSSM surveys in 2002, 2005, 2008, 2012, and 2017. The available SABSSM datasets will be imported into RStudio (version 4.3.2; Posit Software, PBC) for cleaning and outlier removal. A chi-square test will be used to select important predictors of HIV testing. Each dataset will be split into 80% training and 20% test samples. Logistic regression, support vector machines, random forests, and decision trees will be fitted. Cross-validation will divide the training sample into k folds, with one fold held out as the validation set in each round, and models will be trained on the remaining folds. Model performance will be evaluated on the validation set using accuracy, precision, recall, F-score, area under the receiver operating characteristic curve, and the confusion matrix.
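The mechanics of the planned workflow (chi-square screening, an 80/20 split, k-fold cross-validation, and metrics derived from a confusion matrix) can be illustrated with a minimal sketch. This is not the authors' code: the protocol specifies R, whereas the sketch uses standard-library Python, and every function name, the value k=5, the random seed, and the binary 0/1 outcome coding are assumptions made for illustration only.

```python
import random


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table of counts.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def train_test_split(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each fold validates exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for 0/1 labels (1 = tested)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In a full analysis, each candidate predictor would be cross-tabulated against the HIV testing outcome and screened with the chi-square statistic, and each of the four classifiers would then be fitted inside the `k_fold` loop and scored with `classification_metrics` on the held-out fold. The area under the ROC curve needs predicted probabilities rather than hard labels, so it is left out of this sketch.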

RESULTS

The SABSSM datasets are open access datasets available on the HSRC database. Ethics approval for this study was obtained from the University of Johannesburg Research and Ethics Committee on April 23, 2024 (REC-2725-2024). The authors were given access to all five SABSSM datasets by the HSRC on August 20, 2024. The datasets were explored to identify the independent variables likely influencing HIV testing uptake. The findings of this study will determine consistent variables predicting HIV testing uptake among the South African adult population over the course of 20 years. Furthermore, this study will evaluate and compare the performance metrics of the 4 different ML algorithms, and the best model will be used to develop an HIV testing predictive model.

CONCLUSIONS

This study will contribute to existing knowledge and deepen understanding of the factors linked to HIV testing beyond what traditional methods provide. Consequently, the findings will inform evidence-based policy recommendations that can guide policy makers in formulating more effective and targeted public health approaches to strengthening HIV testing.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/59916.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c47c/11811654/4c808b95bc23/resprot_v14i1e59916_fig1.jpg
