The Application of Machine Learning Algorithms to Predict HIV Testing Using Evidence from the 2002-2017 South African Adult Population-Based Surveys: An HIV Testing Predictive Model.

Author information

Jaiteh Musa, Phalane Edith, Shiferaw Yegnanew A, Jallow Haruna, Phaswana-Mafuya Refilwe Nancy

Affiliations

South African Medical Research Council/University of Johannesburg Pan African Centre for Epidemics Research Extramural Unit, Faculty of Health Sciences, University of Johannesburg, Johannesburg 2006, South Africa.

Department of Statistics, Faculty of Science, University of Johannesburg, Johannesburg 2006, South Africa.

Publication information

Trop Med Infect Dis. 2025 Jun 14;10(6):167. doi: 10.3390/tropicalmed10060167.


DOI:10.3390/tropicalmed10060167
PMID:40559734
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12197452/
Abstract

A significant portion of the South African population has an unknown HIV status, which slows epidemic control despite the progress made in HIV testing. Machine learning (ML) has been effective in identifying individuals at higher risk of HIV infection, for whom testing is strongly recommended. However, there are insufficient predictive models to inform targeted HIV testing interventions in South Africa. Using supervised ML (SML) algorithms, this study aimed to identify the most consistent predictors of HIV testing in repeated adult population-based surveys in South Africa. The study employed four SML algorithms, namely decision trees, random forest, support vector machines (SVM), and logistic regression, across the five cross-sectional cycles of the South African National HIV Prevalence, Incidence, Behavior and Communication Survey (SABSSM) datasets. The Human Sciences Research Council (HSRC) conducted the SABSSM surveys and made the datasets available for this study. Each dataset was split into 80% training and 20% testing sets, and 5-fold cross-validation was applied. Random forest outperformed the other models across all five datasets, with the highest accuracy (80.98%), precision (81.51%), F-score (80.30%), area under the curve (AUC) (88.31%), and cross-validation average (79.10%) in the 2002 data. Random forest achieved the highest classification performance across all survey years, especially in the 2017 survey. SVM had a high recall (89.12% in 2005, 86.28% in 2008) but lower precision, leading to a suboptimal F-score in the initial analysis. We applied a soft margin to the SVM to improve its classification robustness and generalization, but the accuracy and precision were still low in most surveys, increasing the chances of misclassifying individuals who tested for HIV. Logistic regression performed well in terms of accuracy (72.75%), precision (73.64%), and AUC (81.41%) in 2002, and F-score (73.83%) in 2017, but its performance was somewhat lower than that of the random forest. Decision trees demonstrated moderate accuracy (73.80% in 2002) but were prone to overfitting. The most consistent predictors of HIV testing were knowledge of HIV testing sites, being female, being a younger adult, having high socioeconomic status, and being well-informed about HIV through digital platforms. Random forest's ability to analyze complex datasets makes it a valuable tool for informing data-driven policy initiatives, such as raising awareness, engaging the media, improving employment outcomes, enhancing accessibility, and targeting high-risk individuals. By addressing the identified gaps in the existing healthcare framework, South Africa can enhance the efficacy of HIV testing and progress towards achieving the UNAIDS 2030 goal of eradicating AIDS.
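The evaluation setup described in the abstract (an 80/20 split, 5-fold cross-validation, and accuracy, precision, recall, and F-score) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. The function names `holdout_split`, `k_folds`, and `scores` are hypothetical, the confusion-matrix counts are invented for illustration, and running the folds on the training portion only is an assumption, since the abstract does not say how the split and the cross-validation were combined.

```python
import random
from typing import Dict, List, Tuple


def holdout_split(n: int, test_frac: float = 0.2, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle row indices 0..n-1 and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    return idx[n_test:], idx[:n_test]


def k_folds(train_idx: List[int], k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Partition the training indices into k (fit, validate) pairs."""
    folds = [train_idx[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        # Fit on every fold except the i-th, validate on the i-th.
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        pairs.append((fit, folds[i]))
    return pairs


def scores(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F-score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
    }


# Assumed sample size of 1000 rows, purely for illustration.
train, test = holdout_split(1000)
pairs = k_folds(train)
print(len(train), len(test), [len(validate) for _, validate in pairs])

# Hypothetical counts (not from the paper): a classifier that labels most
# respondents as "tested" reaches high recall but loses precision, which
# pulls the F-score down.
print(scores(tp=89, fp=60, fn=11, tn=40))
```

Because the F-score is the harmonic mean of precision and recall, a model with high recall and low precision, as reported for the SVM, can still end up with a middling F-score.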

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2db/12197452/4f20192f2909/tropicalmed-10-00167-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2db/12197452/6e561d7626e0/tropicalmed-10-00167-g002a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2db/12197452/7ffe50850980/tropicalmed-10-00167-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2db/12197452/0deead85a602/tropicalmed-10-00167-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2db/12197452/74b040f961f1/tropicalmed-10-00167-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2db/12197452/f9d4b092234b/tropicalmed-10-00167-g006.jpg
