Web-Based Risk Prediction Tool for an Individual's Risk of HIV and Sexually Transmitted Infections Using Machine Learning Algorithms: Development and External Validation Study.
Author affiliations
Melbourne Sexual Health Centre, Alfred Health, Melbourne, Australia.
Central Clinical School, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Australia.
Publication information
J Med Internet Res. 2022 Aug 25;24(8):e37850. doi: 10.2196/37850.
BACKGROUND
HIV and sexually transmitted infections (STIs) are major global public health concerns. Over 1 million curable STIs occur every day among people aged 15 years to 49 years worldwide. Insufficient testing or screening substantially impedes the elimination of HIV and STI transmission.
OBJECTIVE
The aim of our study was to develop an HIV and STI risk prediction tool using machine learning algorithms.
METHODS
We used data from clinic consultations at the Melbourne Sexual Health Centre in which patients were tested for HIV and STIs between March 2, 2015, and December 31, 2018, as the development data set (training and testing data sets). We also used 2 external validation data sets: data from 2019 as external "validation data 1" and data from January 2020 and January 2021 as external "validation data 2." We developed 34 machine learning models to assess the risk of acquiring HIV, syphilis, gonorrhea, and chlamydia. We created an online tool to estimate an individual's risk of HIV or an STI.
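The date-based separation of development and external validation data described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `assign_dataset`, the `PERIODS` mapping, and the period labels are hypothetical. Only the development window and the 2019 validation year are encoded, because the abstract does not pin down the exact bounds of "validation data 2."

```python
from datetime import date


def assign_dataset(consultation_date, periods):
    """Return the name of the first period containing the date, else None.

    `periods` maps a label to an inclusive (start, end) pair of dates.
    """
    for name, (start, end) in periods.items():
        if start <= consultation_date <= end:
            return name
    return None


# Hypothetical labels; the date ranges follow the abstract.
PERIODS = {
    "development": (date(2015, 3, 2), date(2018, 12, 31)),
    "validation_1": (date(2019, 1, 1), date(2019, 12, 31)),
}

example = assign_dataset(date(2017, 6, 15), PERIODS)
```

Splitting by calendar period rather than at random means the validation consultations all occur after the development window, which is what makes the later data sets a test of how the models hold up over time.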
RESULTS
The important predictors of HIV and STI risk were gender, age, men who reported having sex with men, number of casual sexual partners, and condom use. Our machine learning-based risk prediction tool, named MySTIRisk, performed at an acceptable or excellent level on the testing data sets (area under the curve [AUC] for HIV=0.78; AUC for syphilis=0.84; AUC for gonorrhea=0.78; AUC for chlamydia=0.70) and had stable performance on both external validation data sets: the data from 2019 (AUC for HIV=0.79; AUC for syphilis=0.85; AUC for gonorrhea=0.81; AUC for chlamydia=0.69) and the data from 2020-2021 (AUC for HIV=0.71; AUC for syphilis=0.84; AUC for gonorrhea=0.79; AUC for chlamydia=0.69).
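For readers unfamiliar with the AUC values reported above, the metric can be read as the probability that a randomly chosen person with the infection receives a higher predicted risk than a randomly chosen person without it. The sketch below computes it by direct pairwise comparison. It is an illustration only: the function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = tested positive, 0 = tested negative.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of cases, so for data sets of clinical size a rank-based or library implementation would be the practical choice.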
CONCLUSIONS
Our web-based risk prediction tool could accurately predict the risk of HIV and STIs for clinic attendees using simple self-reported questions. MySTIRisk could serve as an HIV and STI screening tool on clinic websites or digital health platforms to encourage individuals at risk of HIV or an STI to be tested or start HIV pre-exposure prophylaxis. The public can use this tool to assess their risk and then decide whether to attend a clinic for testing. Clinicians or public health workers can use this tool to identify high-risk individuals for further interventions.