Predicting Risky Sexual Behavior Among College Students Through Machine Learning Approaches: Cross-sectional Analysis of Individual Data From 1264 Universities in 31 Provinces in China.

Affiliation

Vanke School of Public Health, Tsinghua University, Beijing, China.

Publication information

JMIR Public Health Surveill. 2023 Jan 25;9:e41162. doi: 10.2196/41162.

Abstract

BACKGROUND

Risky sexual behavior (RSB), the most direct risk factor for sexually transmitted infections (STIs), is common among college students. Identifying the relevant risk factors and predicting RSB are therefore important for intervention and prevention in this population.

OBJECTIVE

We aimed to establish a predictive model for RSB among college students to facilitate timely intervention and prevention of RSB and thereby help limit the spread of STIs.

METHODS

We included a total of 8794 heterosexual Chinese students who self-reported engaging in sexual intercourse; data were collected from November 2019 to February 2020. We identified RSB among those students and classified it along 4 dimensions: whether contraception was used, whether the contraceptive method was safe, whether students engaged in casual sex or sex with multiple partners, and integrated RSB (which combined the first 3 dimensions). Overall, 126 predictors were included in this study, covering demographic characteristics, daily habits, physical and mental health, relationship status, sexual knowledge, sexual education, sexual attitude, and previous sexual experience. For each type of RSB, we compared 8 machine learning (ML) models: multiple logistic regression (MLR), naive Bayes (BYS), linear discriminant analysis (LDA), random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), deep learning (DL), and an ensemble model. The optimal model for both RSB prediction and risk factor identification was selected based on a set of validation indicators. An MLR model was then applied to investigate the association between RSB and the risk factors identified through the ML methods.
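The fourth outcome, integrated RSB, can be illustrated with a minimal sketch. The abstract says only that it "combined the first 3 dimensions"; the logical-OR rule below is an assumption, as are the `Responses` class and its field names, which are hypothetical and not taken from the study's questionnaire.

```python
from dataclasses import dataclass


@dataclass
class Responses:
    """Hypothetical per-student flags; field names are illustrative only."""
    inconsistent_contraception: bool  # did not use contraception every time
    unsafe_method: bool               # used an ineffective or unsafe method
    casual_or_multiple: bool          # casual sex or sex with multiple partners


def integrated_rsb(r: Responses) -> bool:
    # Assumption: a student counts as "integrated RSB" if ANY dimension is present.
    return r.inconsistent_contraception or r.unsafe_method or r.casual_or_multiple


# Three made-up students, not study data.
students = [
    Responses(False, False, False),
    Responses(True, False, False),
    Responses(False, True, True),
]
labels = [integrated_rsb(s) for s in students]
```

Under this assumed rule, the integrated outcome is at least as prevalent as any single dimension, because every student flagged on one dimension is also flagged on the combined label.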

RESULTS

In total, 5328 (60.59%) students were found to have previously engaged in RSB. Across the full sample of 8794 students, 3682 (41.87%) did not use contraception every time they had sexual intercourse, 3602 (40.96%) had previously used an ineffective or unsafe contraceptive method, and 1157 (13.16%) had engaged in casual sex or sex with multiple partners. XGBoost achieved the best predictive performance on all 4 types of RSB, with the area under the receiver operating characteristic curve (AUROC) reaching 0.78, 0.72, 0.94, and 0.80 for contraceptive use, safe contraceptive method use, engagement in casual sex or sex with multiple partners, and integrated RSB, respectively. While keeping the validation indicators stable, the 12 most predictive variables were then selected using XGBoost; these covered the participants' relationship status, sexual knowledge, sexual attitude, and previous sexual experience. Through MLR, RSB was found to be significantly associated with less sexual knowledge, more liberal sexual attitudes, being single, and more sexual experience.
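For readers unfamiliar with AUROC, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. It is an illustration only, not the study's evaluation code; the labels, scores, and the names `model_a` and `model_b` are made up.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy labels and predicted risk scores (hypothetical, not study data).
y = [1, 0, 1, 0, 1]
model_scores = {
    "model_a": [0.9, 0.2, 0.6, 0.7, 0.8],
    "model_b": [0.5, 0.5, 0.5, 0.5, 0.5],  # uninformative: all ties
}
aurocs = {name: auroc(y, s) for name, s in model_scores.items()}
best = max(aurocs, key=aurocs.get)  # pick the model with the highest AUROC
```

An uninformative model scores 0.5 and a perfect ranking scores 1.0. The quadratic loop is fine for a toy example; a rank-based formula would be preferable on a sample of several thousand students.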

CONCLUSIONS

RSB is prevalent among college students. The XGBoost model is an effective approach to predicting RSB and identifying the corresponding risk factors. This study presents an opportunity to promote sexual and reproductive health through ML models, which can support targeted interventions for different subgroups and, through risk probability prediction, the precise surveillance and prevention of RSB among college students.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ee6/9909517/492fa346497f/publichealth_v9i1e41162_fig1.jpg
