Survival prediction models for people living with HIV based on four machine learning models.

Authors

Cai Qiong, Yang Lanting, Ling Yulong, Pan Wei, Zhong Qing, Wang Chunjie, Pan Xilong

Affiliations

Department of Social Medicine and Health Education, School of Public Health, Peking University, 38 College Road, Haidian District, Beijing, 100191, China.

Faculty of Arts, The University of Melbourne, Melbourne, VIC, 3052, Australia.

Publication information

Sci Rep. 2025 Aug 25;15(1):31256. doi: 10.1038/s41598-025-16479-3.

Abstract

Although antiretroviral therapy has prolonged the lifespan of people living with HIV, survival rates and risk factors still vary significantly among them. This study compares the performance of the Cox proportional hazard model with that of four machine learning models in predicting the survival of people living with HIV and analyzes the survival factors among them, with the aim of assisting medical decision-making. We collected data on 676 people living with HIV from the Chinese Center for Disease Control and Prevention. Significant variables (p < 0.05) were identified using univariate Cox analysis. Using a random number method, the data were split into a training set (473 cases) and a test set (203 cases) in a 7:3 ratio. We employed the Cox proportional hazard model and four classification machine learning models (eXtreme Gradient Boosting, Random Forest, Support Vector Machine, and Multilayer Perceptron) to develop survival prediction models for people living with HIV. The predictive performance of these models was evaluated based on accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC), and calibration curves, and the best model was selected based on these metrics. The average age at diagnosis among the sample participants was 56.63 years (SD = 17.53). Considering performance in both the training and testing cohorts, the Random Forest classifier emerged as the model with the best predictive performance, with an AUC of 0.912, an accuracy of 0.862, a precision of 0.794, a recall of 0.562, and an F1-score of 0.659. Random Forest was followed by the Support Vector Machine and eXtreme Gradient Boosting; the Multilayer Perceptron and the Cox proportional hazard model performed similarly. The predictive performance of the machine learning models surpassed that of the traditional Cox proportional hazard model. In China, the Random Forest model can be considered for analyzing and predicting the survival rates of people living with HIV.
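
The arithmetic behind two of the figures in the abstract can be checked with a short sketch. The code below is illustrative only and is not the authors' code: the helper names `split_indices` and `f1_score` and the seed value are assumptions, and the abstract does not say how the random split was implemented. It shows that a 7:3 split of 676 cases yields 473 and 203, and computes the F1-score as the harmonic mean of the reported precision and recall.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists.

    The seed is an arbitrary assumption made for reproducibility; the
    abstract only states that a random number method was used.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# 676 cases split 7:3, as described in the abstract.
train_idx, test_idx = split_indices(676)
print(len(train_idx), len(test_idx))

# Precision and recall reported for the Random Forest classifier.
f1 = f1_score(0.794, 0.562)
print(round(f1, 3))
```

With the rounded inputs the computed F1 is about 0.658, against the reported 0.659. A gap of that size is what one would expect if the published value was derived from unrounded precision and recall, though the abstract does not state this.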
