Comparison of machine learning models for predicting stroke risk in hypertensive patients: Lasso regression model, random forest model, Boruta algorithm model, and Boruta algorithm combined with Lasso regression model.

Author information

Huang Junzhang, Liu Wencai

Affiliations

Department of General Surgery, Lianjiang Traditional Chinese Medicine Hospital, Zhanjiang, Guangdong, China.

Department of Orthopedics, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Publication information

Medicine (Baltimore). 2025 May 30;104(22):e42690. doi: 10.1097/MD.0000000000042690.

DOI:10.1097/MD.0000000000042690

PMID:40441184

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12129492/

Abstract

This study compared the performance of 4 machine learning models (the Lasso regression model, the random forest model, the Boruta algorithm model, and the Boruta algorithm combined with Lasso regression) in predicting stroke risk among hypertensive patients. It evaluated the strengths and weaknesses of each model in order to provide a more clinically valuable prediction model for stroke risk. The study included 3472 hypertensive patients, of whom 312 had experienced a stroke and 3160 had not. Various health indicators were analyzed using Lasso regression, random forest, the Boruta algorithm, and the Boruta algorithm combined with Lasso regression. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), the precision-recall curve, the calibration curve, and decision curve analysis, to assess classification ability, precision, calibration, and clinical benefit. The Lasso regression and Boruta algorithm models both had an AUC of 0.716, making them the best-performing models in terms of classification ability. The Boruta algorithm combined with Lasso regression had an AUC of 0.705, slightly lower than the previous 2 models; it still showed good predictive capability, with better interpretability due to feature selection. The random forest model had an AUC of 0.626, the lowest among the models, indicating weaker classification performance than the others. Among the 4 models, the Lasso regression and Boruta algorithm models performed similarly in classification ability, both demonstrating moderate predictive power, while the random forest model performed relatively poorly. The Boruta algorithm combined with Lasso regression was precise in variable selection but had limited clinical utility. Therefore, the Lasso regression model appears to be the most balanced in predicting stroke risk and is the recommended model based on this study.
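The two headline metrics in the abstract, ROC AUC and the net benefit used in decision curve analysis, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `roc_auc` and `net_benefit`, and the toy labels and scores, are assumptions made for illustration only and are not taken from the study data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the chance that a random case outranks a random non-case.

    A tie between a case score and a non-case score counts as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one risk threshold, as plotted on a decision curve.

    Patients with score >= threshold are treated as high risk; false
    positives are weighted by the odds of the threshold.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Hypothetical toy data (1 = stroke, 0 = no stroke), not from the study.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

print(round(roc_auc(toy_labels, toy_scores), 3))
print(net_benefit(toy_labels, toy_scores, threshold=0.5))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why values such as 0.716 and 0.626 are described as moderate and weak discrimination. With 312 strokes among 3472 patients (about 9.0%), the classes are imbalanced, which is one reason to report the precision-recall curve alongside the ROC curve.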

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/627e/12129492/4bc684b9b649/medi-104-e42690-g001.jpg
