Novel Insights on Establishing Machine Learning-Based Stroke Prediction Models Among Hypertensive Adults.

Authors

Huang Xiao, Cao Tianyu, Chen Liangziqian, Li Junpei, Tan Ziheng, Xu Benjamin, Xu Richard, Song Yun, Zhou Ziyi, Wang Zhuo, Wei Yaping, Zhang Yan, Li Jianping, Huo Yong, Qin Xianhui, Wu Yanqing, Wang Xiaobin, Wang Hong, Cheng Xiaoshu, Xu Xiping, Liu Lishun

Affiliations

Department of Cardiology, The Second Affiliated Hospital of Nanchang University, Nanchang, China.

Biological Anthropology, University of California, Santa Barbara, Santa Barbara, CA, United States.

Publication information

Front Cardiovasc Med. 2022 May 6;9:901240. doi: 10.3389/fcvm.2022.901240. eCollection 2022.

DOI:10.3389/fcvm.2022.901240

PMID:35600480

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9120532/

Abstract

BACKGROUND

Stroke is a major global health burden, and risk prediction is essential for the primary prevention of stroke. However, uncertainty remains about the optimal prediction model for analyzing stroke risk. In this study, we aim to determine the most effective stroke prediction method in a Chinese hypertensive population using machine learning and establish a general methodological pipeline for future analysis.

METHODS

The training set included 70% of the data (n = 14,491) from the China Stroke Primary Prevention Trial (CSPPT). Internal validation used the remaining 30% of the CSPPT data (n = 6,211), and external validation was conducted using a nested case-control (NCC) dataset (n = 2,568). The primary outcome was first stroke. Four established analysis methods were applied and compared: logistic regression (LR), stepwise logistic regression (SLR), extreme gradient boosting (XGBoost), and random forest (RF). Population characteristic data were analyzed separately with and without laboratory variables. Accuracy, sensitivity, specificity, kappa, and area under the receiver operating characteristic curve (AUC) were used to assess the models, with AUC as the primary criterion. Data balancing techniques, including random under-sampling (RUS) and the synthetic minority over-sampling technique (SMOTE), were applied to the unbalanced training set.
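The splitting and balancing steps described above can be illustrated with a minimal sketch. It is not the study's code: it assumes a plain random 70/30 split (the abstract does not say whether the split was stratified), it implements only RUS, not SMOTE, and the toy data and the names `split_70_30` and `random_under_sample` are hypothetical.

```python
import random

def split_70_30(rows, seed=0):
    """Shuffle rows and split them into 70% training and 30% validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def random_under_sample(rows, label_of, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    positives = [r for r in rows if label_of(r) == 1]
    negatives = [r for r in rows if label_of(r) == 0]
    minority, majority = sorted([positives, negatives], key=len)
    kept = rng.sample(majority, len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy data: 1000 participants, 30 of whom had a stroke (label 1).
rows = [{"id": i, "stroke": 1 if i < 30 else 0} for i in range(1000)]

train, valid = split_70_30(rows, seed=42)

# Balancing is applied to the training set only; the validation set keeps
# its original class distribution so that performance estimates stay honest.
balanced_train = random_under_sample(train, lambda r: r["stroke"], seed=42)
```

In this sketch every stroke case in the training set is kept and an equal number of non-stroke rows is drawn at random, so the balanced set is much smaller than the original training set. That loss of majority-class data is the usual trade-off of RUS.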

RESULTS

The best model performance was observed in the RUS-applied RF model with laboratory variables. Compared with null models (sensitivity = 0, specificity = 100, and mean AUC = 0.643), data balancing techniques improved overall performance, with RUS demonstrating a more satisfactory effect in the current study (RUS: sensitivity = 63.9; specificity = 53.7; and mean AUC = 0.624). Adding laboratory variables improved the performance of the analysis methods. All results were reconfirmed in the validation sets. The top 10 important variables were determined by the best-performing analysis method.
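A sensitivity of 0 with a specificity of 100 alongside an AUC above 0.5 can look contradictory. One plausible reading, offered here as an assumption and not as the authors' explanation, is that on unbalanced data every predicted risk stays below the classification threshold, so everyone is labeled negative, while the risk scores still rank some cases above non-cases. The sketch below uses hypothetical toy scores and a 0.5 threshold to show how the listed metrics behave in that situation.

```python
def summarize(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and Cohen's kappa from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "kappa": kappa}

def auc(y_true, scores):
    """Probability that a random case outscores a random non-case (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 5 strokes among 100 participants.
y_true = [1] * 5 + [0] * 95
# All predicted risks are below 0.5, but three of the five cases rank higher
# than every non-case.
scores = [0.30, 0.25, 0.20, 0.10, 0.05] + [0.15] * 95

y_pred = [1 if s >= 0.5 else 0 for s in scores]  # everyone is labeled negative
metrics = summarize(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Here accuracy is 0.95 even though no case is detected, which is why accuracy alone is a poor guide on unbalanced data and why a threshold-free measure such as AUC is a natural primary criterion.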

CONCLUSION

Among the tested methods, the most effective stroke prediction model in the target population is the RUS-applied RF. Drawing on the insights from the current study, we provide a general framework for building machine learning-based prediction models.
