Department of Cardiology, Gyeongsang National University School of Medicine and Gyeongsang National University Changwon Hospital, Changwon, Korea.
Cardiovascular Center, Internal Medicine, Seoul National University Bundang Hospital, 82, Gumi-Ro 173 Beon-Gil, Bundang-Gu, Seongnam-si, 13620, Gyeonggi-Do, Korea.
Sci Rep. 2021 Apr 26;11(1):8886. doi: 10.1038/s41598-021-88257-w.
Predicting the risk of cardiovascular disease is key to primary prevention. Machine learning has attracted attention as a way to analyze increasingly large and complex healthcare data. We assessed the discrimination and calibration of pre-existing cardiovascular risk prediction models and developed machine learning-based prediction algorithms. This study included 222,998 Korean adults aged 40-79 years who were naïve to lipid-lowering therapy and had no history of cardiovascular disease. Pre-existing models showed moderate to good discrimination in predicting future cardiovascular events (C-statistics 0.70-0.80). The pooled cohort equation (PCE) specifically showed a C-statistic of 0.738. Compared with other machine learning models such as logistic regression, treebag, random forest, and AdaBoost, the neural network model showed the highest C-statistic (0.751), which was significantly higher than that of the PCE. It also showed better agreement between predicted risk and observed outcomes (Hosmer-Lemeshow χ² = 86.1, P < 0.001) than the PCE for whites did (Hosmer-Lemeshow χ² = 171.1, P < 0.001). Similar improvements were observed in comparison with the Framingham risk score, the Systematic Coronary Risk Evaluation, and QRISK3. This study demonstrated that machine learning-based algorithms could improve cardiovascular risk prediction over contemporary cardiovascular risk models in statin-naïve, healthy Korean adults without cardiovascular disease. The model can be easily adopted for risk assessment and clinical decision making.
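The abstract reports two kinds of metric: discrimination (the C-statistic) and calibration (the Hosmer-Lemeshow χ² statistic). The following is a minimal sketch of how these quantities are commonly computed, not the study's own code. The function names, the toy data, and the default choice of ten equal-sized risk groups are assumptions made for illustration; the abstract does not state how the groups were formed.

```python
from typing import Sequence


def c_statistic(risks: Sequence[float], events: Sequence[int]) -> float:
    """Fraction of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions count as half a pair."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def hosmer_lemeshow_chi2(
    risks: Sequence[float], events: Sequence[int], groups: int = 10
) -> float:
    """Hosmer-Lemeshow statistic: sort subjects by predicted risk, split
    them into equal-sized groups (an assumed grouping), and sum
    (observed - expected)^2 / (n * p * (1 - p)) over the groups."""
    pairs = sorted(zip(risks, events))
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(r for r, _ in chunk)   # expected number of events
        observed = sum(e for _, e in chunk)   # observed number of events
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
    return chi2


# Hypothetical toy data: predicted risks and observed outcomes (1 = event).
toy_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
toy_events = [0, 0, 1, 0, 1, 1]
print("C-statistic:", c_statistic(toy_risks, toy_events))
print("Hosmer-Lemeshow chi2:", hosmer_lemeshow_chi2(toy_risks, toy_events, groups=2))
```

A higher C-statistic indicates better separation of people who go on to have an event from those who do not, while a smaller Hosmer-Lemeshow χ² indicates closer agreement between predicted and observed event counts. This is why the lower value reported for the neural network (86.1 versus 171.1) is read as improved calibration, even though both P values remain below 0.001.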