Dolezalova Nikola, Reed Angus B, Despotovic Aleksa, Obika Bernard Dillon, Morelli Davide, Aral Mert, Plans David
Department of Research and Development, Huma Therapeutics Limited, Millbank Tower, 21-24 Millbank, London SW1P 4QP, UK.
Department for Social Studies and Public Health, Faculty of Medicine, University of Belgrade, Belgrade, Serbia.
Eur Heart J Digit Health. 2021 Jun 26;2(3):528-538. doi: 10.1093/ehjdh/ztab057. eCollection 2021 Sep.
Cardiovascular diseases (CVDs) are among the leading causes of death worldwide. Predictive scores providing personalized risk of developing CVD are increasingly used in clinical practice. Most scores, however, utilize a homogeneous set of features and require the presence of a physician. The aim of this study was to develop a new risk model (DiCAVA), using statistical and machine learning techniques, that could be applied in a remote setting. A secondary goal was to identify new patient-centric variables that could be incorporated into CVD risk assessments.
Across 466 052 participants, Cox proportional hazards (CPH) and DeepSurv models were trained using 608 variables derived from the UK Biobank to investigate the 10-year risk of developing a CVD. Data-driven feature selection reduced the number of features to 47, after which reduced models were trained. Both models were compared to the Framingham score. The reduced CPH model achieved a c-index of 0.7443, whereas DeepSurv achieved a c-index of 0.7446. Both CPH and DeepSurv were superior to the Framingham score in determining CVD risk. Minimal difference was observed when cholesterol and blood pressure were excluded from the models (CPH: 0.741, DeepSurv: 0.739). The models showed very good calibration and discrimination on the test data.
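The c-index reported above measures discrimination: among pairs of participants whose outcomes can be ordered, it is the fraction in which the participant with the earlier event was assigned the higher predicted risk. The following is a minimal sketch of Harrell's concordance index in plain Python, not the authors' implementation. The function name, the toy data, and the choice to skip pairs with tied follow-up times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data (illustrative sketch).

    times       -- follow-up time for each subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- predicted risk; higher means an earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # 'a' is the subject with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a frame of reference for the values of about 0.74 reported for both reduced models.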
We developed a cardiovascular risk model that has very good predictive capacity and encompasses new variables. The score could be incorporated into clinical practice and utilized in a remote setting, without the need to include cholesterol. Future studies will focus on external validation across heterogeneous samples.