Chen Shang-Fu, Lee Sang Eun, Sadaei Hossein Javedani, Park Jun-Bean, Khattab Ahmed, Chen Jei-Fu, Henegar Corneliu, Wineinger Nathan E, Muse Evan D, Torkamani Ali
Scripps Research Translational Institute, La Jolla, CA, USA.
Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA.
Nat Med. 2025 Apr 16. doi: 10.1038/s41591-025-03648-0.
Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide, and accurately predicting individual risk is critical for prevention. Here we aimed to integrate unmodifiable risk factors, such as age and genetics, with modifiable risk factors, such as clinical and biometric measurements, into a meta-prediction framework that produces actionable and personalized risk estimates. In the initial development of the model, ~2,000 predictive features were considered, including demographic data, lifestyle factors, physical measurements, laboratory tests, medication usage, diagnoses and genetics. To power our meta-prediction approach, we stratified the UK Biobank into two primary cohorts: first, a prevalent CAD cohort used to train predictive models for cross-sectional prediction at baseline and prospective estimation of contributing risk factor levels and diagnoses (baseline models) and, second, an incident CAD cohort using, in part, these baseline models as meta-features to train a final CAD incident risk prediction model. The resultant 10-year incident CAD risk model, composed of 15 derived meta-features with multiple embedded polygenic risk scores, achieves an area under the curve of 0.84. In an independent test cohort from the All of Us research program, this model achieved an area under the curve of 0.81 for predicting 10-year incident CAD risk, outperforming standard clinical scores and previously developed integrative models. Moreover, this framework enables the generation of individualized risk reduction profiles by quantifying the potential impact of standard clinical interventions. Notably, genetic risk influences the extent to which these interventions reduce overall CAD risk, allowing for tailored prevention strategies.
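The two-stage design described in the abstract, where baseline models trained on one cohort supply meta-features to a final incident-risk model trained on a second cohort, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the feature names (age, prs, bmi), the `simulate` helper, and the single hypothetical risk-factor label are invented, and plain logistic regression stands in for whatever learners the study actually used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent logistic regression; bias is the last weight."""
    k = len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict_proba(w, X):
    return [sigmoid(sum(a * b for a, b in zip(w, xi)) + w[-1]) for xi in X]


def auc(y, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n, rng):
    """Hypothetical synthetic cohort: (raw features, risk-factor label, outcome)."""
    rows = []
    for _ in range(n):
        age, prs, bmi = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        risk_factor = int(rng.random() < sigmoid(age + 0.8 * prs + 0.5 * bmi - 0.5))
        outcome = int(rng.random() < sigmoid(0.8 * age + 0.6 * prs + risk_factor - 2.0))
        rows.append(([age, prs, bmi], risk_factor, outcome))
    return rows


rng = random.Random(0)
baseline_cohort = simulate(600, rng)   # stands in for the prevalent cohort
incident_cohort = simulate(600, rng)   # stands in for the incident cohort
test_cohort = simulate(400, rng)       # stands in for an independent test set

# Stage 1: a baseline model predicts a contributing risk factor from raw features.
baseline_w = fit_logistic([r[0] for r in baseline_cohort],
                          [r[1] for r in baseline_cohort])


def meta_features(rows):
    """Baseline-model output becomes a derived meta-feature, alongside age."""
    raw = [r[0] for r in rows]
    risk_hat = predict_proba(baseline_w, raw)
    return [[rh, x[0]] for rh, x in zip(risk_hat, raw)]


# Stage 2: the final model is trained on meta-features in a separate cohort.
final_w = fit_logistic(meta_features(incident_cohort),
                       [r[2] for r in incident_cohort])

test_auc = auc([r[2] for r in test_cohort],
               predict_proba(final_w, meta_features(test_cohort)))
print(f"held-out AUC on synthetic data: {test_auc:.2f}")
```

Training the baseline model and the final model on disjoint cohorts, as the abstract describes, keeps the meta-features from being fitted on the same individuals whose outcomes the final model learns from. The AUC printed here reflects only the synthetic generator and has no bearing on the reported 0.84 and 0.81.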