Torkamani Ali, Chen Shang-Fu, Lee Sang Eun, Sadaei Hossein, Park Jun-Bean, Khattab Ahmed, Henegar Corneliu, Wineinger Nathan, Muse Evan
Scripps Research & Scripps Research Translational Institute.
Asan Medical Center, University of Ulsan College of Medicine.
Res Sq. 2023 Dec 20:rs.3.rs-3694374. doi: 10.21203/rs.3.rs-3694374/v1.
Coronary artery disease (CAD) remains the leading cause of mortality and morbidity worldwide. Recent advances in large-scale genome-wide association studies have highlighted the potential of genetic risk, captured as polygenic risk scores (PRS), in clinical prevention. However, the current clinical utility of PRS models is limited to identifying high-risk populations based on the top percentiles of genetic susceptibility. While some studies have attempted integrative prediction using genetic and non-genetic factors, many of these studies have been cross-sectional and focused solely on risk stratification. Our primary objective in this study was to integrate unmodifiable (age/genetics) and modifiable (clinical/biometric) risk factors into a prospective prediction framework that also produces actionable and personalized risk estimates for the purpose of CAD prevention in a heterogeneous adult population. Thus, we present an integrative, omnigenic, meta-prediction framework that effectively captures CAD risk subgroups, primarily distinguished by degree and nature of genetic risk, with distinct risk reduction profiles predicted from standard clinical interventions. Initial model development considered ~2,000 predictive features, including demographic data, lifestyle factors, physical measurements, laboratory tests, medication usage, diagnoses, and genetics. To power our meta-prediction approach, we stratified the UK Biobank into two primary cohorts: 1) a prevalent CAD cohort used to train baseline and prospective predictive models for contributing risk factors and diagnoses, and 2) an incident CAD cohort used to train the final incident CAD risk prediction model. The resultant 10-year incident CAD risk model is composed of 35 derived meta-features from models trained on the prevalent risk cohort, most of which are predicted baseline diagnoses with multiple embedded PRSs. This model achieved an AUC of 0.81 and macro-averaged F1-score of 0.65, outperforming standard clinical scores and prior integrative models. We further demonstrate that individualized risk reduction profiles can be derived from this model, with genetic risk mediating the degree of risk reduction achieved by standard clinical interventions.
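As a rough illustration of the two-cohort meta-prediction idea and of the two reported metrics, the following is a minimal sketch and not the authors' implementation. It assumes a plain logistic regression as a stand-in for whatever learners the study used, a single hypothetical baseline diagnosis instead of 35 meta-features, and fully synthetic data; every name in it (`simulate`, `run_sketch`, `dx_model`, `cad_model`) is made up for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp to avoid overflow in math.exp for extreme logits.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient-descent logistic regression; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba((w, b), xi) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def roc_auc(y_true, scores):
    """AUC as the chance a random case outranks a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores for classes 0 and 1."""
    f1s = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)


def simulate(n, rng):
    """Synthetic people: (prs, bmi, age, baseline_dx, incident_cad). All values are made up."""
    people = []
    for _ in range(n):
        prs, bmi, age = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        dx = int(rng.random() < sigmoid(prs + bmi - 1.0))
        cad = int(rng.random() < sigmoid(0.8 * prs + 0.8 * bmi + 0.5 * age - 1.5))
        people.append((prs, bmi, age, dx, cad))
    return people


def run_sketch(seed=0):
    rng = random.Random(seed)
    prevalent = simulate(800, rng)
    incident_train = simulate(800, rng)
    incident_test = simulate(500, rng)

    # Stage 1 (prevalent cohort): predict a baseline diagnosis from PRS and BMI.
    dx_model = fit_logistic([[p[0], p[1]] for p in prevalent], [p[3] for p in prevalent])

    # Stage 2 (incident cohort): the predicted diagnosis probability is the meta-feature.
    def meta(p):
        return [predict_proba(dx_model, [p[0], p[1]]), p[2]]

    cad_model = fit_logistic([meta(p) for p in incident_train], [p[4] for p in incident_train])

    y_true = [p[4] for p in incident_test]
    scores = [predict_proba(cad_model, meta(p)) for p in incident_test]
    y_pred = [int(s >= 0.5) for s in scores]  # 0.5 cut-off is an arbitrary assumption
    return roc_auc(y_true, scores), macro_f1(y_true, y_pred)


if __name__ == "__main__":
    auc, f1 = run_sketch()
    print(f"AUC={auc:.2f}  macro-F1={f1:.2f}")
```

The point of the structure is that the final model never sees the raw genetic score directly; it sees the output of a model trained on a separate cohort, which is how a PRS can end up "embedded" in a predicted diagnosis. The numbers printed by this toy example come from simulated data and bear no relation to the reported AUC of 0.81 or F1-score of 0.65.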