JM-USDA Human Nutrition Research Center on Aging at Tufts University, Boston, MA.
Cardiovascular Epidemiology and Genetics Research Group, REGICOR Study Group, IMIM (Hospital del Mar Medical Research Institute), Barcelona, Catalonia, Spain.
J Am Heart Assoc. 2020 Apr 21;9(8):e015299. doi: 10.1161/JAHA.119.015299. Epub 2020 Apr 20.
Background Epigenome-wide association studies for cardiometabolic risk factors have discovered multiple loci associated with incident cardiovascular disease (CVD). However, few studies have sought to directly optimize a predictor of CVD risk. Furthermore, it is challenging to train multivariate models across multiple studies in the presence of study or batch effects.
Methods and Results Here, we analyzed existing DNA methylation data collected using the Illumina HumanMethylation450 microarray to create a predictor of CVD risk across 3 cohorts: Women's Health Initiative, Framingham Heart Study Offspring Cohort, and Lothian Birth Cohorts. We trained Cox proportional hazards-based elastic net regressions for incident CVD separately in each cohort and used a recently introduced cross-study learning approach to integrate these individual scores into an ensemble predictor. The methylation-based risk score was associated with CVD time-to-event in a held-out fraction of the Framingham data set (hazard ratio per SD=1.28, 95% CI, 1.10-1.50) and predicted myocardial infarction status in the independent REGICOR (Girona Heart Registry) data set (odds ratio per SD=2.14, 95% CI, 1.58-2.89). These associations remained after adjustment for traditional cardiovascular risk factors and were similar to those from elastic net models trained on a directly merged data set. Additionally, we investigated interactions between the methylation-based risk score and both genetic and biochemical CVD risk, showing preliminary evidence of enhanced performance in those with less traditional risk factor elevation.
Conclusions This investigation provides proof-of-concept for a genome-wide, CVD-specific epigenomic risk score and suggests that DNA methylation data may enable the discovery of high-risk individuals who would be missed by alternative risk metrics.
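The ensemble step described in the abstract (one score per cohort-specific model, combined into a single predictor and reported per SD) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the CpG identifiers (`cg_a`, `cg_b`, `cg_c`), the coefficients, and the equal default weights are all hypothetical placeholders. The cross-study learning approach used in the study derives the combination weights from the data, and the Cox elastic net fitting that would produce the coefficients is not shown here.

```python
from statistics import mean, pstdev


def linear_score(sample, coefs):
    """Linear predictor of one cohort-specific model:
    sum of coefficient * methylation value over the CpGs in that model."""
    return sum(coef * sample[cpg] for cpg, coef in coefs.items())


def ensemble_scores(samples, cohort_models, weights=None):
    """Weighted combination of the cohort-specific scores, rescaled to SD units.

    Equal weights are an assumption made for illustration only; a stacking
    procedure would estimate them instead.
    """
    if weights is None:
        weights = [1.0 / len(cohort_models)] * len(cohort_models)
    if len(weights) != len(cohort_models):
        raise ValueError("need exactly one weight per cohort model")

    raw = [
        sum(w * linear_score(s, m) for w, m in zip(weights, cohort_models))
        for s in samples
    ]
    mu, sd = mean(raw), pstdev(raw)
    if sd == 0:
        raise ValueError("scores are constant; cannot express them per SD")
    # Standardizing lets a downstream hazard or odds ratio be read "per SD".
    return [(r - mu) / sd for r in raw]


# Toy methylation beta values (made up), keyed by hypothetical CpG IDs.
samples = [
    {"cg_a": 0.10, "cg_b": 0.80, "cg_c": 0.45},
    {"cg_a": 0.30, "cg_b": 0.60, "cg_c": 0.50},
    {"cg_a": 0.55, "cg_b": 0.40, "cg_c": 0.35},
    {"cg_a": 0.70, "cg_b": 0.20, "cg_c": 0.65},
]

# One sparse coefficient set per training cohort (made-up values).
cohort_models = [
    {"cg_a": 1.2, "cg_b": -0.5},
    {"cg_a": 0.8, "cg_c": 0.4},
    {"cg_b": -0.9, "cg_c": 0.3},
]

scores = ensemble_scores(samples, cohort_models)
print([round(s, 3) for s in scores])
```

Because each cohort model is applied as a fixed set of coefficients, a new data set only needs the relevant CpG values to be scored; no individual-level training data have to be merged across studies, which is the practical appeal of combining per-cohort models when study or batch effects are present.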