Li Jiqing, Wei Jiate, Fu Ping, Gu Jianhua
Department of Emergency Medicine, Qilu Hospital of Shandong University, Shandong University, Jinan, 250012, Shandong, China.
Office of Hospital Management Research, Beijing Friendship Hospital, Capital Medical University, Beijing, 100050, China.
Heliyon. 2024 Sep 19;10(19):e38036. doi: 10.1016/j.heliyon.2024.e38036. eCollection 2024 Oct 15.
Most coronary artery disease (CAD) risk loci identified by genome-wide association studies (GWAS) are located in non-coding regions, which hampers the interpretation of how they confer CAD risk. It is essential to integrate GWAS with molecular trait data to further explore the genetic basis of CAD.
We used the probabilistic Mendelian randomization (PMR) method to identify potential proteins involved in CAD by integrating CAD GWAS data (∼76,014 cases and ∼264,785 controls) with human plasma proteomes (N = 35,559). To validate candidate proteins, we then performed Bayesian co-localization analysis and confirmatory PMR analyses using independent plasma proteome data (N = 7752) and gene expression data (N1 = 213, N2 = 670). We further investigated the associations between candidate proteins and CAD-related traits, and explored the rationality and biological functions of candidate proteins through disease enrichment, cell type-specific, GO, and KEGG enrichment analyses.
This study inferred that the plasma abundance of 30 proteins, such as PLG, IL15RA, and CSNK2A1, was causally associated with CAD (P < 0.05/4408, Bonferroni correction). PLG, PCSK9, COLEC11, ZNF180, ERP29, TCP1, FN1, CDH5, IL15RA, MGAT4B, TNFRSF6B, DNM2, and TGF1R were replicated in the confirmatory PMR (P < 0.05). The proteins encoded by PCSK9 (PP.H4 = 0.99), APOB (PP.H4 = 0.89), FN1 (PP.H4 = 0.87), and APOC1 (PP.H4 = 0.78) shared a common variant with CAD. MTAP, TCP1, APOC2, ERP29, MORF4L1, C19orf80, PCSK9, APOC1, EPOR, DNM2, TNFRSF6B, CDKN2B, and LDLR were supported by PMR at the transcriptome level in whole blood and/or coronary arteries (P < 0.05). Enrichment analysis identified multiple pathways involved in cholesterol metabolism, regulation of lipoprotein levels, and regulation of telomerase, such as cholesterol metabolism (hsa04979, P = 2.25E-7), plasma lipoprotein particle clearance (GO:0034381, P = 5.47E-5), and regulation of telomerase activity (GO:0051972, P = 2.34E-3).
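The Bonferroni threshold quoted above (0.05/4408) works out to roughly 1.13E-5. The following is a minimal sketch of that arithmetic, assuming 4408 is the number of proteins tested and 0.05 is the family-wise significance level; the function names are hypothetical and are not taken from the study's analysis code.

```python
# Illustrative sketch only: the 0.05 level and the 4408 tests come from the
# abstract; the helper names below are assumptions made for this example.
ALPHA = 0.05      # family-wise significance level
N_TESTS = 4408    # number of tests being corrected for


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test P-value threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = ALPHA, n_tests: int = N_TESTS) -> bool:
    """Return True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold(ALPHA, N_TESTS)
print(f"Bonferroni threshold: {threshold:.3e}")
```

Under this correction, a protein with P = 1E-6 would pass the discovery threshold, whereas one with P = 1E-4 would not, even though both fall below the nominal 0.05 level used in the confirmatory analyses.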
Our integration analysis has identified 30 candidate proteins for CAD, which may provide important leads to design future functional studies and potential drug targets for CAD.