Department of Psychiatry, Washington University School of Medicine, 4444 Forest Park Ave., Box 8134, St. Louis, MO, 63108, USA.
NeuroGenomics and Informatics Center, Washington University School of Medicine, St Louis, MO, USA.
Genome Med. 2022 Dec 12;14(1):140. doi: 10.1186/s13073-022-01140-9.
BACKGROUND: Human proteins are widely used as drug targets. Integrating large-scale protein-level genome-wide association studies (GWAS) with disease-related GWAS has connected genetic variation to disease mechanisms via proteins. Previous proteome-by-phenome-wide Mendelian randomization (MR) studies have focused mainly on plasma proteomes, and previous MR studies using the brain proteome reported protein effects only on a set of pre-selected tissue-specific diseases. No studies, however, have used high-throughput proteomics from multiple tissues to perform MR on hundreds of phenotypes.
METHODS: We performed MR and colocalization analysis using protein quantitative trait loci (pQTLs) from multiple tissues (cerebrospinal fluid (CSF), plasma, and brain, drawn from pre- and post-meta-analysis of several disease-focused cohorts, including Alzheimer disease (AD)) as instrumental variables to infer protein effects on 211 phenotypes. The phenotypes cover seven broad categories: biological traits, blood traits, cancer types, neurological diseases, other diseases, personality traits, and other risk factors. We first ran these analyses with cis pQTLs, which are known to be less prone to horizontal pleiotropy. Next, we included both cis and trans conditionally independent pQTLs that passed the genome-wide significance threshold, keeping only variants associated with fewer than five proteins to minimize pleiotropic effects. We compared the tissue-specific protein effects on phenotypes across the different categories. Finally, we integrated the MR-prioritized proteins with the druggable genome to identify new potential targets.
RESULTS: In the MR and colocalization analysis using study-wide significant cis pQTLs as instrumental variables, we identified 33 CSF, 13 plasma, and five brain proteins as putatively causal for 37, 18, and eight phenotypes, respectively. After expanding the instrumental variables to include genome-wide significant cis and trans pQTLs, we identified a total of 58 CSF, 32 plasma, and nine brain proteins associated with 58, 44, and 16 phenotypes, respectively. For protein-phenotype associations found in more than one tissue, the directions of association were consistent across tissues for 13 (87%) pairs. Because most proteins were associated with only one valid instrumental variable after clumping, we could not apply methods that correct for horizontal pleiotropy; the observed protein-phenotype associations are therefore consistent with either a causal role or horizontal pleiotropy. Between 66.7 and 86.3% of the disease-causing proteins overlapped with the druggable genome. Finally, between one and three proteins, depending on the tissue, were connected with at least one drug compound for one phenotype in both the DrugBank and ChEMBL databases.
CONCLUSIONS: Integrating multi-tissue pQTLs with MR and the druggable genome may open doors to pinpointing novel interventions for complex traits with no effective treatments, such as ovarian and lung cancers.
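The two filtering and estimation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not name the estimator used; the Wald ratio below is the standard choice when a protein has a single valid instrument and is shown here as an assumption, with a first-order delta-method standard error. The record fields (`variant`, `protein`) and both function names are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def filter_pleiotropic(pqtls, max_proteins=5):
    """Keep pQTL records whose variant is associated with fewer than
    `max_proteins` distinct proteins (hypothetical record layout)."""
    proteins_per_variant = defaultdict(set)
    for rec in pqtls:
        proteins_per_variant[rec["variant"]].add(rec["protein"])
    return [
        rec for rec in pqtls
        if len(proteins_per_variant[rec["variant"]]) < max_proteins
    ]


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate (assumed estimator, not confirmed
    by the abstract).

    beta_exposure: variant effect on protein level (pQTL effect)
    beta_outcome:  variant effect on the phenotype (GWAS effect)
    se_outcome:    standard error of beta_outcome
    Returns (estimate, standard error, two-sided p-value).
    """
    estimate = beta_outcome / beta_exposure
    # First-order delta method: ignores uncertainty in beta_exposure.
    se = abs(se_outcome / beta_exposure)
    z = estimate / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


if __name__ == "__main__":
    # Toy records: rs1 hits five proteins and is dropped, rs2 is kept.
    records = [{"variant": "rs1", "protein": f"P{i}"} for i in range(5)]
    records.append({"variant": "rs2", "protein": "P9"})
    print(filter_pleiotropic(records))
    print(wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02))
```

With only one instrument per protein, an estimate of this kind cannot separate a causal effect from horizontal pleiotropy, which is why the abstract pairs MR with colocalization and hedges its causal claims.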