Liu Yuqi, Elmas Abdulkadir, Huang Kuan-Lin
Department of Genetics and Genomic Sciences, Department of Artificial Intelligence and Human Health, Center for Transformative Disease Modeling, Tisch Cancer Institute, Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giae113.
Cancer mutations are often assumed to alter proteins, thus promoting tumorigenesis. However, how mutations affect protein expression, in addition to gene expression, has rarely been systematically investigated. This matters because mRNA and protein levels frequently show only moderate correlation, owing to factors such as translation efficiency and protein degradation. Proteogenomic datasets from large tumor cohorts provide an opportunity to systematically analyze the effects of somatic mutations on mRNA and protein abundance and to identify mutations with distinct impacts at these two molecular levels.
We conduct a comprehensive analysis of mutation impacts on mRNA- and protein-level expression in 953 cancer cases with paired genomic and global proteomic profiling across 6 cancer types. Protein-level impacts are validated for 47.2% of the somatic expression quantitative trait loci (seQTLs), including CDH1 and MSH3 truncations, as well as other mutations from likely "long-tail" driver genes. Devising a statistical pipeline for identifying somatic protein-specific QTLs (spsQTLs), we reveal several gene mutations, including NF1 and MAP2K4 truncations and TP53 missense mutations, that show a disproportionate influence on protein abundance not readily explained by transcriptomics. Cross-validation against data from massively parallel assays of variant effects (MAVE) shows that TP53 missense mutations associated with high tumor TP53 protein levels are more likely to be experimentally confirmed as functional.
This study reveals that somatic mutations can exhibit distinct impacts on mRNA and protein levels, underscoring the necessity of integrating proteogenomic data to comprehensively identify functionally significant cancer mutations. These insights provide a framework for prioritizing mutations for further functional validation and therapeutic targeting.
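To make the idea of a protein-specific effect concrete, the following is a minimal sketch, not the authors' pipeline. It assumes one simple way to ask whether a mutation shifts protein abundance beyond what mRNA explains: fit a least-squares line of protein on mRNA in wild-type samples, then average the residuals of the mutated samples. The function names (`ols_fit`, `protein_specific_effect`) and the toy values are hypothetical; the published method may differ in model, covariates, and significance testing.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def protein_specific_effect(mrna, protein, mutated):
    """Mean protein residual of mutated samples after adjusting for mRNA.

    The mRNA-to-protein line is fit on wild-type samples only (an assumption
    of this sketch), so a value near zero means the mutant protein level is
    about what their mRNA level predicts.
    """
    wt_mrna = [m for m, is_mut in zip(mrna, mutated) if not is_mut]
    wt_protein = [p for p, is_mut in zip(protein, mutated) if not is_mut]
    slope, intercept = ols_fit(wt_mrna, wt_protein)
    residuals = [
        p - (slope * m + intercept)
        for m, p, is_mut in zip(mrna, protein, mutated)
        if is_mut
    ]
    return mean(residuals)


# Toy data (hypothetical): four wild-type and two mutated samples.
mrna = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0]
protein = [1.0, 2.0, 3.0, 4.0, 0.5, 1.5]
mutated = [False, False, False, False, True, True]

effect = protein_specific_effect(mrna, protein, mutated)
print(f"mRNA-adjusted protein shift in mutated samples: {effect:.2f}")
```

A negative value here would correspond to mutated samples carrying less protein than their mRNA level predicts, the kind of pattern the abstract describes for truncations; a real analysis would also need a significance test and multiple-testing correction across genes.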