Protein prediction models support widespread post-transcriptional regulation of protein abundance by interacting partners.

Affiliations

Department of Medicine/Cardiology, University of Colorado School of Medicine, Aurora, Colorado, United States of America.

Consortium for Fibrosis Research and Translation, University of Colorado School of Medicine, Aurora, Colorado, United States of America.

Publication information

PLoS Comput Biol. 2022 Nov 10;18(11):e1010702. doi: 10.1371/journal.pcbi.1010702. eCollection 2022 Nov.

DOI:10.1371/journal.pcbi.1010702
PMID:36356032
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9681107/
Abstract

Protein and mRNA levels correlate only moderately. The availability of proteogenomics data sets with protein and transcript measurements from matching samples is providing new opportunities to assess the degree to which protein levels in a system can be predicted from mRNA information. Here we examined the contributions of input features in protein abundance prediction models. Using large proteogenomics data from 8 cancer types within the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data set, we trained models to predict the abundance of over 13,000 proteins using matching transcriptome data from up to 958 tumor or normal adjacent tissue samples each, and compared predictive performances across algorithms, data set sizes, and input features. Over one-third of proteins (4,648) showed relatively poor predictability (elastic net r ≤ 0.3) from their cognate transcripts. Moreover, we found widespread occurrences where the abundance of a protein is considerably less well explained by its own cognate transcript level than by that of one or more trans-locus transcripts. The incorporation of additional trans-locus transcript abundance data as input features increasingly improved the ability to predict sample protein abundance. Transcripts that contribute to non-cognate protein abundance primarily involve those encoding known or predicted interaction partners of the protein of interest, including not only large multi-protein complexes as previously shown, but also small stable complexes in the proteome with only one or a few stable interacting partners. Network analysis further shows a complex proteome-wide interdependency of protein abundance on the transcript levels of multiple interacting partners. The predictive model analysis here therefore supports the view that protein-protein interactions, including those in small protein complexes, exert post-transcriptional influence on proteome composition more broadly than previously recognized. Moreover, the results suggest that mRNA and protein co-expression analysis may have utility for finding gene interactions and predicting expression changes in biological systems.
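The comparison at the heart of the abstract, cognate transcript alone versus cognate plus a trans-locus transcript as input features, can be illustrated with a small sketch. This is not the authors' pipeline: it uses synthetic data rather than CPTAC measurements, and it swaps in ordinary least squares for the elastic net models named in the abstract so that it runs on the Python standard library alone. The effect sizes (0.3 for the cognate transcript, 0.8 for a hypothetical interaction partner), the sample counts, and all function names are assumptions chosen for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_ols(X, y):
    """Least-squares fit with intercept, solved by Gauss-Jordan elimination
    on the normal equations. Returns [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0]) + 1
    A = [[1.0] + list(row) for row in X]
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)]
         + [sum(A[k][i] * y[k] for k in range(n))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in X]


def simulate(n, rng):
    """Synthetic samples: (cognate mRNA, partner mRNA, protein abundance).
    Assumed toy model: the protein depends weakly on its own transcript
    and strongly on the transcript of an interaction partner."""
    rows = []
    for _ in range(n):
        cognate = rng.gauss(0, 1)
        partner = rng.gauss(0, 1)
        protein = 0.3 * cognate + 0.8 * partner + rng.gauss(0, 0.5)
        rows.append((cognate, partner, protein))
    return rows


rng = random.Random(0)
train, test = simulate(300, rng), simulate(300, rng)


def heldout_r(feature_cols):
    """Fit on the training samples, report Pearson r on held-out samples."""
    X_train = [[row[c] for c in feature_cols] for row in train]
    X_test = [[row[c] for c in feature_cols] for row in test]
    beta = fit_ols(X_train, [row[2] for row in train])
    return pearson(predict(beta, X_test), [row[2] for row in test])


r_cognate = heldout_r([0])          # cognate transcript only
r_with_partner = heldout_r([0, 1])  # cognate + trans-locus partner transcript
print(f"cognate only:      r = {r_cognate:.2f}")
print(f"cognate + partner: r = {r_with_partner:.2f}")
```

In this toy setting the cognate-only model lands near the "poor predictability" range described in the abstract, and adding the partner transcript raises the held-out correlation substantially. With thousands of candidate trans-locus features, as in the study, a regularized method such as elastic net would be needed in place of plain least squares.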


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c82c/9681107/15a998d9a97b/pcbi.1010702.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c82c/9681107/081d22f93c93/pcbi.1010702.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c82c/9681107/014fdb8b0ab4/pcbi.1010702.g003.jpg


