
Joint learning improves protein abundance prediction in cancers.

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA.

Department of Internal Medicine, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA.

Publication information

BMC Biol. 2019 Dec 23;17(1):107. doi: 10.1186/s12915-019-0730-9.

DOI: 10.1186/s12915-019-0730-9
PMID: 31870366
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6929375/
Abstract

BACKGROUND

The classic central dogma in biology is the information flow from DNA to mRNA to protein, yet complicated regulatory mechanisms underlying protein translation often lead to weak correlations between mRNA and protein abundances. This is particularly the case in cancer samples and when evaluating the same gene across multiple samples.
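To make "evaluating the same gene across multiple samples" concrete, the following is a minimal sketch that computes a gene-wise Pearson correlation between mRNA and protein abundance across samples. The data are synthetic, and the gene names (`GENE_A`, `GENE_B`), noise levels, and helper functions are assumptions introduced for illustration; none of them come from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def gene_wise_correlations(mrna, protein):
    """For each gene, correlate its mRNA and protein values across samples.

    mrna, protein: dict mapping gene name -> list of values, one per sample,
    with samples in the same order in both dicts.
    """
    return {gene: pearson(mrna[gene], protein[gene]) for gene in mrna}


# Synthetic toy data (hypothetical): one gene whose protein level tracks its
# mRNA closely, and one where extra noise weakens the relationship.
random.seed(0)
n_samples = 50
mrna, protein = {}, {}
for gene, noise_sd in [("GENE_A", 0.1), ("GENE_B", 5.0)]:
    mrna[gene] = [random.gauss(0, 1) for _ in range(n_samples)]
    protein[gene] = [m + random.gauss(0, noise_sd) for m in mrna[gene]]

corr = gene_wise_correlations(mrna, protein)
for gene, r in corr.items():
    print(f"{gene}: r = {r:.2f}")
```

In this toy setup the second gene stands in for a gene whose protein abundance is only weakly explained by its own transcript level, which is the situation the prediction method is meant to address.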

RESULTS

Here, we report a method for predicting proteome from transcriptome, using a training dataset provided by NCI-CPTAC and TCGA, consisting of transcriptome and proteome data from 77 breast and 105 ovarian cancer samples. First, we establish a generic model capturing the correlation between mRNA and protein abundance of a single gene. Second, we build a gene-specific model capturing the interdependencies among multiple genes in a regulatory network. Third, we create a cross-tissue model by joint learning the information of shared regulatory networks and pathways across cancer tissues. Our method ranked first in the NCI-CPTAC DREAM Proteogenomics Challenge, and the predictive performance is close to the accuracy of experimental replicates. Key functional pathways and network modules controlling the proteomic abundance in cancers were revealed, in particular metabolism-related genes.
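The three modelling steps above can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the abstract does not name the learning algorithm, so closed-form ridge regression is swapped in as a stand-in, the data are simulated, and every function and variable name (`ridge_fit`, `simulate_tissue`, the weight matrix `W`) is hypothetical. Only the sample counts 77 and 105 are taken from the abstract.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an intercept; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def ridge_predict(model, X):
    w, b = model
    return X @ w + b


def simulate_tissue(rng, n_samples, W, noise=0.3):
    """Simulate mRNA (samples x genes) and protein = mRNA @ W + noise."""
    mrna = rng.normal(size=(n_samples, W.shape[0]))
    protein = mrna @ W + rng.normal(scale=noise, size=(n_samples, W.shape[1]))
    return mrna, protein


rng = np.random.default_rng(0)
n_genes = 8
# Assumed shared "regulatory network": each protein depends on its own mRNA
# and on the mRNA of one neighbouring gene, identically in both tissues.
W = np.eye(n_genes) * 0.5
for j in range(n_genes):
    W[(j + 1) % n_genes, j] = 0.5

breast_mrna, breast_prot = simulate_tissue(rng, 77, W)
ovarian_mrna, ovarian_prot = simulate_tissue(rng, 105, W)
test_mrna, test_prot = simulate_tissue(rng, 200, W)  # held-out samples
target = 0  # index of the gene whose protein abundance is predicted

# 1) Generic model: a single mRNA -> protein relationship pooled over all genes.
generic = ridge_fit(breast_mrna.reshape(-1, 1), breast_prot.reshape(-1))
pred_generic = ridge_predict(generic, test_mrna[:, [target]])

# 2) Gene-specific model: the target protein predicted from the mRNA of all genes.
specific = ridge_fit(breast_mrna, breast_prot[:, target])
pred_specific = ridge_predict(specific, test_mrna)

# 3) Cross-tissue model: the same features, trained on both tissues pooled.
joint = ridge_fit(
    np.vstack([breast_mrna, ovarian_mrna]),
    np.concatenate([breast_prot[:, target], ovarian_prot[:, target]]),
)
pred_joint = ridge_predict(joint, test_mrna)

truth = test_prot[:, target]
scores = {
    "generic": float(np.corrcoef(pred_generic, truth)[0, 1]),
    "gene_specific": float(np.corrcoef(pred_specific, truth)[0, 1]),
    "cross_tissue": float(np.corrcoef(pred_joint, truth)[0, 1]),
}
for name, r in scores.items():
    print(f"{name}: Pearson r = {r:.3f}")
```

Because the simulated network is identical in both tissues, pooling simply adds training samples here; real tissues would share only part of their regulatory structure, so the benefit of joint learning would depend on how much is actually shared.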

CONCLUSIONS

We present a method to predict proteome from transcriptome, leveraging data from different cancer tissues to build a trans-tissue model, and suggest how to integrate information from multiple cancers to provide a foundation for further research.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944d/6929375/1ffeb6e9c2c5/12915_2019_730_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944d/6929375/b8b3e57b8a57/12915_2019_730_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944d/6929375/b0547819d71f/12915_2019_730_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944d/6929375/7ee74c318e5d/12915_2019_730_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944d/6929375/9abddfa1861f/12915_2019_730_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944d/6929375/06af8d782503/12915_2019_730_Fig5_HTML.jpg
