Proteomizer: Leveraging the Transcriptome-Proteome Mismatch to Infer Novel Gene Regulatory Relations.

Authors

Deangeli Giulio, Spillantini Maria Grazia, Liò Pietro

Affiliations

University of Cambridge, Department of Clinical Neurosciences, Clifford Allbutt Building, Hills Road, CB2 0HA Cambridge, UK.

University of Cambridge, Department of Computer Science and Technology, William Gates Building, 15 J. J. Thomson Ave, CB3 0FD Cambridge, UK.

Publication information

bioRxiv. 2025 Jun 27:2025.06.22.660946. doi: 10.1101/2025.06.22.660946.

DOI:10.1101/2025.06.22.660946

PMID:40666834

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12262288/

Abstract

The correlation between transcriptomic (Tx) and proteomic (Px) profiles remains modest, typically around [value missing] across genes and [value missing] across samples, limiting the utility of transcriptomic data as a proxy for protein abundance. To address this, we introduce Proteomizer, a deep learning platform designed to infer a sample's Px landscape from its Tx and miRNomic (Mx) profiles. Trained on 8,613 matched Tx-Mx-Px samples from TCGA and CPTAC, Proteomizer achieved a Tx-Px correlation of [value missing], representing the highest performance reported to date for this task. We further developed a Monte Carlo simulation framework to evaluate the impact of proteomization on differential expression analysis. Proteomizer substantially improved the accuracy of differential gene expression detection, with p-value precision increasing by up to 62-fold, and by as much as six orders of magnitude for a subset of genes enriched in mitochondrial and ribosomal functions. However, performance gains did not generalize to unseen tissue types or to datasets generated using different protocols. Finally, we applied explainable AI (XAI) techniques to identify regulatory relations contributing to Tx-Px discrepancies. Our predictions for 100 highly annotated genes were cross-compared against a literature-based biological knowledge graph of 322 million annotations: our explainers achieved a ROC-AUC of 0.74 in predicting miRNA-gene downregulation interactions. To our knowledge, this is the first study to systematically evaluate the biological relevance, limitations, and interpretability of proteomization models, establishing Proteomizer as a state-of-the-art tool for multiomic integration and hypothesis generation.
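
As a rough illustration of the two correlation views mentioned in the abstract, the Python sketch below computes Tx-Px Pearson correlations on synthetic data. It is not the authors' code. The function name `rowwise_pearson`, the synthetic matrices, and the reading of "across genes" as a per-sample correlation taken over genes (and "across samples" as a per-gene correlation taken over samples) are all assumptions made for illustration.

```python
import numpy as np


def rowwise_pearson(a, b):
    """Pearson correlation between matching rows of two equally shaped 2-D arrays."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    return num / den


# Synthetic stand-ins for matched profiles (hypothetical data, not TCGA/CPTAC).
# Rows are samples, columns are genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 50, 200
tx = rng.normal(size=(n_samples, n_genes))
# Protein abundance modelled as a noisy linear function of transcript abundance.
px = 0.5 * tx + rng.normal(size=(n_samples, n_genes))

# One correlation per sample, computed over all genes in that sample.
per_sample_r = rowwise_pearson(tx, px)

# One correlation per gene, computed over all samples for that gene.
per_gene_r = rowwise_pearson(tx.T, px.T)

print("mean per-sample correlation:", per_sample_r.mean())
print("mean per-gene correlation:", per_gene_r.mean())
```

Replacing `tx` with model-predicted protein abundances in the same layout would give the corresponding predicted-versus-measured correlations, under the same assumptions.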

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ba1/12262288/cc5370b7ea1d/nihpp-2025.06.22.660946v1-f0002.jpg
