Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications.

Affiliations

Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India.

Simulation and Modeling Sciences, Pfizer Pharma GmbH, Berlin, Germany.

Publication information

Mol Cell Proteomics. 2022 Apr;21(4):100220. doi: 10.1016/j.mcpro.2022.100220. Epub 2022 Feb 26.

DOI:10.1016/j.mcpro.2022.100220

PMID:35227895

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9020135/

Abstract

We carried out a proteogenomic analysis of the breast cancer transcriptomic and proteomic data available from the Clinical Proteomic Tumor Analysis Consortium resource to identify novel peptides arising from alternative splicing events and other noncanonical expression. Our pipeline consisted of de novo transcript assembly, a six-frame-translated custom database, and a combination of search engines. The 4,387 novel peptide sequences initially identified were further screened with the PepQuery validation tool (Clinical Proteomic Tumor Analysis Consortium), which yielded 1,558 novel peptides. We used the 1,558 peptides validated through PepQuery to assess their functional and clinical significance, leaving the rest to be verified with other validation tools and approaches. The novel peptides mapped to known gene sequences as well as to genomic regions not yet defined as translated: 580 novel peptides mapped to known protein-coding genes, 147 to non-protein-coding genes, and 831 belonged to novel translational sequences. The novel peptides belonging to protein-coding genes represented alternatively spliced events or 5' or 3' extensions, whereas the others represented translation from pseudogenes, long noncoding RNAs, or uncharacterized protein-coding sequences, mostly from the intronic regions of known genes. Seventy-six of the 580 protein-coding genes were associated with cancer hallmark genes, which included key oncogenes, transcription factors, kinases, and cell surface receptors. Survival association analysis of the 76 novel peptide sequences revealed 10 of them to be significant, and we present a panel of six novel peptides whose high expression was strongly associated with poor survival of patients with the human epidermal growth factor receptor 2-enriched subtype.
Our analysis presents a landscape of novel peptides of different types that may be expressed in breast cancer tissues, whereas their presence in full-length functional proteins needs further investigation.
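
The abstract mentions a six-frame-translated custom database built from assembled transcripts. As a rough illustration of what six-frame translation means, and not a reproduction of the authors' pipeline or tools, the following Python sketch translates one nucleotide sequence in all three forward and three reverse-complement reading frames. The function names (`six_frame`, `translate`, `reverse_complement`) and the example sequence are hypothetical, and the standard genetic code is assumed.

```python
# Illustrative sketch only: names and input are hypothetical, not from the paper.
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq: str) -> dict:
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame("ATGGCCTAA").items():
        print(frame, protein)
```

In a proteogenomic setting, the translated frames would typically be split at stop codons (`*`) and the resulting stretches written to a FASTA file for the search engines; that step is omitted here to keep the sketch short.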

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dc4/9020135/3c21060bd1f8/fx1.jpg
