‡Interdisciplinary Program of Integrated OMICS for Biomedical Science, The Graduate School, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; §Yonsei Proteome Research Center, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea.
Mol Cell Proteomics. 2019 Aug;18(8):1651-1668. doi: 10.1074/mcp.RA119.001456. Epub 2019 Jun 17.
Fusion proteoforms are translation products derived from gene fusion. Although very rare, fusion proteoforms play important roles in biomedical science. For example, fusion proteoforms influence the development of tumors by serving as cancer markers or cell cycle regulators. Although numerous studies have reported bioinformatics tools that can predict fusion transcripts, few proteogenomic tools are available that can predict and identify fusion proteoforms. In this study, we develop a versatile proteogenomic tool, "FusionPro," which facilitates the identification of fusion transcripts and their potential translatable peptides. FusionPro provides an independent gene fusion prediction module and can build sequence databases for annotated fusion proteoforms. FusionPro shows greater sensitivity than the available fusion finders when analyzing simulated or real RNA sequencing data sets. We use FusionPro to identify 18 fusion junction peptides and three potential fusion-derived peptides by MS/MS-based analysis of leukemia cell lines (Jurkat and K562) and ovarian cancer tissues from the Clinical Proteomic Tumor Analysis Consortium. Among the identified fusion proteins, we molecularly validate two fusion junction isoforms and a translation product of […]. Moreover, sequence analysis suggests that this fusion protein participates in cell cycle progression. In addition, our prediction results indicate that fusion transcripts often have multiple fusion junctions and that these fusion junctions tend to be distributed in a nonrandom pattern at both the chromosome and gene levels. Thus, FusionPro allows users to detect various types of fusion translation products using a transcriptome-informed approach and to gain a comprehensive understanding of the formation and biological roles of fusion proteoforms.
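To make the notion of a "fusion junction peptide" concrete, the following is a minimal Python sketch, not FusionPro's implementation. It assumes a simplified in silico tryptic digest (cleavage after K or R, no missed cleavages, no proline rule) and reports the peptides of a head-plus-tail fusion protein that contain residues from both partners. The function names, the `min_len` parameter, and the example sequences are hypothetical and chosen only for illustration.

```python
from __future__ import annotations


def tryptic_digest(protein: str) -> list[tuple[int, str]]:
    """Simplified trypsin digest: cleave after every K or R.

    Assumption: no missed cleavages and no proline rule.
    Returns (start_index, peptide) pairs.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptides.append((start, protein[start:i + 1]))
            start = i + 1
    if start < len(protein):
        # Trailing peptide that does not end in K or R
        peptides.append((start, protein[start:]))
    return peptides


def junction_peptides(head: str, tail: str, min_len: int = 6) -> list[str]:
    """Return digest peptides that span the head/tail fusion junction.

    A peptide spans the junction when it contains the last residue of
    `head` and the first residue of `tail`.
    """
    fusion = head + tail
    junction = len(head)  # index of the first residue contributed by the tail
    return [
        pep
        for start, pep in tryptic_digest(fusion)
        if start < junction < start + len(pep) and len(pep) >= min_len
    ]


# Hypothetical partner sequences, for illustration only
print(junction_peptides("MAGSKLTPEDRASL", "GTWEKPLLR"))  # ['ASLGTWEK']
```

Only a junction-spanning peptide such as `ASLGTWEK` is specific to the fusion; peptides that lie wholly within one partner could equally come from the unfused parent protein. When the junction coincides with a cleavage site, this simplified digest yields no junction peptide at all.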