J Proteome Res. 2018 Dec 7;17(12):4235-4242. doi: 10.1021/acs.jproteome.8b00548. Epub 2018 Oct 15.
One of the goals of the Chromosome-Centric Human Proteome Project (C-HPP) is to map and functionally characterize the protein isoforms produced by alternative splicing of genes. However, identifying alternative splice variants (ASVs) by mass spectrometry remains a major challenge because ASVs usually share highly homologous peptide sequences. A routine protein sequence analysis suggests that more than half of the investigated proteins do not generate two or more uniquely mapping peptides, which would be needed to distinguish their isoforms. Here, we develop a new proteogenomics method, named "ASV-ID" (alternative splicing variants identification), which identifies ASVs by using a cell type-specific protein sequence database supported by RNA-Seq data. Using this workflow, we identify 1935 distinct proteins under highly stringent conditions. For 841 proteins, transcript evidence helps us distinguish them from other isoforms even though they are not predicted to yield two or more uniquely mapping peptides. We also demonstrate that ASV-ID enables detection of 19 differentially expressed isoforms present in several cell lines. Thus, a workflow using ASV-ID has the potential to map difficult, yet-to-be-identified protein isoforms in a simple and robust way.
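The "two or more uniquely mapping peptides" criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the ASV-ID implementation. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages), a hypothetical minimum peptide length of 6, and two made-up toy isoform sequences. The function names `tryptic_peptides` and `uniquely_mapping` are illustrative only.

```python
from collections import defaultdict


def tryptic_peptides(seq, min_len=6):
    """In silico digest: cleave after K/R unless followed by P (assumed rule)."""
    peptides, start = set(), 0
    for i, aa in enumerate(seq):
        at_end = i == len(seq) - 1
        if at_end or (aa in "KR" and seq[i + 1] != "P"):
            # Keep only peptides long enough to be informative (assumed cutoff)
            if i + 1 - start >= min_len:
                peptides.add(seq[start:i + 1])
            start = i + 1
    return peptides


def uniquely_mapping(isoforms, min_len=6):
    """Return, per isoform, the peptides that occur in no other isoform."""
    owners = defaultdict(set)
    for name, seq in isoforms.items():
        for pep in tryptic_peptides(seq, min_len):
            owners[pep].add(name)
    unique = {name: set() for name in isoforms}
    for pep, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(pep)
    return unique


# Toy example (made-up sequences): iso_b lacks one internal segment of iso_a
toy_isoforms = {
    "iso_a": "MASTEELKGDLVNAERAAGTWLSEKQQLLDAYTR",
    "iso_b": "MASTEELKAAGTWLSEKQQLLDAYTR",
}
unique = uniquely_mapping(toy_isoforms)
# Isoforms that meet the "two or more uniquely mapping peptides" criterion
distinguishable = [n for n, peps in unique.items() if len(peps) >= 2]
print(unique, distinguishable)
```

In this toy case neither isoform reaches two uniquely mapping peptides, which is the situation where peptide evidence alone cannot separate isoforms and additional evidence, such as the transcript data described above, becomes useful.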