Kroll José E, de Souza Sandro J, de Souza Gustavo A
Institute of Bioinformatics and Biotechnology, Natal, Brazil; Brain Institute, UFRN, Natal, Brazil.
Brain Institute, UFRN, Natal, Brazil.
PeerJ. 2014 Nov 13;2:e673. doi: 10.7717/peerj.673. eCollection 2014.
Integration of transcriptome data is a crucial step for the identification of rare protein variants in mass-spectrometry (MS) data with important consequences for all branches of biotechnology research. Here, we used Splooce, a database of splicing variants recently developed by us, to search MS data derived from a variety of human tumor cell lines. More than 800 new protein variants were identified whose corresponding MS spectra were specific to protein entries from Splooce. Although the types of splicing variants (exon skipping, alternative splice sites and intron retention) were found at the same frequency as in the transcriptome, we observed a large variety of modifications at the protein level induced by alternative splicing events. Surprisingly, we found that 40% of all protein modifications induced by alternative splicing led to the use of alternative translation initiation sites. Other modifications include frameshifts in the open reading frame and inclusion or deletion of peptide sequences. To make the dataset generated here available to the community in a more effective form, the Splooce portal (http://www.bioinformatics-brazil.org/splooce) was modified to report the alternative splicing events supported by MS data.
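The idea of an MS identification that is "specific to protein entries from Splooce" can be illustrated with a minimal sketch. It is not the pipeline used in the study: it assumes in silico trypsin digestion (cleavage after K or R unless followed by P, no missed cleavages) and a minimum peptide length of 6, neither of which is stated in the abstract. The function names and the toy sequences are hypothetical. A peptide counts as variant-specific when no reference protein yields it.

```python
def tryptic_peptides(protein, min_len=6):
    """Digest a protein in silico: cleave after K or R unless the next residue is P.

    Assumption: no missed cleavages; peptides shorter than min_len are dropped.
    """
    peptides, start = set(), 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            if i + 1 - start >= min_len:
                peptides.add(protein[start:i + 1])
            start = i + 1
    # Keep the C-terminal remainder if it is long enough.
    if len(protein) - start >= min_len:
        peptides.add(protein[start:])
    return peptides


def variant_specific_peptides(reference, variants, min_len=6):
    """Return, per variant entry, the peptides absent from every reference protein."""
    ref_peps = set()
    for seq in reference.values():
        ref_peps |= tryptic_peptides(seq, min_len)
    return {
        name: tryptic_peptides(seq, min_len) - ref_peps
        for name, seq in variants.items()
    }


# Toy data (hypothetical): three "exons" of 10 residues each.
exon1, exon2, exon3 = "MSTLLEAKGD", "VWQPLNRAEE", "LFTYGKHHSW"
reference = {"GENE1_canonical": exon1 + exon2 + exon3}
variants = {"GENE1_exon2_skip": exon1 + exon3}  # exon-skipping isoform

specific = variant_specific_peptides(reference, variants)
print(specific)
```

In this toy case the only variant-specific peptide is the one spanning the new exon 1 to exon 3 junction, which is the kind of evidence that would tie a spectrum to a splicing variant rather than to the canonical protein.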