Automatic annotation of experimentally derived, evolutionarily conserved post-translational modifications onto multiple genomes.
Affiliation
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, MD, USA.
Publication information
Database (Oxford). 2011 May 13;2011:bar019. doi: 10.1093/database/bar019. Print 2011.
New-generation sequencing technologies have significantly increased the number of complete genomes. Functional characterization of these genomes, for example by high-throughput proteomics, is an important but challenging task because existing experimental techniques are difficult to scale up. With comparative genomics techniques, experimental results can be transferred from one genome to another, and errors can be minimized by requiring that a result be observed in multiple genomes. In this study, protein phosphorylation, an essential component of many cellular processes, is studied using data from large-scale proteomics analyses of the phosphoproteome. Phosphorylation sites from Homo sapiens, Mus musculus and Drosophila melanogaster phosphopeptide data sets were mapped onto conserved domains in the manually curated portion of NCBI's Conserved Domain Database (CDD). In this subset, 25 phosphorylation sites were found to be evolutionarily conserved among the three species studied. Transferring the phosphorylation annotation of these conserved sites onto sequences sharing the same conserved domains yielded 3253 phosphosite annotations for proteins from Coelomata, the taxonomic division that spans H. sapiens, M. musculus and D. melanogaster. The method scales automatically, so as the amount of experimental phosphoproteomics data increases, more conserved phosphorylation sites may be revealed.
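The two steps described in the abstract (keeping only sites observed in all three species, then transferring them to other proteins that share the domain) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the domain identifiers, protein names, alignment columns and data layout are all hypothetical, and the check that the target residue is S, T or Y is an added assumption that the abstract does not state.

```python
from collections import defaultdict

# Hypothetical toy input: each observed phosphosite, already mapped onto a
# conserved-domain alignment column, as (species, domain_id, column).
observed = [
    ("H. sapiens", "domA", 12),
    ("M. musculus", "domA", 12),
    ("D. melanogaster", "domA", 12),
    ("H. sapiens", "domA", 40),
    ("M. musculus", "domA", 40),
    ("H. sapiens", "domB", 7),
]

REQUIRED_SPECIES = {"H. sapiens", "M. musculus", "D. melanogaster"}

# Assumption (not from the abstract): only transfer onto residues that can
# carry a phosphate group in the target sequence.
PHOSPHORYLATABLE = {"S", "T", "Y"}


def conserved_sites(observations, required_species):
    """Return (domain, column) pairs observed in every required species."""
    species_by_site = defaultdict(set)
    for species, domain, column in observations:
        species_by_site[(domain, column)].add(species)
    return {
        site
        for site, seen in species_by_site.items()
        if required_species <= seen
    }


# Hypothetical domain hits on unannotated proteins:
# protein -> [(domain_id, {alignment column: (protein position, residue)})]
domain_hits = {
    "protX": [("domA", {12: (105, "S"), 40: (133, "A")})],
    "protY": [("domB", {7: (22, "T")})],
    "protZ": [("domA", {12: (58, "G")})],
}


def transfer_annotations(hits, sites):
    """Annotate each protein position that aligns to a conserved site."""
    annotations = []
    for protein, domains in sorted(hits.items()):
        for domain, column_map in domains:
            for column, (position, residue) in sorted(column_map.items()):
                if (domain, column) in sites and residue in PHOSPHORYLATABLE:
                    annotations.append((protein, position, residue))
    return annotations


sites = conserved_sites(observed, REQUIRED_SPECIES)
annotations = transfer_annotations(domain_hits, sites)
print(sites)
print(annotations)
```

In this toy data only column 12 of the hypothetical domain domA is seen in all three species, so it is the only site eligible for transfer. Requiring agreement across species is what limits the propagation of errors from any single data set.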