Gupta Nitin, Benhamida Jamal, Bhargava Vipul, Goodman Daniel, Kain Elisabeth, Kerman Ian, Nguyen Ngan, Ollikainen Noah, Rodriguez Jesse, Wang Jian, Lipton Mary S, Romine Margaret, Bafna Vineet, Smith Richard D, Pevzner Pavel A
Bioinformatics Program, University of California San Diego, La Jolla, California 92093, USA.
Genome Res. 2008 Jul;18(7):1133-42. doi: 10.1101/gr.074344.107. Epub 2008 Apr 21.
The recent proliferation of low-cost DNA sequencing techniques will soon lead to explosive growth in the number of sequenced genomes and will turn manual annotation into a luxury. Mass spectrometry has recently emerged as a valuable technique for proteogenomic annotation that improves on the state of the art in predicting genes and other features. However, previous proteogenomic approaches were limited to a single genome and did not take advantage of analyzing mass spectrometry data from multiple genomes at once. We show that such a comparative proteogenomics approach (like comparative genomics) allows one to address problems that remained beyond the reach of the traditional "single proteome" approach in mass spectrometry. In particular, we show how comparative proteogenomics addresses the notoriously difficult problem of "one-hit-wonders" in proteomics, improves on existing gene prediction tools in genomics, and allows identification of rare post-translational modifications. We therefore argue that complementing DNA sequencing projects with comparative proteogenomics projects can be a viable approach to improving both genomic and proteomic annotations.