Proteogenomics: needs and roles to be filled by proteomics in genome annotation.

Author information

Ansong Charles, Purvine Samuel O, Adkins Joshua N, Lipton Mary S, Smith Richard D

Affiliation

Biological Sciences Division, Pacific Northwest National Laboratory, P.O. Box 999/K8-98, Richland, WA 99352, USA.

Publication information

Brief Funct Genomic Proteomic. 2008 Jan;7(1):50-62. doi: 10.1093/bfgp/eln010. Epub 2008 Mar 10.

Abstract

While genome sequencing efforts reveal the basic building blocks of life, a genome sequence alone is insufficient for elucidating biological function. Genome annotation--the process of identifying genes and assigning function to each gene in a genome sequence--provides the means to elucidate biological function from sequence. Current state-of-the-art high-throughput genome annotation uses a combination of comparative (sequence similarity data) and non-comparative (ab initio gene prediction algorithms) methods to identify protein-coding genes in genome sequences. Because approaches used to validate the presence of predicted protein-coding genes are typically based on expressed RNA sequences, they cannot independently and unequivocally determine whether a predicted protein-coding gene is translated into a protein. With the ability to directly measure peptides arising from expressed proteins, high-throughput liquid chromatography-tandem mass spectrometry-based proteomics approaches can be used to verify coding regions of a genomic sequence. Here, we highlight several ways in which high-throughput tandem mass spectrometry-based proteomics can improve the quality of genome annotations and suggest that it could be efficiently applied during the gene calling process so that the improvements are propagated through the subsequent functional annotation process.
