Moretti Sébastien, Reinier Frédéric, Poirot Olivier, Armougom Fabrice, Audic Stéphane, Keduas Vladimir, Notredame Cédric
Information Génomique et Structurale, CNRS UPR2589, Institute for Structural Biology and Microbiology (IBSM), Parc Scientifique de Luminy, 163 Avenue de Luminy, FR 13288, Marseille cedex 09, France.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W600-3. doi: 10.1093/nar/gkl170.
We describe Protogene, a server that can turn a protein multiple sequence alignment into the equivalent alignment of the original gene coding DNA. Protogene relies on a pipeline where every initial protein sequence is BLASTed against RefSeq or NR. The annotation associated with potential matches is used to identify the gene sequence. This gene sequence is then aligned with the query protein using Exonerate in order to extract a coding nucleotide sequence matching the original protein. Protogene can handle protein fragments and will return every CDS coding for a given protein, even if they occur in different genomes. Protogene is available from http://www.tcoffee.org/.
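The final step implied by the abstract, projecting a protein alignment onto the matching coding sequences, can be illustrated with a minimal sketch. It assumes the CDS for each protein has already been recovered (the BLAST and Exonerate steps are not shown) and that each CDS starts in frame with the first aligned residue. The function names `thread_cds` and `protein_to_codon_alignment` and the toy sequences are hypothetical and are not taken from Protogene itself.

```python
def thread_cds(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Project the gaps of an aligned protein onto its coding sequence.

    Each residue consumes one codon (3 nt) of the CDS; each protein gap
    becomes a 3-nt gap. Trailing nucleotides, such as a stop codon, are ignored.
    """
    n_residues = sum(1 for ch in aligned_protein if ch != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("CDS is too short for the protein")

    codons = []
    pos = 0  # current offset in the ungapped CDS
    for ch in aligned_protein:
        if ch == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def protein_to_codon_alignment(protein_alignment: dict, cds_by_name: dict) -> dict:
    """Apply thread_cds to every row of a protein alignment."""
    return {
        name: thread_cds(row, cds_by_name[name])
        for name, row in protein_alignment.items()
    }


# Toy input: two aligned proteins and their (assumed already retrieved) CDSs.
protein_alignment = {"seq1": "MK-L", "seq2": "MKAL"}
cds_by_name = {"seq1": "ATGAAACTG", "seq2": "ATGAAAGCTCTGTAA"}

codon_alignment = protein_to_codon_alignment(protein_alignment, cds_by_name)
for name, row in codon_alignment.items():
    print(name, row)
```

Because every protein column maps to exactly three nucleotide columns, the resulting rows all have the same length and keep the codon structure intact. The sketch does not check that each codon actually translates to the aligned residue.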