Genomic exploration of the hemiascomycetous yeasts: 3. Methods and strategies used for sequence analysis and annotation.

Author information

Tekaia F, Blandin G, Malpertuy A, Llorente B, Durrens P, Toffano-Nioche C, Ozier-Kalogeropoulos O, Bon E, Gaillardin C, Aigle M, Bolotin-Fukuhara M, Casarégola S, de Montigny J, Lépingle A, Neuvéglise C, Potier S, Souciet J, Wésolowski-Louvel M, Dujon B

Affiliation

Unité de Génétique Moléculaire des Levures (URA 2171 CNRS and UFR927 Univ. P.M. Curie), Institut Pasteur, Paris, France.

Publication information

FEBS Lett. 2000 Dec 22;487(1):17-30. doi: 10.1016/s0014-5793(00)02274-2.

PMID:11152878

Abstract

The primary analysis of the sequences for our Hemiascomycete random sequence tag (RST) project combined classical methods for sequence comparison and contig assembly with purpose-written scripts and computer visualization routines. Comparisons were performed first against DNA and protein sequences from Saccharomyces cerevisiae, then against protein sequences from other completely sequenced organisms and, finally, against protein sequences from all other organisms. Blast alignments were inspected individually to help recognize genes within our random genomic sequences, even though only parts of those genes were available. For each yeast species, validated alignments were used to infer the proper genetic code, to determine codon usage preferences and to calculate the degree of sequence divergence from S. cerevisiae. The quality of each genomic library was monitored by contig analysis of the DNA sequences. Annotated sequences were submitted to the EMBL database, and the resulting general annotation tables served as the basis for our comparative description of the evolution, redundancy and function of the Hemiascomycete genomes given in other articles of this issue.
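
As an illustration of two of the quantities the abstract mentions, codon usage preferences and sequence divergence from S. cerevisiae, the following is a minimal Python sketch. It is not the authors' code: the abstract does not say how either quantity was computed, so the function names `codon_usage` and `percent_identity`, the relative-frequency definition of codon usage, and the use of percent identity over gap-free alignment columns as a divergence proxy are all assumptions made here for illustration.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Relative frequency of each codon in an in-frame coding sequence.

    Assumption: the sequence starts in frame; trailing bases that do not
    fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over alignment columns without gaps.

    Assumption: both strings are rows of the same pairwise alignment,
    with '-' marking gaps. Divergence can then be taken as 100 minus
    this value; this is a stand-in, not the measure used in the paper.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy inputs, invented for this example only.
    print(codon_usage("ATGGCTGCTTAA"))
    print(percent_identity("MKV-LA", "MRVTLA"))
```

In practice the inputs would come from the validated alignments, restricted to the aligned coding portion of each random sequence tag; the toy strings above are invented only to exercise the functions.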
