Mossé M O, Linder P, Lazowska J, Slonimski P P
Centre de Génétique Moléculaire, Université Pierre et Marie Curie, Gif-sur-Yvette, France.
Curr Genet. 1993 Jan;23(1):66-91. doi: 10.1007/BF00336752.
The amount of nucleotide sequence data is increasing exponentially. We have therefore continued our effort to build a comprehensive database for the yeast Saccharomyces cerevisiae. In this database (ListA2) we have compiled 1001 protein-coding sequences from this organism. Each sequence has been assigned a single genetic name; in the case of allelic duplicated sequences, synonyms are given where necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have resolved much of the confusion in the literature and databanks. Along with the genetic name, each entry includes the mnemonic from the EMBL databank, the codon bias, the reference for the publication of the sequence, and the EMBL accession numbers. The database is available on request.