Peña-Castillo Lourdes, Hughes Timothy R
Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada.
Genetics. 2007 May;176(1):7-14. doi: 10.1534/genetics.107.074468. Epub 2007 Apr 15.
The yeast genetics community has embraced genomic biology, and there is a general understanding that obtaining a full encyclopedia of functions of the approximately 6000 genes is a worthwhile goal. The yeast literature comprises over 40,000 research papers, and the number of yeast researchers exceeds the number of genes. There are mutated and tagged alleles for virtually every gene, and hundreds of high-throughput data sets and computational analyses have been described. Why, then, are there >1000 genes still listed as uncharacterized on the Saccharomyces Genome Database, 10 years after sequencing the genome of this powerful model organism? Examination of the currently uncharacterized gene set suggests that while some are small or newly discovered, the vast majority were evident from the initial genome sequence. Most are present in multiple genomics data sets, which may provide clues to function. In addition, roughly half contain recognizable protein domains, and many of these suggest specific metabolic activities. Notably, the uncharacterized gene set is highly enriched for genes whose only homologs are in other fungi. Achieving a full catalog of yeast gene functions may require a greater focus on the life of yeast outside the laboratory.