Kowalczuk M, Mackiewicz P, Gierlik A, Dudek M R, Cebrat S
Institute of Microbiology, Wroclaw University, ul. Przybyszewskiego 63/77, 54-148 Wroclaw, Poland.
Yeast. 1999 Aug;15(11):1031-4. doi: 10.1002/(SICI)1097-0061(199908)15:11<1031::AID-YEA431>3.0.CO;2-G.
At the end of 1996 we estimated the total number of protein-coding ORFs in the Saccharomyces cerevisiae genome, based on their properties, as 4700-4800. This number is much smaller than the widely accepted figure of 5800. According to our calculations, about 200-300 orphans remain, that is, ORFs without known function or homology to already discovered genes, which is only about 5% of the total number of genes. Our results would be questionable if the analysed set of known genes were not a statistically representative sample of the whole set of protein-coding genes in the S. cerevisiae genome. Therefore, we repeated our estimation using recently updated databases. In the course of the last 18 months, previously unknown functions of about 500 genes have been found. We have used these genes to check our method, former results and conclusions. Our previous estimation of the total number of coding ORFs was confirmed.
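The proportions quoted above can be checked with a few lines of arithmetic. The following is only a sketch using the figures given in the abstract; the helper name `fraction_range` is hypothetical, and treating the quoted ranges as independent bounds is an assumption made here for illustration, not part of the original method.

```python
# Figures quoted in the abstract (ranges given as (min, max)).
ORPHANS = (200, 300)        # ORFs without known function or homology
TOTAL_ORFS = (4700, 4800)   # estimated total of protein-coding ORFs
ACCEPTED_TOTAL = 5800       # widely accepted count cited in the abstract


def fraction_range(part, whole):
    """Return the (lowest, highest) share of `part` in `whole`.

    Hypothetical helper: assumes both ranges vary independently, so the
    lowest share pairs min(part) with max(whole) and vice versa.
    """
    low = part[0] / whole[1]
    high = part[1] / whole[0]
    return low, high


low, high = fraction_range(ORPHANS, TOTAL_ORFS)
print(f"orphan share of all coding ORFs: {low:.1%} to {high:.1%}")

# How many fewer ORFs the estimate implies compared with the accepted count.
gap = (ACCEPTED_TOTAL - TOTAL_ORFS[1], ACCEPTED_TOTAL - TOTAL_ORFS[0])
print(f"ORFs fewer than the accepted count: {gap[0]} to {gap[1]}")
```

Under these assumptions the orphan share falls between roughly 4% and 6.5%, consistent with the "about 5%" stated in the abstract, and the estimate is 1000 to 1100 ORFs below the accepted count.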