Schulte Ulrich, Becker Irmgard, Mewes H Werner, Mannhaupt Gertrud
Institute of Biochemistry, Heinrich-Heine-University Düsseldorf, D-40225 Düsseldorf, Germany.
J Biotechnol. 2002 Mar 14;94(1):3-13. doi: 10.1016/s0168-1656(01)00415-1.
After 50 years of analysing Neurospora crassa genes one by one, large-scale sequence analysis has tremendously increased the number of accessible genes in the last few years. As the only filamentous fungus for which a comprehensive genomic sequence database is publicly accessible, N. crassa serves as the model for this important group of microorganisms. The MIPS N. crassa database currently holds more than 16 Mb of non-redundant sequence data from chromosomes II and V, analysed by the German Neurospora Genome Project. This represents more than one-third of the genome. Open reading frames (ORFs) have been extracted from the sequence, and the deduced proteins have been annotated extensively. They are classified according to matches in sequence databases and assigned to functional categories based on their relatives. While 41% of the analysed proteins are related to known proteins, 30% are hypothetical proteins with no match to a database entry. The entire genome is expected to comprise some 13,000 protein-coding genes, more than twice as many as found in yeasts, which reflects the high potential of filamentous fungi to cope with various environmental conditions.