On the total number of genes and their length distribution in complete microbial genomes.

Authors

Skovgaard M, Jensen L J, Brunak S, Ussery D, Krogh A

Affiliation

Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark.

Publication

Trends Genet. 2001 Aug;17(8):425-8. doi: 10.1016/s0168-9525(01)02372-1.

Abstract

In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.
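The length-distribution comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on two assumptions that the abstract does not state: that every annotated gene at or above some length cutoff is real, and that the fraction of real genes matching a known protein does not depend on length. The function name `estimate_true_gene_count`, the `long_cutoff` value of 200, and the toy lengths are all hypothetical and chosen only for illustration.

```python
def estimate_true_gene_count(annotated_lengths, matched_lengths, long_cutoff=200):
    """Estimate the number of real protein-coding genes in one genome.

    annotated_lengths: lengths of all annotated genes.
    matched_lengths:   lengths of the annotated genes that match a known protein.
    long_cutoff:       hypothetical length at or above which annotated genes
                       are assumed to be real (an assumption of this sketch).
    """
    long_annotated = sum(1 for n in annotated_lengths if n >= long_cutoff)
    long_matched = sum(1 for n in matched_lengths if n >= long_cutoff)
    if long_matched == 0:
        raise ValueError("no matched genes at or above the cutoff")
    # Among long genes, annotated/matched gives the factor that converts
    # "genes with a match" into "real genes"; apply it to all matched genes.
    scale = long_annotated / long_matched
    return scale * len(matched_lengths)


# Made-up toy lengths, not data from the paper.
annotated = [60] * 10 + [400] * 20   # 10 short and 20 long annotated genes
matched = [60] * 4 + [400] * 16      # 4 short and 16 long genes with a match
estimate = estimate_true_gene_count(annotated, matched)
print(f"annotated: {len(annotated)}, estimated real: {estimate:.0f}")
```

In this toy case 30 genes are annotated but only about 25 are estimated to be real, so the excess sits entirely among the short genes, which mirrors the kind of discrepancy the abstract reports.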

