Large scale bacterial gene discovery by similarity search.

DNA sequencing efforts frequently uncover genes other than the targeted ones. We have used rapid database scanning methods to search for undescribed eubacterial and archaeal protein-coding frames in regions flanking known genes. By searching all prokaryotic DNA sequences not marked as coding for proteins or stable RNAs against the protein databases, we have identified more than 450 new examples of bacterial proteins, as well as a smaller number of possible revisions to known proteins, at a surprisingly high rate of one new protein or revision for every 24 initial DNA sequences or 8,300 nucleotides examined. Seven proteins are members of families that have not been described in prokaryotic sequences. We also describe 49 re-interpretations of existing sequence data of particular biological significance.
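
The abstract does not name the scanning programs or parameters used, so the following is only a minimal Python sketch of the general idea behind comparing unannotated DNA with a protein database: translate the region in all six reading frames and keep the stop-free stretches as candidate queries. The function names, the `min_len` cutoff, and the use of the standard genetic code are assumptions made for illustration, not details taken from the paper.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate codon by codon; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_segments(dna, min_len=30):
    """Yield (strand, frame, peptide) for each stop-free stretch of at least
    min_len residues. min_len is a hypothetical cutoff, not a value from the paper."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for peptide in translate(seq[frame:]).split("*"):
                if len(peptide) >= min_len:
                    yield strand, frame, peptide


if __name__ == "__main__":
    # Toy flanking region; real input would be DNA not annotated as coding.
    for strand, frame, peptide in six_frame_segments("ATGGCCAAATAA", min_len=3):
        print(strand, frame, peptide)
```

Each peptide produced this way would then be compared with a protein database by a separate similarity-search program; that step is outside the scope of this sketch.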