Bacterial start site prediction.

Author information

Hannenhalli S S, Hayes W S, Hatzigeorgiou A G, Fickett J W

Affiliations

Bioinformatics, SmithKline Beecham Pharmaceuticals, 709 Swedeland Road, PO Box 1539, King of Prussia, PA 19406, USA.

Publication information

Nucleic Acids Res. 1999 Sep 1;27(17):3577-82. doi: 10.1093/nar/27.17.3577.

Abstract

With the growing number of completely sequenced bacterial genes, accurate gene prediction in bacterial genomes remains an important problem. Although the existing tools predict genes in bacterial genomes with high overall accuracy, their ability to pinpoint the translation start site remains unsatisfactory. In this paper, we present a novel approach to bacterial start site prediction that takes into account multiple features of a potential start site, viz., ribosome binding site (RBS) binding energy, distance of the RBS from the start codon, distance from the beginning of the maximal ORF to the start codon, the start codon itself and the coding/non-coding potential around the start site. Mixed integer programming was used to optimize the discriminatory system. The accuracy of this approach is up to 90%, compared to 70% using the most common tools in fully automated mode (that is, without expert human post-processing of results). The approach is evaluated using Bacillus subtilis, Escherichia coli and Pyrococcus furiosus. These three genomes cover a broad spectrum of bacterial genomes, since B. subtilis is a Gram-positive bacterium, E. coli is a Gram-negative bacterium and P. furiosus is an archaebacterium. A significant problem is generating a set of 'true' start sites for algorithm training, in the absence of experimental work. We found that sequence conservation between P. furiosus and the related Pyrococcus horikoshii clearly delimited the gene start in many cases, providing a sufficient training set.
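
The idea of combining several start-site features into one discriminating score can be illustrated with a minimal Python sketch. It is not the authors' implementation. All names and constants are hypothetical: the weights are hand-picked rather than optimized by mixed integer programming, a simple count of matches to the Shine-Dalgarno consensus AGGAGG stands in for RBS binding energy, the preferred RBS spacing of 7 nt is an assumed value, and the coding/non-coding potential feature is omitted.

```python
# Hypothetical start-codon preference scores (not taken from the paper).
START_CODONS = {"ATG": 1.0, "GTG": 0.5, "TTG": 0.3}
SD_MOTIF = "AGGAGG"   # Shine-Dalgarno consensus; a crude proxy for binding energy
PREFERRED_SPACING = 7  # assumed RBS-to-start spacing in nucleotides


def rbs_feature(seq, start, window=20):
    """Return (matches, spacing) for the best SD-like site upstream of start."""
    best = (0, None)
    lo = max(0, start - window)
    for i in range(lo, start - len(SD_MOTIF) + 1):
        site = seq[i:i + len(SD_MOTIF)]
        matches = sum(a == b for a, b in zip(site, SD_MOTIF))
        spacing = start - (i + len(SD_MOTIF))
        if matches > best[0]:
            best = (matches, spacing)
    return best


def candidate_starts(seq, orf_begin, stop_pos):
    """In-frame start codons between the ORF beginning and its stop codon."""
    return [i for i in range(orf_begin, stop_pos - 2, 3)
            if seq[i:i + 3] in START_CODONS]


def score(seq, start, orf_begin, weights=(1.0, 0.5, 0.2, 0.01)):
    """Weighted sum of features; the weights are illustrative only."""
    w_codon, w_rbs, w_spacing, w_dist = weights
    matches, spacing = rbs_feature(seq, start)
    # With no upstream site at all, apply a fixed (arbitrary) penalty.
    spacing_penalty = 10 if spacing is None else abs(spacing - PREFERRED_SPACING)
    return (w_codon * START_CODONS[seq[start:start + 3]]
            + w_rbs * matches
            - w_spacing * spacing_penalty
            - w_dist * (start - orf_begin))


def predict_start(seq, orf_begin, stop_pos):
    """Pick the highest-scoring candidate start, or None if there is none."""
    cands = candidate_starts(seq, orf_begin, stop_pos)
    if not cands:
        return None
    return max(cands, key=lambda s: score(seq, s, orf_begin))


# Toy ORF: a TTG at position 0 and an ATG at position 21 preceded by AGGAGG.
example = "TTGAAACC" + "AGGAGG" + "AACAGCT" + "ATG" + "GCTGCAAAA" + "TAA"
print(predict_start(example, 0, 33))
```

In this toy sequence the downstream ATG is preferred over the TTG at the beginning of the maximal ORF, because it has a well-spaced upstream site. A real system would learn the weights from a training set of trusted start sites, as the abstract describes.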

Similar articles

1
Bacterial start site prediction.
Nucleic Acids Res. 1999 Sep 1;27(17):3577-82. doi: 10.1093/nar/27.17.3577.
3
Operon prediction in Pyrococcus furiosus.
Nucleic Acids Res. 2007;35(1):11-20. doi: 10.1093/nar/gkl974. Epub 2006 Dec 5.

Cited by

1
Multidrug Efflux Systems in Helicobacter cinaedi.
Antibiotics (Basel). 2012 Nov 21;1(1):29-43. doi: 10.3390/antibiotics1010029.
