A comparative genomic method for computational identification of prokaryotic translation initiation sites.

Author information

Walker Megon, Pavlovic Vladimir, Kasif Simon

Affiliation

Bioinformatics Program, Boston University, Boston, MA 02215, USA.

Publication information

Nucleic Acids Res. 2002 Jul 15;30(14):3181-91. doi: 10.1093/nar/gkf423.

Abstract

The ever-growing number of completely sequenced prokaryotic genomes facilitates cross-species comparisons by genomic annotation algorithms. This paper introduces a new probabilistic framework for comparative genomic analysis and demonstrates its utility in the context of improving the accuracy of prokaryotic gene start site detection. Our framework employs a product hidden Markov model (PROD-HMM) whose state architecture serves two purposes: it models the species-specific trinucleotide frequency patterns in sequences immediately upstream and downstream of a translation start site, and it detects the contrasting non-synonymous (amino acid changing) and synonymous (silent) substitution rates that differentiate prokaryotic coding from intergenic regions. Depending on the intricacy of the features modeled by the hidden state architecture, intergenic, regulatory, promoter and coding regions can be delimited by this method. The new system is evaluated using a preliminary set of orthologous Pyrococcus gene pairs, for which it demonstrates an improved accuracy of detection. Its robustness is confirmed by cross-validation analysis of an experimentally verified set of Escherichia coli K-12 and Salmonella typhimurium LT2 orthologs. The novel architecture has a number of attractive features that distinguish it from previous comparative models such as pair-HMMs.
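
The substitution signal mentioned in the abstract can be illustrated with a small sketch. This is not the PROD-HMM from the paper; it only counts, for two aligned in-frame sequences, how many codon differences are synonymous and how many are non-synonymous under the standard genetic code. The function name `count_substitutions` and the example sequences are hypothetical, and raw counts are used here rather than normalized rates.

```python
# Illustrative sketch only: classifies codon differences between two aligned,
# in-frame sequences. It is not the PROD-HMM described in the abstract.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def count_substitutions(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (synonymous, non_synonymous) codon-difference counts.

    Both sequences are assumed to be aligned, in frame, and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a) - len(seq_a) % 3, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: not classifiable
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Hypothetical example: GCT -> GCC keeps alanine, AAA -> AGA changes Lys to Arg.
syn, nonsyn = count_substitutions("ATGGCTAAA", "ATGGCCAGA")
print(syn, nonsyn)
```

In a coding region of two orthologs, silent changes would be expected to dominate, whereas an intergenic region shows no such codon-level bias; that contrast is the kind of evidence the abstract says the model exploits.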


Similar articles

Bacterial start site prediction.
Nucleic Acids Res. 1999 Sep 1;27(17):3577-82. doi: 10.1093/nar/27.17.3577.

References cited by this article

A Bayesian framework for combining gene predictions.
Bioinformatics. 2002 Jan;18(1):19-27. doi: 10.1093/bioinformatics/18.1.19.
