ZCURVE: a new system for recognizing protein-coding genes in bacterial and archaeal genomes.

Authors

Guo Feng-Biao, Ou Hong-Yu, Zhang Chun-Ting

Affiliation

Department of Physics, Tianjin University, Tianjin 300072, China.

Publication

Nucleic Acids Res. 2003 Mar 15;31(6):1780-9. doi: 10.1093/nar/gkg254.

Abstract

A new system, ZCURVE 1.0, for finding protein-coding genes in bacterial and archaeal genomes has been proposed. The current algorithm, which is based on the Z curve representation of the DNA sequences, lays stress on the global statistical features of protein-coding genes by taking the frequencies of bases at three codon positions into account. In ZCURVE 1.0, since only 33 parameters are used to characterize the coding sequences, it gives better consideration to both typical and atypical cases, whereas in Markov-model-based methods, e.g. Glimmer 2.02, thousands of parameters are trained, which may result in less adaptability. To compare the performance of the new system with that of Glimmer 2.02, both systems were run, respectively, for 18 genomes not annotated by the Glimmer system. Comparisons were also performed for predicting some function-known genes by both systems. Consequently, the average accuracy of both systems is well matched; however, ZCURVE 1.0 has more accurate gene start prediction, lower additional prediction rate and higher accuracy for the prediction of horizontally transferred genes. It is shown that the joint applications of both systems greatly improve gene-finding results. For a typical genome, e.g. Escherichia coli, the system ZCURVE 1.0 takes approximately 2 min on a Pentium III 866 PC without any human intervention. The system ZCURVE 1.0 is freely available at: http://tubic.tju.edu.cn/Zcurve_B/.
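To make the "frequencies of bases at three codon positions" concrete, the following is a minimal Python sketch of the standard Z curve transform applied separately to each codon position of an open reading frame. It is an illustration under stated assumptions, not the ZCURVE 1.0 implementation: the function name `codon_position_z_components` is hypothetical, and the sketch covers only the nine mononucleotide components (three codon positions times three Z curve axes). The remaining parameters among the 33 mentioned in the abstract, and the classifier that uses them, are not reconstructed here.

```python
from collections import Counter
from typing import List


def codon_position_z_components(orf: str) -> List[float]:
    """Return nine values: Z curve (x, y, z) for codon positions 1, 2 and 3.

    Hypothetical helper for illustration only; not taken from ZCURVE 1.0.
    """
    seq = orf.upper()
    components: List[float] = []
    for phase in range(3):
        # Every third base starting at this codon position.
        counts = Counter(seq[phase::3])
        total = sum(counts[b] for b in "ACGT")
        if total == 0:
            raise ValueError(
                "no A/C/G/T bases at codon position %d" % (phase + 1)
            )
        a, c, g, t = (counts[b] / total for b in "ACGT")
        components.extend([
            (a + g) - (c + t),  # x: purine versus pyrimidine
            (a + c) - (g + t),  # y: amino versus keto
            (a + t) - (g + c),  # z: weak versus strong hydrogen bonding
        ])
    return components


if __name__ == "__main__":
    # Toy ORF; a real gene finder would scan candidate ORFs from a genome.
    print(codon_position_z_components("ATGAAATAG"))
```

Each component lies in the range -1 to 1, so a candidate sequence maps to a short fixed-length vector regardless of its length, which is consistent with the abstract's emphasis on a small number of global statistical parameters.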
