Recognition of prokaryotic promoters based on a novel variable-window Z-curve method.

Affiliation

School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China.

Publication information

Nucleic Acids Res. 2012 Feb;40(3):963-71. doi: 10.1093/nar/gkr795. Epub 2011 Sep 27.

Abstract

Transcription is the first step in gene expression and the step at which most regulation of expression occurs. Although sequenced prokaryotic genomes provide a wealth of information, transcriptional regulatory networks remain poorly understood from the available genomic data, largely because accurate promoter prediction is difficult. To improve promoter recognition, a novel variable-window Z-curve method is developed to extract general features of prokaryotic promoters, and these features are then classified by the partial least squares technique. To verify its prediction performance, the method is applied to promoter fragments from two representative prokaryotic model organisms, Escherichia coli and Bacillus subtilis. Owing to the feature extraction and selection power of the method, promoter prediction accuracies improve markedly over most existing approaches. For E. coli, the accuracies are 96.05% (σ70 promoters, coding negative samples), 90.44% (σ70 promoters, non-coding negative samples), 92.13% (known sigma-factor promoters, coding negative samples) and 92.50% (known sigma-factor promoters, non-coding negative samples); for B. subtilis, they are 95.83% (known sigma-factor promoters, coding negative samples) and 99.09% (known sigma-factor promoters, non-coding negative samples). In addition, because the technique is linear, it is computationally simple and runs in a matter of minutes on ordinary personal computers or even laptops. More importantly, no parameters need to be optimized, so the method is practical for predicting promoters in other species without prior knowledge of the statistical properties of the samples.
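
The abstract does not give the feature formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard mononucleotide Z-curve transform (three components computed from base frequencies) and a purely hypothetical windowing scheme in which the sequence is cut into non-overlapping windows of several sizes; the function names and the `window_sizes` parameter are illustrative assumptions. In the workflow described above, the resulting feature vectors would then be passed to a partial least squares classifier.

```python
from typing import List, Sequence, Tuple


def z_curve_components(seq: str) -> Tuple[float, float, float]:
    """Return the standard Z-curve components (x, y, z) of a DNA fragment.

    Components are computed from the frequencies of A, C, G and T.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    a, c, g, t = (seq.count(base) / n for base in "ACGT")
    x = (a + g) - (c + t)  # purine vs. pyrimidine
    y = (a + c) - (g + t)  # amino vs. keto
    z = (a + t) - (g + c)  # weak vs. strong hydrogen bonding
    return x, y, z


def variable_window_features(
    seq: str, window_sizes: Sequence[int] = (4, 8)
) -> List[float]:
    """Concatenate Z-curve components over windows of several sizes.

    ASSUMPTION: non-overlapping windows and these window sizes are
    illustrative choices, not the scheme used in the paper.
    """
    features: List[float] = []
    for w in window_sizes:
        for start in range(0, len(seq) - w + 1, w):
            features.extend(z_curve_components(seq[start:start + w]))
    return features
```

For an 8-base fragment with window sizes 4 and 8, this yields two windows of size 4 and one of size 8, that is, a 9-dimensional feature vector.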
