Askary Amjad, Masoudi-Nejad Ali, Sharafi Roozbeh, Mizbani Amir, Parizi Sobhan Naderi, Purmasjedi Malihe
Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics and COE in Biomathematics, University of Tehran, Tehran, Iran.
Genes Genet Syst. 2009 Dec;84(6):425-30. doi: 10.1266/ggs.84.425.
Promoters, the genomic regions proximal to the transcriptional start sites (TSSs), play pivotal roles in determining the rate of transcription initiation by serving as direct docking platforms for the RNA polymerase II complex. In the post-genomic era, correct gene prediction has become one of the biggest challenges in genome annotation. Species-independent promoter prediction tools could also be useful in metagenomics, since transcription data will not be available for microorganisms that are not cultivated. Promoter prediction in prokaryotic genomes presents unique challenges owing to their organizational properties. Several methods have been developed to predict the promoter regions of prokaryotic genomes, including algorithms for recognition of sequence motifs, artificial neural networks, and algorithms based on genome structure. However, none of them satisfies the criteria of sensitivity and precision at the same time. In this work, we present a modified artificial neural network fed by nearest neighbors based on DNA duplex stability, named N4, which can predict the transcription start sites of Escherichia coli with sensitivity and precision both above 94%, better than most existing algorithms.
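The abstract does not specify how the duplex-stability input to N4 is computed, so the following is only a minimal sketch of one common way to derive nearest-neighbor stability values from a sequence. It assumes the unified nearest-neighbor free energies of SantaLucia (1998), which may differ from the parameter set the authors used; the names `NN_DELTA_G`, `stability_profile`, and `windowed_profile`, as well as the window length, are hypothetical and chosen for illustration.

```python
# Nearest-neighbor free energies (kcal/mol at 37 degrees C) for each
# dinucleotide step, read 5'->3' on one strand.
# ASSUMPTION: SantaLucia (1998) unified values; the abstract does not
# state which parameter set N4 uses.
NN_DELTA_G = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}


def stability_profile(seq):
    """Return one free-energy value per overlapping dinucleotide step.

    A sequence of length n yields n - 1 values. Characters other than
    A, C, G, T raise KeyError.
    """
    seq = seq.upper()
    return [NN_DELTA_G[seq[i:i + 2]] for i in range(len(seq) - 1)]


def windowed_profile(seq, window=15):
    """Average the step energies over a sliding window (hypothetical size).

    Returns an empty list if the sequence has fewer steps than the window.
    """
    steps = stability_profile(seq)
    return [
        sum(steps[i:i + window]) / window
        for i in range(len(steps) - window + 1)
    ]


if __name__ == "__main__":
    # AT-rich stretches are less stable (less negative) than GC-rich ones.
    print(stability_profile("TATAAT"))
    print(stability_profile("GCGCGC"))
```

A profile of this kind, taken over a fixed-length region around a candidate TSS, could serve as a numeric feature vector for a classifier; whether N4 uses raw step values, windowed averages, or another encoding is not stated in the abstract.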