iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC.

Author information

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China.

The Gordon Life Science Institute, Boston, MA 02478, USA.

Publication information

Bioinformatics. 2018 Jan 1;34(1):33-40. doi: 10.1093/bioinformatics/btx579.

Abstract

MOTIVATION

A promoter is a short region of DNA responsible for initiating transcription of a particular gene in the genome. Promoters come in various types with different functions. Owing to their importance in biological processes, computational tools for timely identification of promoters and their types are highly desirable. This need has become particularly urgent given the avalanche of DNA sequences discovered in the postgenomic age. Although some prediction methods have been developed, they can only discriminate a specific type of promoter from non-promoters; none of them can identify promoter types. This is because different types of promoters may share quite similar consensus sequence patterns, while promoters of the same type may have considerably different consensus sequences.

RESULTS

To overcome this difficulty, we used the multi-window-based PseKNC (pseudo K-tuple nucleotide composition) approach to incorporate short-, middle-, and long-range sequence information, and developed a seamless two-layer predictor named 'iPromoter-2L'. The first layer identifies a query DNA sequence as a promoter or non-promoter, and the second layer predicts which of the following six types an identified promoter belongs to: σ24, σ28, σ32, σ38, σ54 and σ70.
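The two ideas in this paragraph, window-wise k-tuple features and a two-layer decision, can be illustrated with a minimal Python sketch. It is not the authors' implementation: it computes plain k-tuple nucleotide composition only and omits the pseudo (correlation) components of PseKNC, and the window sizes, the function names, and the two classifier callables `is_promoter` and `promoter_type` are all hypothetical placeholders.

```python
from itertools import product
from typing import Callable, List, Optional, Sequence

ALPHABET = "ACGT"


def kmer_composition(seq: str, k: int) -> List[float]:
    """Normalized frequency of every k-tuple in seq (vector of length 4**k)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip tuples containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def multi_window_features(seq: str, k: int, window_sizes: Sequence[int]) -> List[float]:
    """Concatenate k-tuple compositions over non-overlapping windows of several sizes.

    Small windows capture short-range patterns, large windows long-range ones.
    The window sizes are an assumption of this sketch, not values from the paper.
    """
    seq = seq.upper()
    features: List[float] = []
    for w in window_sizes:
        for start in range(0, len(seq) - w + 1, w):
            features.extend(kmer_composition(seq[start:start + w], k))
    return features


def two_layer_predict(
    seq: str,
    is_promoter: Callable[[str], bool],    # hypothetical layer-1 classifier
    promoter_type: Callable[[str], str],   # hypothetical layer-2 classifier
) -> Optional[str]:
    """Layer 1 decides promoter vs non-promoter; layer 2 runs only on promoters."""
    if not is_promoter(seq):
        return None
    return promoter_type(seq)
```

In practice each callable would wrap a trained model that takes the output of `multi_window_features` as input. The cascade means the type classifier only ever sees sequences already accepted as promoters, so it does not need a "non-promoter" class of its own.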

AVAILABILITY AND IMPLEMENTATION

For the convenience of most experimental scientists, a user-friendly and publicly accessible web server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/iPromoter-2L/. It is anticipated that iPromoter-2L will become a very useful high-throughput tool for genome analysis.

CONTACT

bliu@hit.edu.cn or dshuang@tongji.edu.cn or kcchou@gordonlifescience.org.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
