PromoterExplorer: an effective promoter identification method based on the AdaBoost algorithm.

Authors

Xie Xudong, Wu Shuanhu, Lam Kin-Man, Yan Hong

Affiliation

Department of Electronic Engineering, City University of Hong Kong, Hong Kong.

Publication

Bioinformatics. 2006 Nov 15;22(22):2722-8. doi: 10.1093/bioinformatics/btl482. Epub 2006 Sep 25.

Abstract

MOTIVATION

Promoter prediction is important for the analysis of gene regulation. Although a number of promoter prediction algorithms have been reported in the literature, significantly improving prediction accuracy remains a challenge. In this paper, we propose an effective promoter identification algorithm called PromoterExplorer. Our approach analyzes the different roles of three kinds of features (the local distribution of pentamers, positional CpG island features and the digitized DNA sequence) and then combines them into a high-dimensional input vector. A cascade AdaBoost-based learning procedure selects the most 'informative' or 'discriminating' features to build a sequence of weak classifiers, which are combined into a strong classifier to achieve better performance. The cascade structure used for identification can also reduce the number of false positives.
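
As a rough illustration of the pipeline described above, the following Python sketch builds a combined feature vector and boosts decision stumps over it. It is not the authors' implementation. The abstract does not give the encoding or the weak learner, so everything named here is an assumption: the numeric base code in `BASE_CODE`, plain pentamer counts standing in for the "local distribution", GC content plus the observed/expected CpG ratio standing in for the positional CpG island features, single-feature threshold stumps as weak classifiers, and the four toy sequences. The cascade stage is omitted.

```python
import math
from itertools import product

# All 1024 pentamers in a fixed order, so every sequence uses the same layout.
PENTAMERS = ["".join(p) for p in product("ACGT", repeat=5)]
PENTAMER_INDEX = {p: i for i, p in enumerate(PENTAMERS)}
# Hypothetical digitization; the abstract does not say how bases are encoded.
BASE_CODE = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}

def pentamer_counts(seq):
    """Count each pentamer over all overlapping windows of the sequence."""
    counts = [0.0] * len(PENTAMERS)
    for i in range(len(seq) - 4):
        idx = PENTAMER_INDEX.get(seq[i:i + 5])
        if idx is not None:  # skip windows containing N or other symbols
            counts[idx] += 1.0
    return counts

def cpg_features(seq):
    """Return [GC content, observed/expected CpG ratio] for the sequence."""
    n = len(seq)
    c, g, cg = seq.count("C"), seq.count("G"), seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    obs_exp = cg * n / (c * g) if c and g else 0.0
    return [gc_content, obs_exp]

def feature_vector(seq):
    """Concatenate pentamer counts, CpG features and the digitized sequence."""
    seq = seq.upper()
    digitized = [BASE_CODE.get(b, 0.0) for b in seq]
    return pentamer_counts(seq) + cpg_features(seq) + digitized

def stump_predict(x, j, thr, pol):
    """Weak classifier: threshold a single feature, output +1 or -1."""
    return pol if x[j] > thr else -pol

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, j, thr, pol) != yi)
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost; each round effectively selects one feature."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, thr, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, j, thr, pol) for a, j, thr, pol in model)
    return 1 if score > 0 else -1

# Toy data (made up): two CpG-rich "promoter-like" and two AT-rich sequences,
# all the same length so the digitized parts of the vectors line up.
sequences = ["CGCGGCGCCGCGGGCGCGCC", "GCGCGCCGCGGCGCGCGGCG",
             "ATTATAAATTTATATTAAAT", "TTATATTAAATATTTAATAT"]
labels = [1, 1, -1, -1]
X = [feature_vector(s) for s in sequences]
model = adaboost(X, labels, rounds=3)
predictions = [predict(model, x) for x in X]
```

Because each stump looks at exactly one component of the vector, boosting doubles as feature selection, which matches the abstract's description of picking the most discriminating features. A cascade would chain several such strong classifiers and reject a window as soon as any stage votes negative.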

RESULTS

PromoterExplorer is tested on large-scale DNA sequences from different databases, including the EPD, DBTSS, GenBank and human chromosome 22. Experimental results show that consistent and promising performance can be achieved.
