Keich U, Pevzner P A
Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA.
Bioinformatics. 2002 Oct;18(10):1374-81. doi: 10.1093/bioinformatics/18.10.1374.
Gene activity is often affected by the binding of transcription factors to short fragments of DNA sequence called motifs. Identifying subtle regulatory motifs in a DNA sequence is a difficult pattern recognition problem. In this paper we design a new motif finding algorithm that can detect very subtle motifs.
We introduce the notion of a multiprofile and use it for finding subtle motifs in DNA sequences. Multiprofiles generalize the notion of a profile and allow one to detect subtle patterns that escape detection by the standard profiles. Our MULTIPROFILER algorithm outperforms other leading motif finding algorithms in a number of synthetic models. Moreover, it can be shown that in some previously studied motif models, MULTIPROFILER is capable of pushing the performance envelope to its theoretical limits.
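For readers unfamiliar with the term, the standard profile that multiprofiles generalize can be sketched as a per-position table of nucleotide counts over aligned motif instances. The Python below is only an illustration of that baseline notion, not the MULTIPROFILER algorithm, whose construction is not described in this abstract; the function names `build_profile` and `consensus` and the example instances are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"


def build_profile(instances):
    """Return a per-position count table for equal-length, aligned DNA strings."""
    if not instances:
        raise ValueError("need at least one instance")
    length = len(instances[0])
    if any(len(s) != length for s in instances):
        raise ValueError("all instances must have the same length")
    profile = []
    # zip(*instances) walks the alignment column by column.
    for column in zip(*instances):
        counts = Counter(column)
        profile.append({base: counts.get(base, 0) for base in ALPHABET})
    return profile


def consensus(profile):
    """Pick the most frequent base in each column (ties broken by ALPHABET order)."""
    return "".join(max(ALPHABET, key=lambda b: col[b]) for col in profile)


# Hypothetical aligned instances of a length-4 motif, each with at most one mutation.
instances = ["ACGT", "ACGA", "TCGT", "ACCT"]
profile = build_profile(instances)
```

With these made-up instances, every column still has a clear majority base, so the consensus recovers the planted pattern. A subtle motif in the sense of the abstract is one where the signal is weak enough that a profile of this kind may fail to detect it.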