Kel A, Tikunov Y, Voss N, Wingender E
Institute of Cytology and Genetics SB RAN, Lavrentyev pr., 10, 630090, Novosibirsk, Russia.
Bioinformatics. 2004 Jul 10;20(10):1512-6. doi: 10.1093/bioinformatics/bth111.
Transcription factor binding sites often differ substantially in their primary sequence and are difficult to align. A single set of sites often contains several subsets of sequences, each following a different pattern. Sensitive methods are therefore needed to reveal multiple patterns in unaligned sets of sequences.
We developed a novel method for the analysis of unaligned sets of sequences based on kernel estimation. The method reveals 'multiple local patterns', that is, a set of weight matrices. Each weight matrix characterizes a pattern that can be found in a significant subset of the sequences under analysis. The method was compared with several other pattern discovery methods: Gibbs sampling, MEME, CONSENSUS, MULTIPROFILER and PROJECTION. The kernel method showed the best performance in terms of how close the revealed weight matrices are to the original ones. We applied the kernel method to three samples of promoters (cell-cycle, T-cell and muscle-specific). We compared the revealed patterns with the TRANSFAC library of weight matrices and found strong similarity to several weight matrices for transcription factors known to be involved in the corresponding specific gene regulation.
The program is available for on-line use at: http://www.biobase.de/cgi-bin/biobase/cbs2/bin/template.cgi?template=cbscall.html
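For readers unfamiliar with the term, the sketch below illustrates what a weight matrix is and how it scores a sequence. It is not the kernel method described above, whose details are not given here; it only builds a log-odds matrix from a small set of already aligned sites. The function names, the pseudocount, the uniform background frequency and the example sites are all assumptions chosen for illustration.

```python
import math

ALPHABET = "ACGT"


def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length, aligned sites.

    Assumptions: uniform background frequency and a simple additive
    pseudocount; both are illustrative choices, not taken from the paper.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(width):
        # Start every base at the pseudocount to avoid log(0).
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2(counts[base] / total / background) for base in ALPHABET}
        )
    return matrix


def best_match(matrix, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(matrix[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical aligned sites, used only to exercise the functions.
example_sites = ["TTGACA", "TTGACA", "TTGACT", "TTGATA"]
example_matrix = build_weight_matrix(example_sites)
start, score = best_match(example_matrix, "GGGTTGACAGGG")
print(start, round(score, 2))
```

A method that reveals multiple patterns would produce several such matrices, one per subset of sequences, rather than the single matrix built in this sketch.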