Voting algorithms for the motif finding problem.

Authors

Liu Xiaowen, Ma Bin, Wang Lusheng

Affiliation

Department of Computer Science, University of Western Ontario, London, ON, Canada.

Publication

Comput Syst Bioinformatics Conf. 2008;7:37-47.

Abstract

Finding motifs in many sequences is an important problem in computational biology, especially in the identification of regulatory motifs in DNA sequences. Let c be a motif sequence. Given a set of sequences, each planted with a mutated version of c at an unknown position, the motif finding problem is to find these planted motifs and the original c. In this paper, we study the VM model of the planted motif problem, proposed by Pevzner and Sze. We give a simple Selecting One Voting algorithm and a more powerful Selecting k Voting algorithm. When the motif length and the number of input sequences are large enough, we prove that both algorithms find the unknown motif consensus with high probability. The proof also shows why a large number of input sequences is so important for finding motifs, as most researchers believe. Experimental results on simulated data support this claim. The Selecting k Voting algorithm is powerful but computationally intensive. To speed it up, we propose a progressive filtering algorithm, which reduces the running time significantly while retaining good accuracy in finding motifs. Our experimental results show that the Selecting k Voting algorithm with progressive filtering performs very well in practice and outperforms some of the best-known algorithms.
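To make the problem statement concrete, the following is a minimal Python sketch of a planted motif instance. It is not the Selecting One Voting or Selecting k Voting algorithm, whose details the abstract does not give. The function names, the parameter `p`, and the choice to mutate each motif position independently with probability `p` are assumptions made for illustration. The `column_vote` helper takes a per-column majority over the planted copies at known positions, which only hints at why more input sequences make the consensus easier to recover; in the actual problem the positions are unknown.

```python
import random
from collections import Counter

ALPHABET = "ACGT"


def mutate(motif, p, rng):
    """Return a copy of motif in which each position is replaced,
    independently with probability p, by a different base (assumed model)."""
    out = []
    for ch in motif:
        if rng.random() < p:
            out.append(rng.choice([b for b in ALPHABET if b != ch]))
        else:
            out.append(ch)
    return "".join(out)


def plant_instance(motif, n_seqs, seq_len, p, rng):
    """Build n_seqs random background sequences of length seq_len and
    overwrite one window in each with a mutated copy of motif.
    Returns (sequences, positions); positions are hidden from a real solver."""
    seqs, positions = [], []
    for _ in range(n_seqs):
        background = [rng.choice(ALPHABET) for _ in range(seq_len)]
        pos = rng.randrange(seq_len - len(motif) + 1)
        background[pos:pos + len(motif)] = mutate(motif, p, rng)
        seqs.append("".join(background))
        positions.append(pos)
    return seqs, positions


def column_vote(occurrences):
    """Per-column majority letter over equal-length strings."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*occurrences))


if __name__ == "__main__":
    rng = random.Random(0)
    c = "ACGTACGTACGTACG"
    seqs, positions = plant_instance(c, n_seqs=20, seq_len=600, p=0.2, rng=rng)
    planted = [s[i:i + len(c)] for s, i in zip(seqs, positions)]
    print("true motif:", c)
    print("voted     :", column_vote(planted))
```

Running the script with different values of `n_seqs` gives a rough feel for how the number of sequences affects the voted consensus under these assumptions.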

AVAILABILITY

The software is available upon request.
