Computer Science Division, Department of Mathematics, Faculty of Science, Ain Shams University, Cairo 11566, Egypt.
Adv Exp Med Biol. 2010;680:65-73. doi: 10.1007/978-1-4419-5913-3_8.
We consider the planted (l,d)-motif search problem, which consists of finding a substring of length l that occurs in each s ( i ) in a set of input sequences {s (1),…,s ( t )} with at most d substitutions. In this paper, we study the effect of using Balla, Davila, and Rajasekaran strategy on voting algorithm practically. We call this technique, modified voting algorithm. We present an experimental study between original and modified voting algorithms on simulated data from (9,d) to (15,d). The comparison shows that the voting algorithm is faster than its modification in all instances except the instance (15,3). We also study the effect of increasing h, which is proposed by Balla, Davila, and Rajasekaran on the modified voting algorithm. From this study, we obtained the values of the number of sequences that make the running time of modified voting algorithm less than the voting algorithm and minimum. Finally, we analyze the experimental results and give some observations according to the relations: (1) l is fixed and d is variable. (2) l is variable and d is fixed. (3) l and d are variables. (4) (l,d) is challenging.
我们考虑种植的(l,d)-基序搜索问题,该问题由在一组输入序列{s (1),…,s ( t )}中找到长度为 l 的子字符串组成,每个 s ( i )中最多有 d 个替换。在本文中,我们从(9,d)到(15,d)的模拟数据上研究了在投票算法中实际使用 Balla、Davila 和 Rajasekaran 策略的效果。我们将这种技术称为修改后的投票算法。我们在原始投票算法和修改后的投票算法之间进行了实验研究。比较表明,除了实例(15,3)外,投票算法在所有实例中都比其修改版本快。我们还研究了增加 h 的效果,h 是由 Balla、Davila 和 Rajasekaran 提出的,用于修改后的投票算法。通过这项研究,我们获得了使修改后的投票算法的运行时间小于投票算法和最小值的序列数量的值。最后,我们根据关系分析实验结果并给出一些观察结果:(1)l 是固定的,d 是变量。(2)l 是变量,d 是固定的。(3)l 和 d 是变量。(4)(l,d)是具有挑战性的。