Raphael Benjamin, Liu Lung-Tien, Varghese George
Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093-0114, USA.
IEEE/ACM Trans Comput Biol Bioinform. 2004 Apr-Jun;1(2):91-4. doi: 10.1109/TCBB.2004.14.
Buhler and Tompa introduced the random projection algorithm for the motif discovery problem and demonstrated that this algorithm performs well on both simulated and biological samples. We describe a modification of the random projection algorithm, called the uniform projection algorithm, which utilizes a different choice of projections. We replace the random selection of projections by a greedy heuristic that approximately equalizes the coverage of the projections. We show that this change in selection of projections leads to improved performance on motif discovery problems. Furthermore, the uniform projection algorithm is directly applicable to other problems where the random projection algorithm has been used, including comparison of protein sequence databases.
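The contrast between random and coverage-equalizing selection of projections can be illustrated with a minimal sketch. The abstract does not specify the heuristic, so everything below is an assumption made for illustration and is not the authors' published procedure: a projection is taken to be a set of k positions out of an l-mer, "coverage" is taken to mean how many of the chosen projections use each position, and the function names (`random_projections`, `greedy_uniform_projections`, `coverage_spread`, `project`) are hypothetical.

```python
import random


def random_projections(l, k, m, seed=0):
    """Baseline: choose m projections, each k positions drawn uniformly at random."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(l), k)) for _ in range(m)]


def greedy_uniform_projections(l, k, m, seed=0):
    """Assumed greedy rule: each new projection takes the k least-used positions.

    Ties are broken at random so that different seeds give different projections.
    """
    rng = random.Random(seed)
    counts = [0] * l  # how many projections already use each position
    projections = []
    for _ in range(m):
        order = sorted(range(l), key=lambda p: (counts[p], rng.random()))
        chosen = sorted(order[:k])
        for p in chosen:
            counts[p] += 1
        projections.append(chosen)
    return projections


def coverage_spread(projections, l):
    """Difference between the most-used and least-used position counts."""
    counts = [0] * l
    for proj in projections:
        for p in proj:
            counts[p] += 1
    return max(counts) - min(counts)


def project(lmer, positions):
    """Hash key for an l-mer: the characters at the selected positions."""
    return "".join(lmer[p] for p in positions)


if __name__ == "__main__":
    l, k, m = 15, 7, 20
    print("random spread:", coverage_spread(random_projections(l, k, m), l))
    print("greedy spread:", coverage_spread(greedy_uniform_projections(l, k, m), l))
```

Under this assumed rule the per-position usage counts never differ by more than one, whereas purely random selection can leave some positions over-sampled and others rarely used. The actual algorithm may equalize a different notion of coverage, so this should be read only as a toy illustration of the idea.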