Quantifying similarity between motifs.

Gupta Shobhit, Stamatoyannopoulos John A, Bailey Timothy L, Noble William Stafford

Department of Genome Sciences, University of Washington, 1705 NE Pacific Street, Box 355065, Seattle, WA 98195, USA.

Genome Biol. 2007;8(2):R24. doi: 10.1186/gb-2007-8-2-r24.

A common question within the context of de novo motif discovery is whether a newly discovered, putative motif resembles any previously discovered motif in an existing database. To answer this question, we define a statistical measure of motif-motif similarity, and we describe an algorithm, called Tomtom, for searching a database of motifs with a given query motif. Experimental simulations demonstrate the accuracy of Tomtom's E values and its effectiveness in finding similar motifs.
