Hull R D, Fluder E M, Singh S B, Nachbar R B, Kearsley S K, Sheridan R P
Department of Molecular Systems, RY50S-100, Merck Research Laboratories, P.O. Box 2000, Rahway, New Jersey 07065, USA.
J Med Chem. 2001 Apr 12;44(8):1185-91. doi: 10.1021/jm000392k.
Similarity searches based on chemical descriptors have proven extremely useful in aiding large-scale drug screening. Here we present results of similarity searching using Latent Semantic Structure Indexing (LaSSI). LaSSI uses a singular value decomposition on chemical descriptors to project molecules into a k-dimensional descriptor space, where k is the number of retained singular values. The effect of the projection is that certain descriptors are emphasized over others and some descriptors may count as partially equivalent to others. We compare LaSSI searches to searches done with TOPOSIM, our standard in-house method, which uses the Dice similarity definition. Standard descriptor-based methods such as TOPOSIM count all descriptors equally and treat all descriptors as independent. For this work we use atom pairs and topological torsions as examples of chemical descriptors. Using objective criteria to determine how effective one similarity method is versus another in selecting active compounds from a large database, we find for a series of 16 drug-like probes that LaSSI is as good as or better than TOPOSIM in selecting active compounds from the MDDR database, if the user is allowed to treat k as an adjustable parameter. Typically, LaSSI selects very different sets of actives than does TOPOSIM, so it can find classes of actives that TOPOSIM would miss.
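The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a small, made-up descriptor-by-molecule count matrix, uses cosine similarity in the projected space (the measure conventionally used in latent semantic indexing, which the abstract does not specify), and uses a common count-based form of the Dice coefficient as a stand-in for TOPOSIM. All function names and data are hypothetical.

```python
import numpy as np


def dice_similarity(a, b):
    """Dice similarity for descriptor count vectors.

    Assumed form: 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b)).
    Every descriptor counts equally and independently.
    """
    total = a.sum() + b.sum()
    return 2.0 * np.minimum(a, b).sum() / total if total > 0 else 0.0


def lassi_project(X, k):
    """Truncated SVD of a descriptors-by-molecules matrix X.

    Returns U_k (descriptors x k) and the molecule coordinates
    (molecules x k) in the k-dimensional space.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k = U[:, :k]
    mol_coords = Vt[:k, :].T * s[:k]  # equal to X.T @ U_k
    return U_k, mol_coords


def cosine_scores(probe_coords, mol_coords):
    """Cosine similarity of one projected probe against all molecules."""
    norms = np.linalg.norm(mol_coords, axis=1) * np.linalg.norm(probe_coords)
    dots = mol_coords @ probe_coords
    return np.divide(dots, norms, out=np.zeros(len(mol_coords)), where=norms > 0)


# Toy data (hypothetical): rows are descriptors, columns are molecules.
X = np.array(
    [
        [2, 2, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 3, 1],
        [0, 0, 1, 2],
        [1, 0, 0, 1],
    ],
    dtype=float,
)

k = 2  # number of retained singular values, the adjustable parameter
U_k, mol_coords = lassi_project(X, k)

probe = X[:, 0]            # use molecule 0 as the probe
probe_coords = probe @ U_k  # project the probe into the same k-dim space

lassi_scores = cosine_scores(probe_coords, mol_coords)
dice_scores = np.array([dice_similarity(probe, X[:, j]) for j in range(X.shape[1])])

# Rank database molecules from most to least similar under each method.
lassi_ranking = np.argsort(-lassi_scores)
dice_ranking = np.argsort(-dice_scores)

print("LaSSI-style ranking:", lassi_ranking, lassi_scores.round(3))
print("Dice ranking:       ", dice_ranking, dice_scores.round(3))
```

Because the probe is scored only through its coordinates along the k retained singular vectors, descriptors that co-occur across the database contribute to the same axes, which is one way to read the abstract's remark that some descriptors count as partially equivalent to others. Varying `k` changes how much of that structure is kept.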