ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany.
J Chem Inf Model. 2019 Jun 24;59(6):2560-2571. doi: 10.1021/acs.jcim.9b00250. Epub 2019 May 23.
Molecular patterns are widely used for compound filtering in molecular design endeavors. They describe structural properties that are connected with unwanted physical or chemical properties like reactivity or toxicity. With filter sets comprising hundreds of structural filters, an analytic approach to compare those patterns is needed. Here we present a novel approach to solve the generic pattern comparison problem. We introduce chemically inspired fingerprints for pattern nodes and edges to derive an easy-to-compare pattern representation. On two annotated pattern graphs we apply a maximum common subgraph algorithm enabling the calculation of pattern inclusion and similarity. The resulting algorithm can be used in many different ways. We can automatically derive pattern hierarchies or search in large pattern collections for more general or more specific patterns. To the best of our knowledge, the presented algorithm is the first of its kind enabling these types of chemical pattern analytics. Our new tool named SMARTScompare is an implementation of the approach for the SMARTS language, which is the quasi-standard for structural filters. We demonstrate the capabilities of SMARTScompare on a large collection of SMARTS patterns from real applications.
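The inclusion idea summarized in the abstract can be illustrated with a minimal sketch. This is not the SMARTScompare implementation: the set-based "fingerprints", the dictionary layout, and the helper names `make_pattern` and `includes` are all hypothetical, and a brute-force mapping search stands in for the maximum common subgraph algorithm. Under these assumptions, a pattern counts as more general than another if its graph can be embedded in the other one such that each of its node and edge fingerprints is a superset of the fingerprint it is mapped to.

```python
from itertools import permutations


def make_pattern(nodes, edges):
    """Build a toy pattern (hypothetical layout, not the paper's data model).

    nodes: list of sets of element symbols a node may match (node "fingerprint").
    edges: dict mapping (i, j) node index pairs to sets of allowed bond types.
    """
    return {
        "nodes": [frozenset(n) for n in nodes],
        "edges": {tuple(sorted(k)): frozenset(v) for k, v in edges.items()},
    }


def includes(general, specific):
    """Return True if `general` embeds into `specific` with superset fingerprints.

    In this toy model that is a sufficient condition for `general` to match
    everything `specific` matches. Brute force over injective node mappings,
    so only suitable for very small patterns.
    """
    g_nodes, s_nodes = general["nodes"], specific["nodes"]
    if len(g_nodes) > len(s_nodes):
        return False
    for image in permutations(range(len(s_nodes)), len(g_nodes)):
        # Each node of the general pattern must allow at least what its image allows.
        if not all(s_nodes[image[i]] <= g_nodes[i] for i in range(len(g_nodes))):
            continue
        # Each edge of the general pattern must map to an edge with a subset fingerprint.
        edges_ok = True
        for (a, b), g_bonds in general["edges"].items():
            key = tuple(sorted((image[a], image[b])))
            s_bonds = specific["edges"].get(key)
            if s_bonds is None or not s_bonds <= g_bonds:
                edges_ok = False
                break
        if edges_ok:
            return True
    return False


# Toy patterns, loosely in the spirit of C=O, C=[O,S], and C(=O)O.
carbonyl = make_pattern([{"C"}, {"O"}], {(0, 1): {"double"}})
chalcogen_carbonyl = make_pattern([{"C"}, {"O", "S"}], {(0, 1): {"double"}})
acid_like = make_pattern(
    [{"C"}, {"O"}, {"O"}], {(0, 1): {"double"}, (0, 2): {"single"}}
)

print(includes(chalcogen_carbonyl, carbonyl))
print(includes(carbonyl, chalcogen_carbonyl))
print(includes(carbonyl, acid_like))
```

Applying such a pairwise test to every pair in a filter collection would yield the kind of "more general than" relation from which a pattern hierarchy could be derived. Real SMARTS primitives (recursion, logical operators, ring and charge constraints) need a far richer fingerprint than the element sets assumed here.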