Novartis Institutes for BioMedical Research, Basel, Switzerland.
J Cheminform. 2013 Sep 24;5(1):43. doi: 10.1186/1758-2946-5-43.
Fingerprint similarity is a common method for comparing chemical structures. Similarity is an appealing approach because, with many fingerprint types, it provides intuitive results: a chemist looking at two molecules can understand why they have been determined to be similar. This transparency is partially lost with the fuzzier similarity methods that are often used for scaffold hopping and tends to vanish completely when molecular fingerprints are used as inputs to machine-learning (ML) models. Here we present similarity maps, a straightforward and general strategy to visualize the atomic contributions to the similarity between two molecules or to the predicted probability of an ML model. We show the application of similarity maps to a set of dopamine D3 receptor ligands using atom-pair and circular fingerprints as well as two popular ML methods: random forests and naïve Bayes. An open-source implementation of the method is provided.
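The idea of an atomic contribution to similarity can be illustrated with a minimal sketch. It assumes one common way to obtain such contributions: remove the fingerprint bits that a given atom helps set, recompute the similarity to the reference, and take the difference. The abstract does not spell out the procedure, so this scheme, the toy feature labels, the Dice metric, and the names `dice` and `atomic_weights` are all illustrative assumptions and not the published implementation.

```python
from typing import FrozenSet, List, Set, Tuple

# A probe feature is (bit label, atoms that set this bit). Hypothetical toy format.
Feature = Tuple[str, FrozenSet[int]]


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def atomic_weights(ref_bits: Set[str],
                   probe_features: List[Feature],
                   n_atoms: int) -> List[float]:
    """Weight of each probe atom: similarity with all bits minus
    similarity after dropping the bits that atom takes part in."""
    probe_bits = {label for label, _ in probe_features}
    base = dice(ref_bits, probe_bits)
    weights = []
    for atom in range(n_atoms):
        # Keep only bits still set by environments that exclude this atom.
        kept = {label for label, atoms in probe_features if atom not in atoms}
        weights.append(base - dice(ref_bits, kept))
    return weights


# Toy data (made up): reference bits and a 4-atom probe molecule.
ref_bits = {"A", "B", "C"}
probe_features = [
    ("A", frozenset({0, 1})),
    ("B", frozenset({1, 2})),
    ("D", frozenset({2, 3})),  # bit absent from the reference
]
weights = atomic_weights(ref_bits, probe_features, n_atoms=4)

# Positive weight: removing the atom lowers similarity (atom supports similarity).
# Negative weight: removing the atom raises similarity (atom detracts from it).
for atom, w in enumerate(weights):
    print(f"atom {atom}: {w:+.3f}")
```

In a real map these per-atom weights would be colored onto the 2D depiction of the molecule. For an ML model, the same difference could be taken on the predicted probability instead of the similarity value.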