Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, 53115, Bonn, Germany.
J Comput Aided Mol Des. 2021 Dec;35(12):1157-1164. doi: 10.1007/s10822-021-00380-y. Epub 2021 Mar 19.
An activity cliff (AC) is formed by a pair of structurally similar compounds with a large difference in potency. Accordingly, ACs reveal structure-activity relationship (SAR) discontinuity and provide SAR information for compound optimization. Herein, we investigated whether ACs can be predicted from image data. To this end, pairs of structural analogs that formed or did not form ACs were extracted from different compound activity classes, and consistently formatted images were generated from these pairs. The image sets were used to train and test convolutional neural network (CNN) models to systematically distinguish between ACs and non-ACs. The CNN models predicted ACs with overall high accuracy, as assessed using alternative performance measures, thereby establishing proof-of-principle. Moreover, gradient weights from convolutional layers were mapped to test compounds and identified characteristic structural features that contributed to successful predictions. Weight-based feature visualization revealed the ability of CNN models to learn chemistry from images at a high level of resolution and aided in the interpretation of decisions made by models with intrinsic black-box character.
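The AC definition given in the abstract can be illustrated with a minimal labeling sketch. It assumes that each pair has already been identified as a pair of structural analogs, and that a "large" potency difference means at least 100-fold (2 logarithmic units); the abstract does not state the threshold, so this value, the `AnalogPair` record, and the example potencies are all hypothetical and serve only to show how AC and non-AC labels for model training might be derived.

```python
from dataclasses import dataclass
from typing import List

# Assumption: a "large" potency difference is at least 100-fold, i.e. 2 units
# on a negative log10 scale (pKi or pIC50). The abstract gives no threshold.
AC_MIN_LOG_DIFF = 2.0


@dataclass(frozen=True)
class AnalogPair:
    """Hypothetical record for a pair of structural analogs."""
    smiles_a: str
    smiles_b: str
    potency_a: float  # negative log10 of the molar potency value
    potency_b: float


def is_activity_cliff(pair: AnalogPair, min_log_diff: float = AC_MIN_LOG_DIFF) -> bool:
    """Return True if the potency difference of the pair reaches the threshold."""
    return abs(pair.potency_a - pair.potency_b) >= min_log_diff


def label_pairs(pairs: List[AnalogPair]) -> List[int]:
    """Assign class labels for training: 1 for AC, 0 for non-AC."""
    return [1 if is_activity_cliff(p) else 0 for p in pairs]


# Made-up example values, for illustration only (not data from the study).
example_pairs = [
    AnalogPair("c1ccccc1O", "c1ccccc1N", 8.5, 6.0),  # 2.5 log units apart
    AnalogPair("c1ccccc1O", "c1ccccc1F", 7.1, 6.8),  # 0.3 log units apart
]
labels = label_pairs(example_pairs)
```

In a workflow like the one described, such labels would be paired with the images generated for each compound pair before CNN training; image generation and the CNN itself are outside the scope of this sketch.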