Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan.
Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan.
J Chem Inf Model. 2019 Mar 25;59(3):993-1004. doi: 10.1021/acs.jcim.8b00661. Epub 2018 Dec 12.
Activity landscapes (ALs) integrate structural and potency data of active compounds and provide graphical access to structure-activity relationships (SARs) contained in compound data sets. Three-dimensional (3D) ALs can be conceptualized as a two-dimensional (2D) projection of chemical space with an interpolated activity surface added as a third dimension. Such 3D ALs are particularly intuitive for SAR visualization. In this work, 3D ALs were generated on the basis of different projection methods and fingerprint descriptors, and their topologies were compared. Moreover, going beyond qualitative analysis, the use of 3D ALs for semiquantitative and quantitative potency predictions was investigated. NeuroScale, a neural network variant of multidimensional scaling, combined with Gaussian process regression (GPR) was identified as a preferred approach for generating 3D ALs that accounted for training compounds and their SAR characteristics with high accuracy. On the other hand, GPR-induced overfitting generally limited the accuracy of potency value predictions regardless of the projection method applied. However, 3D ALs enabled reliable mapping of test compounds with varying potency levels to corresponding AL regions. The most accurate mapping was achieved with NeuroScale models. Taken together, the results of our analysis indicate the high potential of 3D ALs for graphical SAR exploration and the identification of potent test compounds.
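The construction described in the abstract (a 2D projection of fingerprint space with an interpolated potency surface as the third dimension) can be illustrated with a minimal sketch. This is not the authors' implementation: classical (Torgerson) multidimensional scaling is swapped in as a simple linear stand-in for NeuroScale, the GPR uses a plain RBF kernel with hand-picked hyperparameters, and the fingerprints and potency values are synthetic. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np

def tanimoto_distance(fps):
    """Pairwise Tanimoto distance between binary fingerprints (one per row)."""
    fps = np.asarray(fps, dtype=float)
    common = fps @ fps.T                      # bits set in both fingerprints
    counts = fps.sum(axis=1)                  # bits set in each fingerprint
    union = counts[:, None] + counts[None, :] - common
    sim = np.where(union > 0, common / np.maximum(union, 1e-12), 1.0)
    return 1.0 - sim

def classical_mds(dist, n_components=2):
    """Classical MDS; an assumed stand-in for NeuroScale, not the same method."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two sets of 2D points."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq / length_scale ** 2)

def gpr_mean(x_train, y_train, x_query, length_scale=0.2, noise=1e-3):
    """GPR posterior mean with a constant prior mean (assumed hyperparameters)."""
    prior_mean = y_train.mean()
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(k, y_train - prior_mean)
    return prior_mean + rbf_kernel(x_query, x_train, length_scale) @ alpha

# Synthetic data: 30 random 64-bit fingerprints and made-up potency values.
rng = np.random.default_rng(0)
fps = rng.integers(0, 2, size=(30, 64))
potency = 5.0 + 3.0 * fps[:, :8].mean(axis=1) + rng.normal(0.0, 0.1, size=30)

# Step 1: project the compounds into 2D.
coords = classical_mds(tanimoto_distance(fps))

# Step 2: interpolate a potency surface over a regular grid in the 2D plane.
gx, gy = np.meshgrid(
    np.linspace(coords[:, 0].min(), coords[:, 0].max(), 25),
    np.linspace(coords[:, 1].min(), coords[:, 1].max(), 25),
)
grid = np.column_stack([gx.ravel(), gy.ravel()])
surface = gpr_mean(coords, potency, grid).reshape(gx.shape)

# Predictions at the training positions, for comparison with the known values.
train_pred = gpr_mean(coords, potency, coords)
```

The arrays `gx`, `gy`, and `surface` can be passed to any 3D surface plotting routine to render the landscape. A small `noise` value pulls the surface close to the training potencies, which is in line with the abstract's remark that GPR fits training compounds well while tending to overfit; mapping a test compound would additionally require projecting its fingerprint into the same 2D plane, which this sketch does not cover.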