Stargate GTM: Bridging Descriptor and Activity Spaces.

Author information

Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France.

Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia.

Publication information

J Chem Inf Model. 2015 Nov 23;55(11):2403-10. doi: 10.1021/acs.jcim.5b00398. Epub 2015 Oct 20.

Abstract

Predicting the activity profile of a molecule or discovering structures possessing a specific activity profile are two important goals in chemoinformatics, which could be achieved by bridging activity and molecular descriptor spaces. In this paper, we introduce the "Stargate" version of the Generative Topographic Mapping approach (S-GTM) in which two different multidimensional spaces (e.g., structural descriptor space and activity space) are linked through a common 2D latent space. In the S-GTM algorithm, the manifolds are trained simultaneously in two initial spaces using the probabilities in the 2D latent space calculated as a weighted geometric mean of probability distributions in both spaces. S-GTM has the following interesting features: (1) activities are involved during the training procedure; therefore, the method is supervised, unlike conventional GTM; (2) using molecular descriptors of a given compound as input, the model predicts a whole activity profile, and (3) using an activity profile as input, areas populated by relevant chemical structures can be detected. To assess the performance of S-GTM prediction models, a descriptor space (ISIDA descriptors) of a set of 1325 GPCR ligands was related to a B-dimensional (B = 1 or 8) activity space corresponding to pKi values for eight different targets. S-GTM outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
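
The abstract states that the latent-space probabilities are a weighted geometric mean of the probability distributions in the two spaces. A minimal sketch of that combination step is given below, assuming each space supplies a matrix of per-molecule probabilities over the latent grid nodes. The function name `combine_responsibilities`, the weight `w`, the array layout, and the renormalization over nodes are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def combine_responsibilities(r_desc, r_act, w=0.5):
    """Weighted geometric mean of two probability matrices.

    r_desc, r_act: arrays of shape (n_molecules, n_nodes); each row is a
    probability distribution over latent grid nodes, computed in descriptor
    space and in activity space respectively.
    w: hypothetical weight in [0, 1] given to the descriptor space.
    """
    r_desc = np.asarray(r_desc, dtype=float)
    r_act = np.asarray(r_act, dtype=float)
    eps = 1e-300  # guards against log(0)
    # Geometric mean computed in log space for numerical stability.
    log_r = w * np.log(r_desc + eps) + (1.0 - w) * np.log(r_act + eps)
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    # Renormalize so each molecule's probabilities over nodes sum to 1
    # (an assumption of this sketch).
    return r / r.sum(axis=1, keepdims=True)

# Toy data: 2 molecules, 3 latent nodes.
r_desc = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.8]])
r_act = np.array([[0.5, 0.4, 0.1],
                  [0.2, 0.2, 0.6]])
combined = combine_responsibilities(r_desc, r_act, w=0.5)
```

With `w = 1` the sketch reduces to the descriptor-space probabilities alone, and with `w = 0` to the activity-space probabilities alone; intermediate values let both spaces shape the shared latent representation.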
