
Topological Distance-Based Electron Interaction Tensor to Apply a Convolutional Neural Network on Drug-like Compounds.

Author information

Shin Hyun Kil

Affiliations

Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon 34114, Republic of Korea.

Human and Environmental Toxicology, University of Science and Technology, Daejeon 34113, Republic of Korea.

Publication information

ACS Omega. 2021 Dec 15;6(51):35757-35768. doi: 10.1021/acsomega.1c05693. eCollection 2021 Dec 28.

Abstract

Deep learning (DL) models in quantitative structure-activity relationship (QSAR) studies feed the molecular structure directly to the network, without human-designed descriptors, by representing a molecule as a graph or a string (e.g., SMILES code). However, both representations oversimplify real molecules and do not fully reflect the chemical properties of molecular structures. Because the choice of molecular representation determines which DL architecture can be applied, a novel molecular representation can open the way to applying diverse DL networks developed and used in other fields. In this study, a topological distance-based electron interaction (TDEi) tensor was developed, inspired by the quantum mechanical model of the molecule, which defines a molecule in terms of electrons and protons. In the TDEi tensor, the atomic orbitals (AOs) of each atom are represented by an electron configuration (EC) vector, a bit string that records the presence or absence of an electron in each AO, with spin indicated by positive and negative signs. Interactions between EC vectors were calculated based on the topological distance between atoms in a molecule. Because the molecular structure was thereby translated into a 3D array, CNN models (modified VGGNet) were applied to the TDEi tensor to predict four physicochemical properties of drug-like compound datasets: MP (275,131), Lipop (4193), Esol (1127), and Freesolv (639). The models achieved good prediction accuracy. Principal component analysis (PCA) showed that the deeper the layer from which features were extracted, the more strongly those features correlated with the target endpoint.
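The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. Several parts are assumptions made only for illustration: the orbital set (1s, 2s, 2p only), the slot layout of the EC vector, and above all the interaction rule (elementwise product of two EC vectors divided by topological distance plus one), which is a placeholder because the abstract does not state the actual formula. The function names `ec_vector`, `topological_distances`, and `tdei_like_tensor` are hypothetical.

```python
from collections import deque

# Assumption: only the 1s, 2s and 2p subshells are modeled (enough for H to Ne).
SUBSHELLS = [1, 1, 3]  # number of orbitals in 1s, 2s, 2p


def ec_vector(n_electrons):
    """Signed bit string with two slots per orbital: +1 marks a spin-up
    electron, -1 a spin-down electron, 0 an empty slot. Each subshell is
    filled spin-up first, then spin-down (Hund's rule)."""
    vec = []
    remaining = n_electrons
    for n_orb in SUBSHELLS:
        up = min(remaining, n_orb)
        remaining -= up
        down = min(remaining, n_orb)
        remaining -= down
        for i in range(n_orb):
            vec.append(1 if i < up else 0)
            vec.append(-1 if i < down else 0)
    return vec


def topological_distances(n_atoms, bonds):
    """Shortest path length, counted in bonds, between every pair of atoms
    (breadth-first search over the molecular graph)."""
    adj = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    dist = [[None] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def tdei_like_tensor(atomic_numbers, bonds):
    """3D array of shape (atoms, atoms, EC length).
    Placeholder interaction rule (an assumption, not the paper's formula):
    elementwise product of the two EC vectors, scaled by 1 / (distance + 1).
    Assumes a connected molecule, so every distance is defined."""
    ecs = [ec_vector(z) for z in atomic_numbers]
    dist = topological_distances(len(atomic_numbers), bonds)
    return [
        [
            [a * b / (dist[i][j] + 1) for a, b in zip(ecs[i], ecs[j])]
            for j in range(len(ecs))
        ]
        for i in range(len(ecs))
    ]


# Heavy-atom skeleton of ethanol, C-C-O, using neutral-atom electron counts.
tensor = tdei_like_tensor([6, 6, 8], [(0, 1), (1, 2)])
```

The result is a nested list indexed as `tensor[i][j][k]`, where `i` and `j` are atom indices and `k` is a position in the EC vector. An array of this shape can be passed to an image-style CNN in the same way as a multi-channel image, which is the property the abstract relies on.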


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e57/8717557/ddc21ab2a577/ao1c05693_0002.jpg
