Department of Computer Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, China, 200240.
Healthcare Intelligence, AI Center, Alibaba Group DAMO Academy, China, 310000.
Brief Bioinform. 2022 Jul 18;23(4). doi: 10.1093/bib/bbac231.
Graph neural networks (GNNs) are among the most promising deep learning models for non-Euclidean data analysis. However, their full potential is severely curtailed by poorly represented molecular graphs and features. Here, we propose a multiphysical graph neural network (MP-GNN) model based on a newly developed multiphysical molecular graph representation and featurization. Molecular interactions between different atom types and at different scales are systematically represented by a series of scale-specific and element-specific graphs with distance-related node features. From these graphs, graph convolutional network (GCN) models are constructed with specially designed weight-sharing architectures. Base learners are built from the GCN models for different elements at different scales and are then consolidated using both one-scale and multi-scale ensemble learning schemes. Our MP-GNN has two distinct properties. First, it incorporates multiscale interactions using more than one molecular graph: atomic interactions at different scales are not modeled by one specific graph, as in traditional GNNs, but are represented by a series of graphs at different scales. Second, it avoids the complicated feature generation process of conventional GNN methods: the various atomic interactions are embedded into element-specific graph representations with only distance-related node features, and a dedicated GNN architecture incorporates all of this information into a consolidated model. Our MP-GNN has been extensively validated on the widely used PDBbind benchmark datasets, including PDBbind-v2007, PDBbind-v2013 and PDBbind-v2016. To the best of our knowledge, our model outperforms all existing models on these benchmarks. Further, our MP-GNN is applied to coronavirus disease 2019 (COVID-19) drug design. Based on a dataset of 185 complexes of inhibitors for severe acute respiratory syndrome coronavirus (SARS-CoV/SARS-CoV-2), we evaluate their binding affinities using our MP-GNN and find that the model achieves high accuracy. This demonstrates the great potential of our MP-GNN for screening candidate drugs for SARS-CoV-2.
Availability: The multiphysical graph neural network (MP-GNN) model is available at https://github.com/Alibaba-DAMO-DrugAI/MGNN. Additional data or code will be available upon reasonable request.
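To make the idea of scale-specific and element-specific graphs more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each graph is a bipartite protein-ligand graph for one element pair, that "scale" is a distance cutoff, and that the distance-related node features are simply the neighbour count and the summed neighbour distance. The function name `build_graphs`, its parameters, and the feature choice are all hypothetical; the actual construction is defined in the paper and the repository.

```python
import numpy as np

def build_graphs(prot_elems, prot_xyz, lig_elems, lig_xyz, pairs, cutoffs):
    """Hypothetical helper: one bipartite graph per (protein element,
    ligand element, cutoff) triple, with distance-based node features."""
    prot_elems = np.asarray(prot_elems)
    lig_elems = np.asarray(lig_elems)
    prot_xyz = np.asarray(prot_xyz, dtype=float)
    lig_xyz = np.asarray(lig_xyz, dtype=float)

    graphs = {}
    for p_el, l_el in pairs:
        # Element-specific selection of atoms on each side.
        p = prot_xyz[prot_elems == p_el]
        l = lig_xyz[lig_elems == l_el]
        if len(p) == 0 or len(l) == 0:
            continue
        # Pairwise Euclidean distances, shape (n_protein, n_ligand).
        dist = np.linalg.norm(p[:, None, :] - l[None, :, :], axis=-1)
        for rc in cutoffs:
            # Scale-specific edges: connect atoms closer than the cutoff.
            adj = dist <= rc
            masked = np.where(adj, dist, 0.0)
            # Assumed node features: [degree, sum of neighbour distances].
            prot_feat = np.stack([adj.sum(axis=1), masked.sum(axis=1)], axis=1)
            lig_feat = np.stack([adj.sum(axis=0), masked.sum(axis=0)], axis=1)
            graphs[(p_el, l_el, rc)] = {
                "adjacency": adj,
                "protein_features": prot_feat,
                "ligand_features": lig_feat,
            }
    return graphs

# Toy coordinates in angstroms, made up purely for illustration.
graphs = build_graphs(
    prot_elems=["C", "N", "C"],
    prot_xyz=[[0, 0, 0], [3, 0, 0], [10, 0, 0]],
    lig_elems=["O", "C"],
    lig_xyz=[[0, 0, 2], [0, 0, 4]],
    pairs=[("C", "O"), ("N", "C")],
    cutoffs=[3.0, 6.0],
)
for key, g in graphs.items():
    print(key, int(g["adjacency"].sum()), "edges")
```

In a full pipeline, each such graph would feed one GCN base learner, and the base learners' predictions would then be combined by the one-scale and multi-scale ensemble schemes described above.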