OctSurf: Efficient hierarchical voxel-based molecular surface representation for protein-ligand affinity prediction.

Author information

Liu Qinqing, Wang Peng-Shuai, Zhu Chunjiang, Gaines Blake Blumenfeld, Zhu Tan, Bi Jinbo, Song Minghu

Affiliations

Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06279, USA.

Microsoft Research Asia, Beijing, China.

Publication information

J Mol Graph Model. 2021 Jun;105:107865. doi: 10.1016/j.jmgm.2021.107865. Epub 2021 Feb 9.

Abstract

Voxel-based 3D convolutional neural networks (CNNs) have been applied to predict protein-ligand binding affinity. However, the memory usage and computational cost of these voxel-based approaches increase cubically with spatial resolution, which can make volumetric CNNs intractable at higher resolutions. It is therefore necessary to develop memory-efficient alternatives that can accelerate the convolution operation on 3D volumetric representations of the protein-ligand interaction. In this study, we implement a novel volumetric representation, OctSurf, to characterize the 3D molecular surface of protein binding pockets and bound ligands. The OctSurf surface representation is built on the octree data structure, which has been widely used in computer graphics to efficiently represent and store 3D object data. Vanilla 3D-CNN approaches often divide the 3D space of objects into equal-sized voxels. In contrast, OctSurf recursively partitions the 3D space containing the protein-ligand pocket into eight subspaces called octants. Only those octants containing van der Waals surface points of protein or ligand atoms undergo the recursive subdivision process until they reach the predefined octree depth, whereas unoccupied octants are kept intact to reduce the memory cost. The resulting non-empty leaf octants approximate the molecular surfaces of the protein pocket and bound ligands. These surface octants, along with their chemical and geometric features, are used as the input to 3D-CNNs. Two kinds of CNN architectures, VGG and ResNet, are applied to the OctSurf representation to predict binding affinity. The OctSurf representation consumes much less memory than the conventional voxel representation at the same resolution. By restricting the convolution operation to only octants of the smallest size, our method also alleviates the overall computational overhead of the CNN. A series of experiments are performed to demonstrate the disk storage and computational efficiency of the proposed learning method. Our code is available at the following GitHub repository: https://github.uconn.edu/mldrugdiscovery/OctSurf.
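
The recursive subdivision described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (sphere_points, subdivide), the single atom with a 1.7 Å radius standing in for a van der Waals surface, the 8 Å bounding cube, and the octree depth of 4 are all chosen for the example, and the chemical and geometric features attached to each octant are omitted.

```python
import math
from itertools import product


def sphere_points(center, radius, n):
    """Sample n roughly uniform points on a sphere (Fibonacci lattice).

    Hypothetical stand-in for the van der Waals surface points of one atom.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((center[0] + radius * r * math.cos(theta),
                       center[1] + radius * r * math.sin(theta),
                       center[2] + radius * z))
    return points


def subdivide(points, origin, size, depth, max_depth, leaves):
    """Recursively split a cube into eight octants.

    Only octants that contain surface points are refined; empty octants
    are kept as single nodes. Non-empty octants at max_depth are appended
    to `leaves` as (origin, size). Returns the number of nodes created.
    """
    if not points:
        return 1  # empty octant: stored once, never subdivided
    if depth == max_depth:
        leaves.append((origin, size))
        return 1
    half = size / 2.0
    ox, oy, oz = origin
    # Bucket each point into one of the eight child octants.
    buckets = {}
    for p in points:
        key = (int(p[0] >= ox + half),
               int(p[1] >= oy + half),
               int(p[2] >= oz + half))
        buckets.setdefault(key, []).append(p)
    nodes = 1
    for key in product((0, 1), repeat=3):
        child_origin = (ox + key[0] * half,
                        oy + key[1] * half,
                        oz + key[2] * half)
        nodes += subdivide(buckets.get(key, []), child_origin, half,
                           depth + 1, max_depth, leaves)
    return nodes


# Assumed toy setup: one atom of radius 1.7 in the middle of an 8 x 8 x 8 box.
box_size, max_depth = 8.0, 4
surface = sphere_points((4.0, 4.0, 4.0), 1.7, 2000)
leaves = []
node_count = subdivide(surface, (0.0, 0.0, 0.0), box_size, 0, max_depth, leaves)

dense_voxels = 8 ** max_depth  # a dense grid at the same resolution
print("leaf edge length:", box_size / 2 ** max_depth)
print("non-empty surface octants:", len(leaves))
print("octree nodes in total:", node_count)
print("dense voxels at same resolution:", dense_voxels)
```

Each extra level of depth multiplies the dense voxel count by eight, whereas the octree only grows where surface points are present, which is the source of the memory saving the abstract describes. In this sketch a CNN would then operate only on the entries of `leaves`, mirroring the restriction of convolution to the smallest octants.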
