Visani Gian Marco, Galvin William, Pun Michael N, Nourmohammad Armita
Paul G. Allen School of Computer Science and Engineering, University of Washington.
Department of Physics, University of Washington.
ArXiv. 2023 Nov 28:arXiv:2311.09312v2.
Accurately modeling protein 3D structure is essential for the design of functional proteins. An important sub-task of structure modeling is protein side-chain packing: predicting the conformation of side-chains (rotamers) given the protein's backbone structure and amino-acid sequence. Conventional approaches to this task rely on expensive sampling procedures over hand-crafted energy functions and rotamer libraries. Recently, several deep learning methods have been developed to tackle the problem in a data-driven way, albeit with vastly different formulations (from image-to-image translation to directly predicting atomic coordinates). Here, we frame the problem as a joint regression over the side-chains' true degrees of freedom: the dihedral angles. We carefully study possible objective functions for this task while accounting for its underlying symmetries. We propose H-Packer, a novel two-stage algorithm for side-chain packing built on top of two lightweight rotationally equivariant neural networks. We evaluate our method on CASP13 and CASP14 targets. H-Packer is computationally efficient, performs favorably against conventional physics-based algorithms, and is competitive with alternative deep learning solutions.
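To make the regression target concrete, the following minimal sketch computes a dihedral angle from four atom positions and evaluates a periodic loss on angles. It is an illustration under stated assumptions, not the objective used by H-Packer: the function names `dihedral` and `angular_loss`, the `1 - cos` form of the loss, and the `period` parameter are all hypothetical choices made here. The `period` argument illustrates one way a symmetry could be respected, for example treating an angle and the same angle shifted by pi as equivalent for a side-chain with a symmetric terminal group.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in radians, in (-pi, pi], for four 3D points."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_norm = math.sqrt(_dot(b1, b1))
    b1_hat = tuple(c / b1_norm for c in b1)
    x = _dot(n1, n2)
    y = _dot(b1_hat, _cross(n1, n2))
    return math.atan2(y, x)


def angular_loss(pred, true, period=2.0 * math.pi):
    """Periodic loss: zero when pred and true differ by a multiple of period.

    Assumption for illustration only; not the paper's objective function.
    """
    return 1.0 - math.cos(2.0 * math.pi * (pred - true) / period)


if __name__ == "__main__":
    # Four points in a cis arrangement, then the same with the last point flipped.
    cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
    trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
    print(cis, abs(trans))
    # A half-turn error is maximal with period 2*pi but vanishes with period pi.
    print(angular_loss(0.0, math.pi), angular_loss(0.0, math.pi, period=math.pi))
```

Unlike a squared error on raw angle values, a loss of this kind does not penalize predictions that differ from the target by a full period, which is the sort of consideration that matters when regressing angles directly.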