Zhang Buzhong, Zheng Meili, Zhang Yuzhou, Quan Lijun
School of Computer and Information, Anqing Normal University, Anqing, China.
Jiangsu Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou, China.
Front Bioinform. 2024 Oct 18;4:1477909. doi: 10.3389/fbinf.2024.1477909. eCollection 2024.
The dihedral angles of the protein backbone describe the main-chain structure of a protein and are therefore of great significance for determining protein structure. Many computational methods, including deep learning methods, have been proposed to predict these angles. However, these heavyweight methods demand substantial computational resources, and their training times become prohibitive. In this article, we introduce a novel lightweight method, named dilated convolution and multi-head attention (DCMA), that predicts protein backbone torsion dihedral angles. DCMA stacks five I2A1 modules, each consisting of two hybrid inception blocks and one multi-head attention block. The hybrid inception blocks, which combine multi-scale convolutional neural networks with dilated convolutional neural networks, are designed to capture both local and long-range sequence-based features. The multi-head attention block further strengthens this feature capture. The proposed DCMA is validated on public Critical Assessment of protein Structure Prediction (CASP) benchmark datasets. Experimental results show that DCMA obtains better or comparable generalization performance. Compared with the best methods to date, which are mostly ensemble models built from recurrent neural networks, DCMA is a single model that is more lightweight and has a shorter training time. The proposed model could also be applied as an alternative method for predicting other protein structural features.
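To illustrate why dilated convolutions help capture long-range sequence features, the following is a minimal pure-Python sketch of a zero-padded 1D dilated convolution. It is not the authors' implementation: the function names `dilated_conv1d` and `receptive_field`, the kernel size of 3, and the dilation rate of 4 are assumptions chosen only for illustration, since the abstract does not state the actual hyperparameters.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded ('same' length) 1D cross-correlation with a dilated kernel.

    Illustrative only: real models apply this per channel with learned weights.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` positions apart.
            pos = i + (j - center) * dilation
            if 0 <= pos < len(signal):  # positions outside the sequence act as zeros
                total += weight * signal[pos]
        out.append(total)
    return out


def receptive_field(kernel_size, dilation):
    """Number of sequence positions spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


# A single "residue" signal at position 4 of a length-9 sequence.
impulse = [0.0] * 9
impulse[4] = 1.0

local = dilated_conv1d(impulse, [1.0, 1.0, 1.0], dilation=1)
wide = dilated_conv1d(impulse, [1.0, 1.0, 1.0], dilation=4)

print(local)                  # influence reaches only adjacent positions
print(wide)                   # influence reaches positions 4 residues away
print(receptive_field(3, 1))  # 3
print(receptive_field(3, 4))  # 9
```

With the same three weights, the dilated kernel spans nine positions instead of three, which is how a block can widen its context without adding parameters. Combining several kernel sizes and dilation rates in parallel, as the hybrid inception blocks are described as doing, mixes local and long-range views of the sequence.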