Elhefnawy Wessam, Li Min, Wang Jianxin, Li Yaohang
Department of Computer Science, Old Dominion University, Norfolk, U.S.A.
Department of Computer Science, Central South University, Changsha, China.
BMC Bioinformatics. 2020 Nov 18;21(Suppl 6):203. doi: 10.1186/s12859-020-3504-z.
Protein fold recognition is one of the most essential problems in structural bioinformatics. In this paper, we design a novel deep learning architecture, DeepFrag-k, which identifies fold-discriminative features at the fragment level to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first employs a multi-modal Deep Belief Network (DBN) to predict the potential structural fragments for a given sequence, represented as a fragment vector; the second uses a deep convolutional neural network (CNN) to classify the fragment vector into the corresponding fold.
Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition.
There is a set of fragments that can serve as structural "keywords" distinguishing between major protein folds. The deep learning architecture in DeepFrag-k is able to accurately identify these fragments as structural features to improve protein fold recognition.
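The two-stage data flow described above (sequence to fragment vector, fragment vector to fold) can be illustrated with a minimal sketch. This is not the DeepFrag-k implementation: the DBN and the CNN are each replaced by a single untrained linear layer with random weights, and the input features (amino-acid composition), the number of fold classes (`N_FOLDS`), the example sequence, and all function names are assumptions made purely for illustration. Only the fragment-vector length of 100 is taken from the abstract.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = 100        # fragment-vector length (the abstract's top-100 fragments)
N_FOLDS = 7    # hypothetical number of fold classes, chosen for illustration


def composition(sequence):
    """Assumed input features: fraction of each of the 20 amino acids."""
    counts = [sequence.count(a) for a in AMINO_ACIDS]
    total = sum(counts) or 1
    return [c / total for c in counts]


def random_matrix(rows, cols, rng):
    """Untrained weights drawn from a standard normal distribution."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stage1_fragment_vector(sequence, w1):
    """Stand-in for the DBN stage: sequence -> K fragment scores in (0, 1)."""
    return [sigmoid(z) for z in matvec(w1, composition(sequence))]


def stage2_fold_probs(fragment_vector, w2):
    """Stand-in for the CNN stage: fragment vector -> fold probabilities."""
    return softmax(matvec(w2, fragment_vector))


rng = random.Random(0)
w1 = random_matrix(K, len(AMINO_ACIDS), rng)  # stage-1 weights (random)
w2 = random_matrix(N_FOLDS, K, rng)           # stage-2 weights (random)

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"     # arbitrary example sequence
frag_vec = stage1_fragment_vector(seq, w1)
fold_probs = stage2_fold_probs(frag_vec, w2)
predicted_fold = max(range(N_FOLDS), key=fold_probs.__getitem__)
```

Because the weights are random, `predicted_fold` carries no biological meaning; the sketch only shows how an intermediate fragment vector decouples fragment prediction from fold classification, so that either stage could be replaced or retrained independently.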